Businesses now live in a data-saturated environment – a data swamp if you will, but with fewer ogres – with millions of data points capable of being extracted from data lakes or generated by IoT sensors daily. This is data that solves problems and moves businesses forward.
The Applied Innovation team understands this and researches products and services that help develop solutions more easily and quickly than ever before. One subset of this rich data swamp revolves around how data changes over time: the area of time-series, which I talked about in my last post here, but that’s not the focus this time. In this blog, I want to talk about a tool that creates time-series predictions: AWS Forecast.
Amazon’s dedicated cloud-based time-series forecasting tool, appropriately named Forecast, provides a cost-effective and intuitive solution to time-series forecasting.
Forecast has a number of impressive features. First and foremost, it supports FIVE algorithms; I’ll tell you how to exploit them in a bit. It is also capable of modelling the most absurd time-series problems, with data of up to 100 million records, thanks to its custom domain setting. The forecast granularity can be as coarse as years or as fine as minutes, and Forecast can incorporate countries’ holiday calendars to make predictions more accurate, e.g. predicting clothing sales, which increase during festive periods.
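To make the granularity and holiday options a little more concrete, here is a minimal sketch of the parameters you might hand to boto3's `create_predictor` call. The helper `build_predictor_request`, the predictor name and the ARN are all made up for illustration, and the parameter keys and frequency codes reflect my reading of the Forecast API, so check them against the current documentation before relying on them.

```python
# Frequency codes, from yearly down to one minute (as I understand the documented values).
VALID_FREQUENCIES = {"Y", "M", "W", "D", "H", "30min", "15min", "10min", "5min", "1min"}


def build_predictor_request(name, dataset_group_arn, horizon, frequency, holiday_country=None):
    """Build keyword arguments for create_predictor (hypothetical helper, not part of boto3)."""
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"unsupported forecast frequency: {frequency!r}")
    input_config = {"DatasetGroupArn": dataset_group_arn}
    if holiday_country:
        # A holiday calendar is passed as a supplementary feature keyed by country code.
        input_config["SupplementaryFeatures"] = [{"Name": "holiday", "Value": holiday_country}]
    return {
        "PredictorName": name,
        "ForecastHorizon": horizon,  # number of time steps to predict
        "PerformAutoML": True,  # let Forecast pick the best-performing algorithm
        "InputDataConfig": input_config,
        "FeaturizationConfig": {"ForecastFrequency": frequency},
    }


request = build_predictor_request(
    name="clothing_sales_daily",  # illustrative name
    dataset_group_arn="arn:aws:forecast:eu-west-1:123456789012:dataset-group/example",  # placeholder ARN
    horizon=30,
    frequency="D",
    holiday_country="US",
)
# With AWS credentials configured, this could be sent with:
#   boto3.client("forecast").create_predictor(**request)
```

Keeping the request as a plain dictionary means you can inspect or log it before spending any money on a training run.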
Let me give you a 30-second, high-level tour of how Forecast works:
Firstly, you create a project in Forecast, which it calls a Dataset Group. Data stored in an S3 bucket can be loaded into Forecast via a Dataset Import Job to create a Dataset. It is worth mentioning that you can supply additional datasets to improve prediction accuracy. Next, we move on to the exciting part – the data model – which Forecast calls a Predictor. This is where the beauty of Forecast comes in: you can select an option called ‘AutoML’, which says to Forecast, “Hey! Why don’t you try all of your fancy algorithms on my data and, when you’re done, tell me which performed best? I’m away for a cuppa.” Now, this is where it gets confusing (not in complexity, but in naming), because Forecast then generates future data points from the model in what it calls a Forecast (see, confusing), which you can query and filter to your heart’s content.
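For those who prefer code to tours, the steps above can be sketched with boto3 roughly as follows. This is a sketch under assumptions rather than a definitive implementation: the function names, the daily frequency, the three-column schema and the timestamp format are my own illustrative choices, and the clients are passed in as arguments so the flow can be exercised without touching AWS.

```python
import time


def wait_until_active(describe, arn_key, arn, poll_seconds=30):
    """Poll a describe_* call until the resource reports ACTIVE (each step is asynchronous)."""
    while True:
        status = describe(**{arn_key: arn})["Status"]
        if status == "ACTIVE":
            return
        if "FAILED" in status:
            raise RuntimeError(f"{arn} ended in status {status}")
        time.sleep(poll_seconds)


def run_pipeline(forecast, project, s3_path, role_arn, horizon=14, poll_seconds=30):
    """Dataset Group -> Dataset -> Import Job -> Predictor (AutoML) -> Forecast."""
    group_arn = forecast.create_dataset_group(
        DatasetGroupName=project, Domain="CUSTOM"
    )["DatasetGroupArn"]

    # Assumed schema; the attribute order has to match the column order of the CSV in S3.
    dataset_arn = forecast.create_dataset(
        DatasetName=f"{project}_target",
        Domain="CUSTOM",
        DatasetType="TARGET_TIME_SERIES",
        DataFrequency="D",
        Schema={"Attributes": [
            {"AttributeName": "timestamp", "AttributeType": "timestamp"},
            {"AttributeName": "target_value", "AttributeType": "float"},
            {"AttributeName": "item_id", "AttributeType": "string"},
        ]},
    )["DatasetArn"]
    forecast.update_dataset_group(DatasetGroupArn=group_arn, DatasetArns=[dataset_arn])

    import_arn = forecast.create_dataset_import_job(
        DatasetImportJobName=f"{project}_import",
        DatasetArn=dataset_arn,
        DataSource={"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        TimestampFormat="yyyy-MM-dd",
    )["DatasetImportJobArn"]
    wait_until_active(forecast.describe_dataset_import_job, "DatasetImportJobArn", import_arn, poll_seconds)

    predictor_arn = forecast.create_predictor(
        PredictorName=f"{project}_predictor",
        ForecastHorizon=horizon,
        PerformAutoML=True,  # "try all of your fancy algorithms on my data"
        InputDataConfig={"DatasetGroupArn": group_arn},
        FeaturizationConfig={"ForecastFrequency": "D"},
    )["PredictorArn"]
    wait_until_active(forecast.describe_predictor, "PredictorArn", predictor_arn, poll_seconds)

    forecast_arn = forecast.create_forecast(
        ForecastName=f"{project}_forecast", PredictorArn=predictor_arn
    )["ForecastArn"]
    wait_until_active(forecast.describe_forecast, "ForecastArn", forecast_arn, poll_seconds)
    return forecast_arn


def query_item(forecast_query, forecast_arn, item_id):
    """Return the predicted quantiles for a single item from a generated Forecast."""
    response = forecast_query.query_forecast(ForecastArn=forecast_arn, Filters={"item_id": item_id})
    return response["Forecast"]["Predictions"]


# With AWS credentials configured, the real clients would be:
#   forecast = boto3.client("forecast")
#   forecast_query = boto3.client("forecastquery")
```

Passing the clients in, rather than creating them inside the functions, is a deliberate choice here: it keeps the sketch testable with a stand-in object, which matters when a real AutoML run can take hours.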
Well, that’s Forecast in a nutshell. Oh, and all of this can be done through AWS’ shiny console UI, API calls or the ever-popular AWS CLI (for those of you who can’t get enough of the terminal).
The benefits of Forecast are numerous. The fact that it is a fully-fledged AWS service means that you can integrate it with your favourite existing AWS services, straight out of the box! However, by far the greatest benefit of Forecast comes when you treat it as a utility: running a dataset through Forecast’s pipeline, with AutoML selected, allows data to be loaded, analysed and evaluated for applicability within a few hours.
Unfortunately, Forecast is not without fault. It is extremely rigid about serving time-series data, which sounds like a good thing! However, if you come from an ML background, you may find it challenging to add other fields into Forecast, with many pitfalls to catch you out. One of its main drawbacks is that it is unsuitable for analysing real-time data, because its storage and import configuration (S3 buckets) is static. This is a major blocker for Forecast, given that the rise in time-series usage is largely down to IoT device analytics. However, I remain hopeful that AWS’ very own time-series database, Timestream, can rectify this when it is released.
If this blog has interested you and you want to know more, email the Applied Innovation team at email@example.com.