7 Time Series Datasets For Machine Learning
Machine learning can be applied to time series datasets. These are problems where a numeric or categorical value must be predicted, but the rows of data are ordered by time.
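Because the rows are ordered by time, a model should be evaluated on observations that come after the ones it was trained on. Below is a minimal sketch of that idea, assuming a plain Python list as the series. The helper names `train_test_split_ordered` and `persistence_rmse` are my own, and the series is synthetic, not one of the datasets in this post. The persistence forecast (predict the previous observation) is a common baseline and is used here only for illustration.

```python
import math

def train_test_split_ordered(series, test_fraction=0.25):
    """Split a time-ordered list without shuffling: the final part is the test set."""
    split = int(len(series) * (1 - test_fraction))
    return series[:split], series[split:]

def persistence_rmse(train, test):
    """Walk-forward persistence forecast: predict each value as the previous observation."""
    history = list(train)
    squared_errors = []
    for actual in test:
        prediction = history[-1]          # last known observation
        squared_errors.append((actual - prediction) ** 2)
        history.append(actual)            # the true value becomes available for the next step
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Synthetic illustrative series (an assumption, not real data): a steady upward trend.
series = [float(t) for t in range(20)]
train, test = train_test_split_ordered(series)
print(len(train), len(test), persistence_rmse(train, test))
```

Any model you develop on the datasets below can be compared against a baseline like this one.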
A common problem when getting started in time series forecasting with machine learning is finding good-quality standard datasets on which to practice.
In this post, you will discover 7 standard time series datasets that you can use to get started and practice time series forecasting with machine learning.
There are many sources of time series datasets, such as the “Time Series Data Library” created by Rob Hyndman, Professor of Statistics at Monash University, Australia.
Below are 4 univariate time series datasets that you can download from a range of fields such as Sales,
Meteorology, Physics and Demography.
The units are a sales count and there are 36 observations. The original dataset is credited to Makridakis,
Wheelwright and Hyndman (1998).
Below is a sample of the first 5 rows of data including the header row.
The dataset shows an increasing trend and possibly some seasonal component.
The units are in degrees Celsius and there are 3,650 observations. The source of the data is credited as the Australian Bureau of Meteorology.
Below is a sample of the first 5 rows of data including the header row.
The dataset shows a strong seasonality component and has fine-grained detail to work with.
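One simple way to see a seasonal component in daily data is to average the observations by calendar month. The sketch below assumes a list of dates and a parallel list of values. The function name `monthly_means` is hypothetical, and the series is a synthetic sine wave standing in for real data, so the numbers it produces say nothing about the actual dataset.

```python
import math
from collections import defaultdict
from datetime import date, timedelta

def monthly_means(dates, values):
    """Average the values falling in each calendar month (1-12)."""
    totals = defaultdict(lambda: [0.0, 0])   # month -> [running sum, count]
    for d, v in zip(dates, values):
        totals[d.month][0] += v
        totals[d.month][1] += 1
    return {month: s / n for month, (s, n) in sorted(totals.items())}

# Synthetic stand-in (an assumption): two years of daily values on a yearly sine cycle.
start = date(2001, 1, 1)
dates = [start + timedelta(days=i) for i in range(730)]
values = [15.0 + 5.0 * math.sin(2 * math.pi * i / 365.0) for i in range(730)]

means = monthly_means(dates, values)
for month, mean in means.items():
    print(month, round(mean, 2))
```

If the monthly averages differ markedly and repeat from year to year, that suggests a seasonal component worth modeling.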
The units are a count and there are 2,820 observations. The source of the dataset is credited to Andrews &
Herzberg (1985).
Below is a sample of the first 5 rows of data including the header row.
The units are a count and there are 365 observations. The source of the dataset is credited to Newton (1988).
Below is a sample of the first 5 rows of data including the header row.
A great source of multivariate time series data is the UCI Machine Learning Repository.
At the time of writing, there are 63 time series datasets that you can download for free and work with.
Below is a selection of 3 recommended multivariate time series datasets from Meteorology, Medicine and
Monitoring domains.
The objective of the problem is to predict whether eyes are open or closed given EEG data alone.
This is a classification predictive modeling problem and there are a total of 14,980 observations and 15 variables (14 EEG measurements plus the class value). The class value of ‘1’ indicates the eye-closed and ‘0’ the eye-open state. Data is ordered by time and observations were recorded over a period of 117 seconds.
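The observation count and recording period quoted above imply an approximate sampling rate, which is useful when mapping a row index back to elapsed time. The sketch below assumes the observations are evenly spaced over the 117 seconds, which the description does not state explicitly. The function name `row_to_seconds` is my own.

```python
# Figures quoted in the dataset description.
n_observations = 14980
duration_seconds = 117

# Assumption: observations are evenly spaced across the recording.
samples_per_second = n_observations / duration_seconds

def row_to_seconds(row_index):
    """Approximate elapsed time in seconds for a zero-based row index."""
    return row_index / samples_per_second

print(round(samples_per_second, 2))
print(round(row_to_seconds(n_observations // 2), 1))
```

Under that assumption the rate works out to roughly 128 observations per second, so a one-second window covers about 128 rows.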
4329.23,4009.23,4289.23,4148.21,4350.26,4586.15,4096.92,4641.03,4222.05,4238.46,4211.28,4280.51,4635.9,
4324.62,4004.62,4293.85,4148.72,4342.05,4586.67,4097.44,4638.97,4210.77,4226.67,4207.69,4279.49,4632.82
4327.69,4006.67,4295.38,4156.41,4336.92,4583.59,4096.92,4630.26,4207.69,4222.05,4206.67,4282.05,4628.72
4328.72,4011.79,4296.41,4155.9,4343.59,4582.56,4097.44,4630.77,4217.44,4235.38,4210.77,4287.69,4632.31,
4326.15,4011.79,4292.31,4151.28,4347.69,4586.67,4095.9,4627.69,4210.77,4244.1,4212.82,4288.21,4632.82,4
There are 20,560 one-minute observations taken over the period of a few weeks. This is a classification prediction
problem. There are 7 attributes including various light and climate properties of the room.
The source for the data is credited to Luis Candanedo from UMONS.
Below is a sample of the first rows of data, including the header row.
"date","Temperature","Humidity","Light","CO2","HumidityRatio","Occupancy"
"1","2015-02-04 17:51:00",23.18,27.272,426,721.25,0.00479298817650529,1
"2","2015-02-04 17:51:59",23.15,27.2675,429.5,714,0.00478344094931065,1
"3","2015-02-04 17:53:00",23.15,27.245,426,713.5,0.00477946352442199,1
"4","2015-02-04 17:54:00",23.15,27.2,426,708.25,0.00477150882608175,1
"5","2015-02-04 17:55:00",23.1,27.2,426,704.5,0.00475699293331518,1
"6","2015-02-04 17:55:59",23.1,27.2,419,701,0.00475699293331518,1
The data is provided in 3 files that suggest the splits that may be used for training and testing a model.
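Note that in the sample above each data row has one more field than the header: a leading row number with no column name. Below is a rough sketch of one way to parse such rows using only the standard library (pandas would be the more common choice in practice). The function name `parse_occupancy` is hypothetical, and the text is just the first three data rows of the sample, not a full file.

```python
import csv
import io
from datetime import datetime

# Header and first three data rows, copied from the sample above.
SAMPLE = '''"date","Temperature","Humidity","Light","CO2","HumidityRatio","Occupancy"
"1","2015-02-04 17:51:00",23.18,27.272,426,721.25,0.00479298817650529,1
"2","2015-02-04 17:51:59",23.15,27.2675,429.5,714,0.00478344094931065,1
"3","2015-02-04 17:53:00",23.15,27.245,426,713.5,0.00477946352442199,1
'''

def parse_occupancy(text):
    """Parse occupancy rows into dicts with typed values."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    records = []
    for row in reader:
        # Drop the unnamed leading row number when it is present.
        if len(row) == len(header) + 1:
            row = row[1:]
        record = dict(zip(header, row))
        record["date"] = datetime.strptime(record["date"], "%Y-%m-%d %H:%M:%S")
        for name in header[1:-1]:                      # the numeric sensor columns
            record[name] = float(record[name])
        record["Occupancy"] = int(record["Occupancy"])  # the class value
        records.append(record)
    return records

records = parse_occupancy(SAMPLE)
print(records[0]["date"], records[0]["Temperature"], records[0]["Occupancy"])
```

The same function could be applied to the text of each of the provided files in turn, keeping the suggested train and test splits separate.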
The dataset contains 2,536 observations and 73 attributes. This is a classification prediction problem and the final
attribute indicates the class value as “1” for an ozone day and “0” for a normal day.
Two versions of the data are provided: an eight-hour peak set and a one-hour peak set. I would suggest using the one-hour peak set for now.
1/1/1998,0.8,1.8,2.4,2.1,2,2.1,1.5,1.7,1.9,2.3,3.7,5.5,5.1,5.4,5.4,4.7,4.3,3.5,3.5,2.9,3.2,3.2,2.8,2.6,
1/2/1998,2.8,3.2,3.3,2.7,3.3,3.2,2.9,2.8,3.1,3.4,4.2,4.5,4.5,4.3,5.5,5.1,3.8,3,2.6,3,2.2,2.3,2.5,2.8,5.
1/3/1998,2.9,2.8,2.6,2.1,2.2,2.5,2.5,2.7,2.2,2.5,3.1,4,4.4,4.6,5.6,5.4,5.2,4.4,3.5,2.7,2.9,3.9,4.1,4.6,
1/4/1998,4.7,3.8,3.7,3.8,2.9,3.1,2.8,2.5,2.4,3.1,3.3,3.1,2.3,2.1,2.2,3.8,2.8,2.4,1.9,3.2,4.1,3.9,4.5,4.
1/5/1998,2.6,2.1,1.6,1.4,0.9,1.5,1.2,1.4,1.3,1.4,2.2,2,3,3,3.1,3.1,2.7,3,2.4,2.8,2.5,2.5,3.7,3.4,3.7,2.
1/6/1998,3.1,3.5,3.3,2.5,1.6,1.7,1.6,1.6,2.3,1.8,2.5,3.9,3.4,2.7,3.4,2.5,2.2,4.4,4.3,3.2,6.2,6.8,5.1,4,
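The first field of each row in the sample above is a date. Below is a small sketch that parses those dates and checks that the rows are one day apart. It assumes the format is month/day/year, which the sample alone cannot confirm, and it uses only the leading date field of each sample row. The function name `parse_ozone_date` is my own.

```python
from datetime import datetime, timedelta

# Leading date fields copied from the sample rows above; other fields are omitted.
raw_dates = ["1/1/1998", "1/2/1998", "1/3/1998", "1/4/1998", "1/5/1998", "1/6/1998"]

def parse_ozone_date(text):
    """Parse a date string, assuming month/day/year ordering."""
    return datetime.strptime(text, "%m/%d/%Y").date()

dates = [parse_ozone_date(d) for d in raw_dates]

# Check that consecutive rows are exactly one day apart.
gaps = [later - earlier for earlier, later in zip(dates, dates[1:])]
is_daily = all(gap == timedelta(days=1) for gap in gaps)
print(dates[0], dates[-1], is_daily)
```

A check like this is a quick way to spot missing or out-of-order days before fitting a model.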
Summary
In this post, you discovered a suite of standard time series forecast datasets that you can use to get started and
practice time series forecasting with machine learning methods.
Did you use one of the above datasets in your own project?