Introduction to Pandas rolling() function
The Pandas rolling() function provides rolling window calculations on a pandas object. With rolling() we can compute statistics such as mean(), min(), max() and sum() over the rolling window.
mean() returns the average, sum() returns the total, min() returns the minimum and max() returns the maximum of the values inside each window of the given size.
Syntax:
DataFrame.rolling(window, on=None, axis=0)
Parameters
- window - The size of the moving window. An integer sets a fixed number of rows; a time offset such as '2D' can be used with datetime-like data
- on - For a DataFrame, the label of a datetime-like column to use in place of the index when building the windows
- axis - 0 rolls down the rows (the default) and 1 rolls across the columns
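As a rough sketch of how window and on interact (the small daily DataFrame and the names fixed and timed are made up for illustration and are not part of the examples that follow), an integer window counts rows, while an offset window such as '3D' combined with on='date' covers a span of time measured on that column:

```python
import pandas

# hypothetical daily data, used only for this illustration
df = pandas.DataFrame({
    'date': pandas.date_range('2022-01-01', periods=6, freq='D'),
    'values': [23, 45, 32, 4, 55, 44],
})

# integer window: each result needs 3 rows, so the first two are NaN
fixed = df['values'].rolling(3).sum()

# offset window: the 3 days ending at each row's date, read from the 'date' column
timed = df.rolling('3D', on='date')['values'].sum()

print(fixed)
print(timed)
```

With the offset window the first rows are not NaN, because in that case pandas requires only one observation per window by default.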
Create sample DataFrame
Let's create a DataFrame with two columns: a date column and a values column.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the DataFrame
print(data)
Output:
date values
0 2022-01-01 00:00:00 23
1 2022-01-01 01:00:00 45
2 2022-01-01 02:00:00 32
3 2022-01-01 03:00:00 4
4 2022-01-01 04:00:00 55
5 2022-01-01 05:00:00 44
6 2022-01-01 06:00:00 34
7 2022-01-01 07:00:00 34
8 2022-01-01 08:00:00 67
9 2022-01-01 09:00:00 89
10 2022-01-01 10:00:00 55
11 2022-01-01 11:00:00 34
1. Calculate rolling mean()
Example 1: In this example, we calculate the rolling mean with a window size of 2.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling mean over 2 rows
print(data['values'].rolling(2).mean())
Output:
0 NaN
1 34.0
2 38.5
3 18.0
4 29.5
5 49.5
6 39.0
7 34.0
8 50.5
9 78.0
10 72.0
11 44.5
Name: values, dtype: float64
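The first entry is NaN because a window of 2 has only one value available at index 0. As a hand check (a minimal sketch, assuming the same values as above; the names rolled and manual are illustrative), the rolling mean with a window of 2 should agree with averaging each value with the one before it, which shift() makes easy to write:

```python
import pandas

values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

# rolling mean over 2 rows
rolled = values.rolling(2).mean()

# the same quantity by hand: average of each value and its predecessor
manual = (values + values.shift(1)) / 2

# show both side by side for comparison
print(pandas.DataFrame({'rolled': rolled, 'manual': manual}))
```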
Example 2: In this example, we calculate the rolling mean with a window size of 5.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling mean over 5 rows
print(data['values'].rolling(5).mean())
Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 31.8
5 36.0
6 33.8
7 34.2
8 46.8
9 53.6
10 55.8
11 55.8
Name: values, dtype: float64
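Comparing the two outputs, an integer window of size n leaves the first n - 1 results as NaN, since those positions do not yet have n values behind them. A small sketch, assuming the same values as above, counts the NaN entries for a few window sizes:

```python
import pandas

values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

# count how many results are NaN for each window size
nan_counts = {}
for window in (2, 5, 7):
    nan_counts[window] = int(values.rolling(window).mean().isna().sum())

print(nan_counts)  # {2: 1, 5: 4, 7: 6}
```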
2. Calculate rolling min()
Example 1: In this example, we calculate the rolling minimum with a window size of 2.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling minimum over 2 rows
print(data['values'].rolling(2).min())
Output:
0 NaN
1 23.0
2 32.0
3 4.0
4 4.0
5 44.0
6 34.0
7 34.0
8 34.0
9 67.0
10 55.0
11 34.0
Name: values, dtype: float64
Example 2: In this example, we calculate the rolling minimum with a window size of 5.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling minimum over 5 rows
print(data['values'].rolling(5).min())
Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 4.0
5 4.0
6 4.0
7 4.0
8 34.0
9 34.0
10 34.0
11 34.0
Name: values, dtype: float64
3. Calculate rolling max()
Example 1: In this example, we calculate the rolling maximum with a window size of 2.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling maximum over 2 rows
print(data['values'].rolling(2).max())
Output:
0 NaN
1 45.0
2 45.0
3 32.0
4 55.0
5 55.0
6 44.0
7 34.0
8 67.0
9 89.0
10 89.0
11 55.0
Name: values, dtype: float64
Example 2: In this example, we calculate the rolling maximum with a window size of 7.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling maximum over 7 rows
print(data['values'].rolling(7).max())
Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 55.0
7 55.0
8 67.0
9 89.0
10 89.0
11 89.0
Name: values, dtype: float64
4. Calculate rolling sum()
Example 1: In this example, we calculate the rolling sum (total) with a window size of 2.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling sum over 2 rows
print(data['values'].rolling(2).sum())
Output:
0 NaN
1 68.0
2 77.0
3 36.0
4 59.0
5 99.0
6 78.0
7 68.0
8 101.0
9 156.0
10 144.0
11 89.0
Name: values, dtype: float64
Example 2: In this example, we calculate the rolling sum (total) with a window size of 45. The window is larger than the 12 rows of data, so no window is ever full and every result is NaN.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# display the rolling sum over 45 rows (more rows than the data has)
print(data['values'].rolling(45).sum())
Output:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
Name: values, dtype: float64
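If partial windows are acceptable, the min_periods parameter of rolling() lowers the number of observations required before a result is produced. A minimal sketch, assuming the same values as above (the name partial is illustrative): with min_periods=1 the oversized window simply covers every row seen so far, so the result should match a cumulative sum.

```python
import pandas

values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

# the window of 45 is wider than the data, but min_periods=1 lets each
# window produce a result from however many rows exist so far
partial = values.rolling(45, min_periods=1).sum()

print(partial.tolist())
print(values.cumsum().tolist())
```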
5. Multiple rolling window calculations
Here we set the rolling window size to 0. Every window is then empty, so sum() returns 0.0 while min(), max() and mean() return NaN.
Example: In this example, we perform all four window calculations on the same column.
# import pandas
import pandas
# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()
# create the DataFrame with a date column
data = pandas.DataFrame(dates, columns=['date'])
# add the values column to the DataFrame
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
# window sum calculation
print(data['values'].rolling(0).sum())
print()
# window min calculation
print(data['values'].rolling(0).min())
print()
# window max calculation
print(data['values'].rolling(0).max())
print()
# window mean calculation
print(data['values'].rolling(0).mean())
Output:
0 0.0
1 0.0
2 0.0
3 0.0
4 0.0
5 0.0
6 0.0
7 0.0
8 0.0
9 0.0
10 0.0
11 0.0
Name: values, dtype: float64
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
Name: values, dtype: float64
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
Name: values, dtype: float64
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 NaN
11 NaN
Name: values, dtype: float64
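The four separate calls can also be combined into one. A minimal sketch, assuming a window of 2 instead of 0 so that the results are not empty (the name stats is illustrative): passing a list of function names to agg() on the rolling object returns a DataFrame with one column per statistic.

```python
import pandas

values = pandas.Series(
    [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34], name='values'
)

# compute all four rolling statistics in a single call
stats = values.rolling(2).agg(['sum', 'min', 'max', 'mean'])

print(stats)
```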
Summary
In this tutorial we discussed how to use the rolling() function and covered the sum(), min(), max() and mean() calculations with different rolling window sizes. Each of the following was shown with two examples:
- rolling mean()
- rolling min()
- rolling max()
- rolling sum()