Pandas DataFrame.rolling() Explained [Practical Examples]

Introduction to Pandas rolling() function

The pandas rolling() function performs window calculations on a pandas object: it slides a window of a given size over the data and applies a calculation to the values inside each window. With rolling() we can compute statistics such as mean(), min(), max() and sum() over that window.

Within each window, mean() returns the average, sum() returns the total, min() returns the smallest value and max() returns the largest value.

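Syntax (simplified, showing only the parameters discussed here):

DataFrame.rolling(window, on=None, axis=0)

Parameters

  1. window - The size of the moving window. An integer means a fixed number of rows; for datetime-like data a time offset can be used instead.
  2. on - For a DataFrame, the label of a datetime-like column on which to calculate the rolling window, instead of the DataFrame's index.
  3. axis - axis=0 (the default) rolls across rows and axis=1 rolls across columns. This parameter is deprecated in recent pandas releases (2.1 and later).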
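
As a rough illustration of the window and on parameters, the sketch below compares a fixed-size integer window with a time-based window measured on a date column. The name frame, the shortened six-row sample and the '3h' offset are assumptions chosen for this example, and it assumes a pandas version that accepts the lowercase 'h' frequency alias.

```python
import pandas

# Hypothetical six-row sample: hourly timestamps plus a values column
frame = pandas.DataFrame({
    "date": pandas.date_range("2022-01-01", periods=6, freq="h"),
    "values": [23, 45, 32, 4, 55, 44],
})

# Integer window: each result uses exactly 3 consecutive rows
fixed = frame["values"].rolling(window=3).mean()

# Time-based window: on="date" tells pandas to measure the 3-hour
# window against the date column instead of the index
timed = frame.rolling("3h", on="date").mean()["values"]

print(fixed)
print(timed)
```

With the integer window, the first two rows are NaN because three rows are not available yet. With the time-based window, the first rows already hold a value because an offset window only needs one observation by default.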

Create sample DataFrame

Let's create a dataframe with two columns: a date column and a values column.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
# ('H' means hourly; pandas 2.2 and later prefer the lowercase alias 'h')
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the dataframe
print(data)

Output:

                  date  values
0  2022-01-01 00:00:00      23
1  2022-01-01 01:00:00      45
2  2022-01-01 02:00:00      32
3  2022-01-01 03:00:00       4
4  2022-01-01 04:00:00      55
5  2022-01-01 05:00:00      44
6  2022-01-01 06:00:00      34
7  2022-01-01 07:00:00      34
8  2022-01-01 08:00:00      67
9  2022-01-01 09:00:00      89
10 2022-01-01 10:00:00      55
11 2022-01-01 11:00:00      34

1. Calculate rolling mean()

Example 1: In this example, we calculate the rolling mean with a window size of 2. The first row is NaN because a full window of 2 values is not available yet.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling mean of 'values' with a window of 2
print(data['values'].rolling(2).mean())

Output:

0      NaN
1     34.0
2     38.5
3     18.0
4     29.5
5     49.5
6     39.0
7     34.0
8     50.5
9     78.0
10    72.0
11    44.5
Name: values, dtype: float64
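
To make the mechanics concrete, here is a minimal sketch that recomputes the window-of-2 mean by hand with plain Python slices and prints it next to the result from rolling(). The names manual, chunk and rolled are assumptions introduced for this illustration; the loop only mirrors the behaviour for a fixed integer window with default settings.

```python
import math
import pandas

values = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]
window = 2

manual = []
for i in range(len(values)):
    if i + 1 < window:
        # Not enough values yet to fill the window
        manual.append(math.nan)
    else:
        # The current value plus the (window - 1) values before it
        chunk = values[i - window + 1 : i + 1]
        manual.append(sum(chunk) / window)

rolled = pandas.Series(values).rolling(window).mean()

print(manual)
print(rolled.tolist())
```

Both lists start with NaN and then contain the average of each value and its predecessor, for example (23 + 45) / 2 = 34.0 at index 1.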

Example 2: In this example, we calculate the rolling mean with a window size of 5. The first four rows are NaN because a window of 5 values is not complete until the fifth row.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling mean of 'values' with a window of 5
print(data['values'].rolling(5).mean())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4     31.8
5     36.0
6     33.8
7     34.2
8     46.8
9     53.6
10    55.8
11    55.8
Name: values, dtype: float64
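
If the leading NaN rows are not wanted, rolling() also accepts a min_periods argument that sets how many values a window must contain before a result is produced. The sketch below is one possible way to compare the default behaviour with min_periods=1; the names strict and relaxed are assumptions for this example.

```python
import pandas

values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

# Default: a result needs all 5 values, so the first 4 rows are NaN
strict = values.rolling(5).mean()

# min_periods=1: a result is produced as soon as 1 value is available
relaxed = values.rolling(5, min_periods=1).mean()

print(pandas.DataFrame({"strict": strict, "relaxed": relaxed}))
```

From the fifth row onward both columns agree, since every window is full by then. The early rows of relaxed are averages over fewer than 5 values, so they should be read with that in mind.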

2. Calculate rolling min()

Example 1: In this example, we calculate the rolling minimum with a window size of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling minimum of 'values' with a window of 2
print(data['values'].rolling(2).min())

Output:

0      NaN
1     23.0
2     32.0
3      4.0
4      4.0
5     44.0
6     34.0
7     34.0
8     34.0
9     67.0
10    55.0
11    34.0
Name: values, dtype: float64

Example 2: In this example, we calculate the rolling minimum with a window size of 5.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling minimum of 'values' with a window of 5
print(data['values'].rolling(5).min())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4      4.0
5      4.0
6      4.0
7      4.0
8     34.0
9     34.0
10    34.0
11    34.0
Name: values, dtype: float64

3. Calculate rolling max()

Example 1: In this example, we calculate the rolling maximum with a window size of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling maximum of 'values' with a window of 2
print(data['values'].rolling(2).max())

Output:

0      NaN
1     45.0
2     45.0
3     32.0
4     55.0
5     55.0
6     44.0
7     34.0
8     67.0
9     89.0
10    89.0
11    55.0
Name: values, dtype: float64

Example 2: In this example, we calculate the rolling maximum with a window size of 7.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling maximum of 'values' with a window of 7
print(data['values'].rolling(7).max())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5      NaN
6     55.0
7     55.0
8     67.0
9     89.0
10    89.0
11    89.0
Name: values, dtype: float64

4. Calculate rolling sum()

Example 1: In this example, we calculate the rolling sum (total) with a window size of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling sum of 'values' with a window of 2
print(data['values'].rolling(2).sum())

Output:

0       NaN
1      68.0
2      77.0
3      36.0
4      59.0
5      99.0
6      78.0
7      68.0
8     101.0
9     156.0
10    144.0
11     89.0
Name: values, dtype: float64

Example 2: In this example, we calculate the rolling sum (total) with a window size of 45. Because the window is larger than the 12 rows in the column, no window is ever complete and every result is NaN.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the rolling sum of 'values' with a window of 45
print(data['values'].rolling(45).sum())

Output:

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64
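
The link between window size and the number of NaN rows can be checked directly. The following sketch counts, for the window sizes used in this tutorial, how many rows of the 12-value column produce a non-NaN result with default settings; the name valid_counts is an assumption for this example.

```python
import pandas

values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

valid_counts = {}
for window in (2, 5, 7, 45):
    rolled = values.rolling(window).sum()
    # notna() marks the rows where a full window was available
    valid_counts[window] = int(rolled.notna().sum())

print(valid_counts)  # {2: 11, 5: 8, 7: 6, 45: 0}
```

For a column of 12 rows, a window of size w yields 12 - w + 1 results, and none at all once w is larger than 12.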

5. Multiple rolling window calculations

Here we run all four calculations on the same column and set the window size to 0. Every window is then empty, so, as the output shows, sum() returns 0.0 while min(), max() and mean() return NaN.

Example: In this example, we perform all four window calculations.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
data = pandas.date_range('1/1/2022', periods=12, freq='H').to_list()

# create a dataframe with a 'date' column holding the timestamps
data = pandas.DataFrame(data, columns=['date'])

# add a 'values' column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# window sum calculation
print(data['values'].rolling(0).sum())

print()

# window min calculation
print(data['values'].rolling(0).min())

print()

# window max calculation
print(data['values'].rolling(0).max())

print()

# window mean calculation
print(data['values'].rolling(0).mean())

Output:

0     0.0
1     0.0
2     0.0
3     0.0
4     0.0
5     0.0
6     0.0
7     0.0
8     0.0
9     0.0
10    0.0
11    0.0
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64
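
The four statistics can also be requested in a single call through the agg() method of the rolling object. The sketch below is one way to do that; it assumes a window of 2 (rather than 0) so that the windows are not empty, and the name summary is introduced only for this example.

```python
import pandas

values = pandas.Series(
    [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34], name="values"
)

# One rolling object, four statistics; each statistic becomes a column
summary = values.rolling(2).agg(["sum", "min", "max", "mean"])

print(summary)
```

The result is a DataFrame with one column per statistic, which keeps the numbers for each row side by side instead of in four separate printouts.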

Summary

In this tutorial we discussed how to use the rolling() function and covered four calculations, sum(), min(), max() and mean(), with different rolling window sizes. We illustrated each of the following with two examples.

  • rolling by mean()
  • rolling by min()
  • rolling by max()
  • rolling by sum()

Deepak Prasad

R&D Engineer

Founder of GoLinuxCloud with over a decade of expertise in Linux, Python, Go, Laravel, DevOps, Kubernetes, Git, Shell scripting, OpenShift, AWS, Networking, and Security. With extensive experience, he excels across development, DevOps, networking, and security, delivering robust and efficient solutions for diverse projects.