Pandas DataFrame.rolling() Explained [Practical Examples]


Introduction to Pandas rolling() function

The pandas rolling() function provides rolling window calculations on a pandas object. With rolling() we can apply statistical operations such as mean(), min(), max() and sum() to each window of consecutive rows.

For each window of the given size, mean() returns the average, sum() returns the total, min() returns the smallest value and max() returns the largest value.
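To make the idea of a window concrete, the short sketch below compares rolling(3).mean() with the same averages computed by hand. The series name `values` and the window size of 3 are chosen only for illustration.

```python
import pandas

# Five sample numbers; the name `values` is only illustrative
values = pandas.Series([23, 45, 32, 4, 55])
window = 3

# Rolling mean: each result uses the current row and the 2 rows before it
rolled = values.rolling(window).mean()

# The same averages computed by hand, one window at a time
manual = []
for i in range(len(values)):
    if i < window - 1:
        # Not enough rows yet to fill the window
        manual.append(float("nan"))
    else:
        chunk = values.iloc[i - window + 1 : i + 1]
        manual.append(chunk.sum() / window)

print(rolled.tolist())
print(manual)
```

Both lists start with two NaN entries, since the first full window of three rows ends at position 2, and the remaining entries agree.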

Syntax:

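DataFrame.rolling(window, on=None, axis=0)

Parameters

  1. window - The size of the moving window. An integer gives a fixed number of rows; a time offset such as '3h' is also accepted when the window is measured on datetime-like data.
  2. on - For a DataFrame, the label of a datetime-like column used to measure the window instead of the DataFrame's index.
  3. axis - 0 (the default) rolls over rows and 1 rolls over columns. Recent pandas releases deprecate this parameter.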
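As a rough illustration of the window and on parameters, the sketch below rolls the same data once by row count and once by time. The frame `df`, its six sample rows and the '3h' offset are assumptions made for this example.

```python
import pandas

# Six hourly timestamps and some sample numbers (illustrative data)
dates = pandas.date_range("2022-01-01", periods=6, freq="h")
df = pandas.DataFrame({"date": dates, "values": [23, 45, 32, 4, 55, 44]})

# Integer window: each result covers the current row and the 2 rows before it
by_rows = df["values"].rolling(3).sum()

# Offset window with on="date": each result covers the last 3 hours,
# measured on the date column instead of the index
by_time = df.rolling("3h", on="date").sum()["values"]

print(by_rows.tolist())
print(by_time.tolist())
```

With hourly data the two results agree from the third row onward. They differ in the first two rows: the integer window yields NaN until it holds three rows, while an offset window produces a value as soon as one observation is available.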

 

Create sample DataFrame

Let's create a DataFrame with two columns: a date column and a values column.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# display the dataframe
print(data)

Output:

                  date  values
0  2022-01-01 00:00:00      23
1  2022-01-01 01:00:00      45
2  2022-01-01 02:00:00      32
3  2022-01-01 03:00:00       4
4  2022-01-01 04:00:00      55
5  2022-01-01 05:00:00      44
6  2022-01-01 06:00:00      34
7  2022-01-01 07:00:00      34
8  2022-01-01 08:00:00      67
9  2022-01-01 09:00:00      89
10 2022-01-01 10:00:00      55
11 2022-01-01 11:00:00      34
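The examples in this article call rolling() on the single values column, which is a Series. As a minimal sketch, the same method can be called on the DataFrame itself; here the date column is first moved into the index (a choice made for illustration) so that only the numeric column is averaged.

```python
import pandas

# Rebuild the sample data from this section
dates = pandas.date_range("2022-01-01", periods=12, freq="h")
data = pandas.DataFrame({
    "date": dates,
    "values": [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34],
})

# Move the date column into the index so only numeric columns remain
indexed = data.set_index("date")

# rolling() on the DataFrame returns a DataFrame with the same columns
rolled = indexed.rolling(2).mean()
print(rolled.head(3))
```

The result keeps the date index and the values column, with each entry replaced by the mean of the current and previous row.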

 

1. Calculate rolling mean()

Example 1: In this example, we calculate the mean with a rolling window of 2. The first result is NaN because the first window holds only one value.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling mean over a window of 2 rows
print(data['values'].rolling(2).mean())

Output:

0      NaN
1     34.0
2     38.5
3     18.0
4     29.5
5     49.5
6     39.0
7     34.0
8     50.5
9     78.0
10    72.0
11    44.5
Name: values, dtype: float64

 

Example 2: In this example, we calculate the mean with a rolling window of 5.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling mean over a window of 5 rows
print(data['values'].rolling(5).mean())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4     31.8
5     36.0
6     33.8
7     34.2
8     46.8
9     53.6
10    55.8
11    55.8
Name: values, dtype: float64

 

2. Calculate rolling min()

Example 1: In this example, we calculate the minimum value with a rolling window of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling minimum over a window of 2 rows
print(data['values'].rolling(2).min())

Output:

0      NaN
1     23.0
2     32.0
3      4.0
4      4.0
5     44.0
6     34.0
7     34.0
8     34.0
9     67.0
10    55.0
11    34.0
Name: values, dtype: float64

 

Example 2: In this example, we calculate the minimum value with a rolling window of 5.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling minimum over a window of 5 rows
print(data['values'].rolling(5).min())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4      4.0
5      4.0
6      4.0
7      4.0
8     34.0
9     34.0
10    34.0
11    34.0
Name: values, dtype: float64

 

3. Calculate rolling max()

Example 1: In this example, we calculate the maximum value with a rolling window of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling maximum over a window of 2 rows
print(data['values'].rolling(2).max())

Output:

0      NaN
1     45.0
2     45.0
3     32.0
4     55.0
5     55.0
6     44.0
7     34.0
8     67.0
9     89.0
10    89.0
11    55.0
Name: values, dtype: float64

 

Example 2: In this example, we calculate the maximum value with a rolling window of 7.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling maximum over a window of 7 rows
print(data['values'].rolling(7).max())

Output:

0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5      NaN
6     55.0
7     55.0
8     67.0
9     89.0
10    89.0
11    89.0
Name: values, dtype: float64

 

4. Calculate rolling sum()

Example 1: In this example, we calculate the sum (total) with a rolling window of 2.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling sum over a window of 2 rows
print(data['values'].rolling(2).sum())

Output:

0       NaN
1      68.0
2      77.0
3      36.0
4      59.0
5      99.0
6      78.0
7      68.0
8     101.0
9     156.0
10    144.0
11     89.0
Name: values, dtype: float64

 

Example 2: In this example, we calculate the sum (total) with a rolling window of 45, which is larger than the 12 rows in the data.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# rolling sum over a window of 45 rows
print(data['values'].rolling(45).sum())

Output:

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64
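The all-NaN result above follows from the default behaviour of an integer window, which produces a value only once it holds as many observations as the window size. As a hedged sketch, the min_periods parameter of rolling(), which the syntax section of this article does not list, lowers that requirement.

```python
import pandas

# Same twelve numbers as the values column used in this article
values = pandas.Series([23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34])

# Default: a window of 45 never fills with only 12 rows, so every result is NaN
strict = values.rolling(45).sum()

# min_periods=1: a result is produced as soon as one observation is available,
# so this behaves like a running total over the rows seen so far
relaxed = values.rolling(45, min_periods=1).sum()

print(strict.isna().all())
print(relaxed.tolist())
```

Because the window never drops any of the 12 rows, the relaxed version ends at the total of all values, 516.0.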

 

5. Multiple rolling window calculations

Here we set the rolling window size to 0 and run all four calculations. A window of 0 contains no rows, so, as the output shows, sum() returns 0.0 while min(), max() and mean() return NaN.

Example: In this example, we perform all four window calculations with a window of 0.

# import pandas
import pandas

# create 12 hourly timestamps starting at 2022-01-01 00:00
dates = pandas.date_range('1/1/2022', periods=12, freq='h').to_list()

# create the dataframe with a date column
data = pandas.DataFrame(dates, columns=['date'])

# add the values column to the dataframe
data['values'] = [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34]

# window sum calculation
print(data['values'].rolling(0).sum())

print()

# window min calculation
print(data['values'].rolling(0).min())

print()

# window max calculation
print(data['values'].rolling(0).max())

print()

# window mean calculation
print(data['values'].rolling(0).mean())

Output:

0     0.0
1     0.0
2     0.0
3     0.0
4     0.0
5     0.0
6     0.0
7     0.0
8     0.0
9     0.0
10    0.0
11    0.0
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64

0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
10   NaN
11   NaN
Name: values, dtype: float64
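Printing each statistic separately works. As an alternative sketch, the agg() method of a rolling object accepts a list of function names and returns all results side by side. A window of 2 is assumed here, rather than 0, so that the columns contain actual numbers.

```python
import pandas

# Same twelve numbers as the values column used in this article
values = pandas.Series(
    [23, 45, 32, 4, 55, 44, 34, 34, 67, 89, 55, 34], name="values"
)

# One call computes all four statistics; each becomes a column in the result
summary = values.rolling(2).agg(["sum", "min", "max", "mean"])
print(summary)
```

The result is a DataFrame with one column per statistic, which makes it easier to compare them row by row than four separate printouts.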

 

Summary

In this tutorial we discussed how to use the rolling() function and covered the sum(), min(), max() and mean() calculations with different rolling window values. We covered each of the following with two examples:

  • rolling by mean()
  • rolling by min()
  • rolling by max()
  • rolling by sum()

 

Deepak Prasad
