Getting started with Pandas reset_index() function
In the Pandas library, a DataFrame is a size-mutable, two-dimensional, potentially heterogeneous tabular data structure with labeled axes (rows and columns). The index holds the labels attached to the rows of a DataFrame, while the column labels identify its columns.
The reset_index() function resets the index of a DataFrame. After calling it, the DataFrame's current index is replaced with the default integer index starting from 0. By default, the old index is added to the DataFrame as a column named 'index' (or named after the index, if the index has a name). The function returns a new DataFrame and leaves the original unchanged unless inplace=True is passed.
Below is an example using the reset_index() function.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])
# Print the DataFrame
print(df)
# A B C
# a 1 4 7
# b 2 5 8
# c 3 6 9
# Reset the index
df = df.reset_index()
# Print the DataFrame
print(df)
# index A B C
# 0 a 1 4 7
# 1 b 2 5 8
# 2 c 3 6 9
In this example, the original DataFrame has an index of ['a', 'b', 'c']. When we call reset_index(), the current index is replaced with a default integer index starting from 0, and the old index is added to the DataFrame as a column named 'index'.
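The column name 'index' applies only when the index itself has no name. As a minimal sketch, assuming a hypothetical index name 'key' chosen purely for illustration, a named index produces a column with that name instead:

```python
import pandas as pd

# Sample data; the index name 'key' is an assumption made for this sketch
df = pd.DataFrame(
    {'A': [1, 2, 3]},
    index=pd.Index(['a', 'b', 'c'], name='key'),
)

# The old index becomes a column named after the index, not 'index'
result = df.reset_index()
print(result.columns.tolist())  # ['key', 'A']
print(result.index.tolist())    # [0, 1, 2]
```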
You can also call reset_index(drop=True) to discard the current index instead of adding it to the DataFrame as a column.
df = df.reset_index(drop=True)
This resets the index to the default integers and discards the original index labels.
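To make the effect of drop=True concrete, here is a small self-contained sketch. The variable name dropped is only illustrative. It also shows that the original DataFrame is left untouched, because reset_index() returns a new object by default:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])

# drop=True discards the old labels instead of keeping them as a column
dropped = df.reset_index(drop=True)
print(dropped.columns.tolist())  # ['A', 'B']
print(dropped.index.tolist())    # [0, 1, 2]

# The original DataFrame still has its labels
print(df.index.tolist())         # ['a', 'b', 'c']
```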
Example-1: Pandas reset_index() and change column name
In this example, the current index is reset by calling reset_index(), and the resulting DataFrame has a column named 'index' that holds the old index values. The rename() function is then used to rename the 'index' column to 'original_index'.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])
# Reset the index
df = df.reset_index()
# Rename the 'index' column to 'original_index'
df = df.rename(columns={'index': 'original_index'})
# Print the DataFrame
print(df)
# original_index A B C
# 0 a 1 4 7
# 1 b 2 5 8
# 2 c 3 6 9
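An alternative worth considering is to name the index before resetting it, so that the new column gets the desired name in one chained call. The following is a sketch of that approach using rename_axis(); it is one possible variation, not the only way to do it:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=['a', 'b', 'c'])

# Name the index first; reset_index() then uses that name for the new column
renamed = df.rename_axis('original_index').reset_index()
print(renamed.columns.tolist())  # ['original_index', 'A', 'B', 'C']
```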
Example-2: Pandas reset_index() and start at 1
By default, when you reset the index of a DataFrame in Pandas using the reset_index() function, the new index starts at 0. If you want the new index to start at 1, you can add 1 to the index values after resetting the index.
In this example, the current index is reset by calling reset_index() and all the index values are incremented by 1.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])
# Reset the index
df = df.reset_index()
# Add 1 to all index values
df.index = df.index + 1
# Print the DataFrame
print(df)
# index A B C
# 1 a 1 4 7
# 2 b 2 5 8
# 3 c 3 6 9
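The same result can be reached by assigning an explicit range as the new index. This sketch uses pd.RangeIndex and is offered as an equivalent variation rather than a preferred method:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=['a', 'b', 'c'])
df = df.reset_index()

# Replace the 0-based index with a 1-based range of the same length
df.index = pd.RangeIndex(start=1, stop=len(df) + 1)
print(df.index.tolist())  # [1, 2, 3]
```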
Example-3: Pandas reset_index() after groupby()
In Pandas, when you use the groupby() function to group a DataFrame by one or more columns and then aggregate (for example with sum()), the resulting DataFrame has the grouping columns as its index.
If you want the grouping columns back as regular columns, you can call the reset_index() function on the aggregated result.
Here is an example:
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 1, 2, 3],
'B': [4, 5, 6, 4, 5, 6],
'C': [7, 8, 9, 7, 8, 9]})
# Group the DataFrame by column 'A' and sum each group
grouped_df = df.groupby('A').sum()
# Reset the index
grouped_df = grouped_df.reset_index()
# Print the grouped DataFrame
print(grouped_df)
# A B C
# 0 1 8 14
# 1 2 10 16
# 2 3 12 18
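If the grouping column is never needed as an index, groupby() also accepts as_index=False, which keeps it as a regular column from the start. The sketch below compares both routes on the same sample data; the variable names via_reset and via_flag are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 1, 2, 3],
                   'B': [4, 5, 6, 4, 5, 6],
                   'C': [7, 8, 9, 7, 8, 9]})

# Route 1: aggregate, then move the group key out of the index
via_reset = df.groupby('A').sum().reset_index()

# Route 2: ask groupby() not to use the group key as the index
via_flag = df.groupby('A', as_index=False).sum()

print(via_flag.columns.tolist())  # ['A', 'B', 'C']
print(via_flag['B'].tolist())     # [8, 10, 12]
print(via_flag['C'].tolist())     # [14, 16, 18]
```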
Example-4: Pandas Series reset_index()
A pandas Series is a one-dimensional labeled array that can hold any data type. Like a DataFrame, a Series has an index: the label given to each item in the Series.
The reset_index() function can also be used to reset the index of a Series. Calling it replaces the current index with the default 0-based integer index. Because the old index is kept as a new column named 'index', the result is a DataFrame rather than a Series; the original values appear in a column named after the Series, or 0 if the Series has no name.
Here is an example of using the reset_index() function on a Series.
import pandas as pd
# Create a sample Series
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
# Print the Series
print(s)
# a 1
# b 2
# c 3
# dtype: int64
# Reset the index
s = s.reset_index()
# Print the result (now a DataFrame)
print(s)
# index 0
# 0 a 1
# 1 b 2
# 2 c 3
In this example, the original Series has an index of ['a', 'b', 'c']. When we call reset_index(), the current index is replaced with a default integer index starting from 0, and the old index is added as a column named 'index'. The result is a DataFrame whose second column, named 0, holds the Series values.
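Two parameters of Series.reset_index() are worth a quick illustration: name, which labels the values column of the resulting DataFrame, and drop=True, which keeps the result as a Series. The column name 'value' below is an arbitrary choice for this sketch:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# name= labels the values column; the result is a DataFrame
as_frame = s.reset_index(name='value')
print(type(as_frame).__name__)    # DataFrame
print(as_frame.columns.tolist())  # ['index', 'value']

# drop=True discards the old labels; the result stays a Series
as_series = s.reset_index(drop=True)
print(type(as_series).__name__)   # Series
print(as_series.index.tolist())   # [0, 1, 2]
```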
Example-5: Pandas reset_index() after sort_values()
In Pandas, when you sort a DataFrame or Series using the sort_values() function, the rows keep their original index labels, so the index appears in the sorted order rather than in sequence. If you want a sorted DataFrame or Series to have a fresh sequential index, you can call the reset_index() function after sort_values().
Here is an example of using the reset_index() function on a DataFrame after sorting.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [3, 2, 1], 'B': [6, 5, 4], 'C': [9, 8, 7]}, index=['c', 'b', 'a'])
# Sort the DataFrame by column 'A'
sorted_df = df.sort_values('A')
# Reset the index
sorted_df = sorted_df.reset_index()
# Print the sorted DataFrame
print(sorted_df)
# index A B C
# 0 a 1 4 7
# 1 b 2 5 8
# 2 c 3 6 9
In this example, the DataFrame is sorted by column 'A' using the sort_values() function, and the resulting DataFrame keeps the original index labels, reordered to follow the sort. The reset_index() function is then used to reset the index, and the original index values are added to the DataFrame as a new column named 'index'.
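When the old labels are not needed after sorting, sort_values() can renumber the rows itself through its ignore_index parameter. The sketch below assumes pandas 1.0 or later, where that parameter is available, and uses hypothetical labels 'x', 'y', 'z' to make the reordering visible:

```python
import pandas as pd

df = pd.DataFrame({'A': [3, 1, 2], 'B': [6, 4, 5]}, index=['x', 'y', 'z'])

# Sorting alone keeps the original labels, just in a new order
sorted_df = df.sort_values('A')
print(sorted_df.index.tolist())   # ['y', 'z', 'x']

# ignore_index=True renumbers the rows 0..n-1 in the same call
renumbered = df.sort_values('A', ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2]
print(renumbered['A'].tolist())   # [1, 2, 3]
```

This gives the same index as sort_values('A').reset_index(drop=True), since the old labels are discarded rather than kept as a column.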
Example-6: Pandas reset_index() after filter
In Pandas, when you filter a DataFrame using query() or Boolean indexing, the resulting DataFrame contains only the rows that match the filter condition, and those rows keep their original index labels.
If you want to reset the index of the filtered DataFrame, you can use the reset_index() function after applying the filter.
Here is an example of using the reset_index() function on a DataFrame after filtering with the query() function.
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])
# Filter the DataFrame
filtered_df = df.query("A > 1")
# Reset the index
filtered_df = filtered_df.reset_index()
# Print the filtered DataFrame
print(filtered_df)
# index A B C
# 0 b 2 5 8
# 1 c 3 6 9
In this example, the DataFrame is filtered with the query() function so that only rows where A > 1 are selected. The resulting DataFrame keeps the original index labels of the matching rows ('b' and 'c'). The reset_index() function is then used to reset the index, and the original index values are added to the DataFrame as a new column named 'index'.
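The same pattern works with Boolean indexing instead of query(). This sketch filters with a mask and then uses drop=True, on the assumption that the old labels are not needed afterwards:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=['a', 'b', 'c'])

# A Boolean mask keeps the matching rows along with their original labels
filtered = df[df['A'] > 1]
print(filtered.index.tolist())  # ['b', 'c']

# Renumber from 0 and discard the old labels
clean = filtered.reset_index(drop=True)
print(clean.index.tolist())     # [0, 1]
print(clean['A'].tolist())      # [2, 3]
```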
Summary
Pandas' reset_index() function is used to reset the index of a DataFrame or Series. Calling this function replaces the current index with the default 0-based integer index and, unless you pass drop=True, adds the old index as a new column named 'index'.
It is useful when the current index is no longer needed or when a fresh sequential index is required. For example, sorting a DataFrame leaves the original index labels in the sorted order, and a groupby() aggregation moves the grouping columns into the index. In both cases, you can call reset_index() to restore the default integer index.
It is also used after filtering a DataFrame. When you filter with the query() function or Boolean indexing, the matching rows keep their original index labels, which can leave gaps in the index. To keep the filtered DataFrame in a consistent format, you can reset the index so that it starts from 0 again.
Additionally, a clean sequential index makes rows easier to identify and select by position, which keeps the DataFrame readable and consistent.
In summary, the reset_index() function resets the index of a DataFrame to the default integer index starting from 0 and, unless drop=True is passed, keeps the old index as a new column. It is typically used after sorting, grouping, filtering, and other operations that change or reorder the index, to return the index to a consistent, readable format.