Different methods to iterate over rows in Pandas DataFrame
In this tutorial we will discuss how to iterate over rows in a Pandas DataFrame using the following methods:
- Using the index attribute
- Using the loc[] indexer
- Using the iloc[] indexer
- Using the iterrows() method
- Using the itertuples() method
Define DataFrame & Creation with example data
Pandas is a Python module used for data analysis and data processing. It stores data in an organized, labelled format. The name pandas is derived from "panel data", a term for multidimensional structured data sets.
A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table that stores data in rows and columns. Rows represent the records (tuples) and columns represent the attributes.
We can create a DataFrame by calling the pandas.DataFrame() constructor.
Syntax:
pandas.DataFrame(input_data, index=index, columns=columns)
Parameters:
The constructor mainly takes three parameters:
- input_data is the data to store, for example a list of rows or a dictionary
- index gives the row labels
- columns gives the column names
We can also create a DataFrame from a dictionary without passing columns or index. The dictionary keys become the column names and the rows get the default labels 0, 1, 2, and so on.
Let’s see an example.
Example:
Python Program to create a dataframe for market data from a dictionary of food items
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#display the dataframe
print(dataframe)
Output:
id name cost quantity
0 foo-23 ground-nut oil 567.00 1
1 foo-13 almonds 562.56 2
2 foo-02 flour 67.00 3
3 foo-31 cereals 76.09 2
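The columns and index parameters described above matter most when the input is a list of rows rather than a dictionary. A minimal sketch, assuming hypothetical row labels r1 to r4 that are not part of the original example:

```python
import pandas

# Each inner list is one row; the column names are supplied separately
rows = [['foo-23', 'ground-nut oil', 567.00, 1],
        ['foo-13', 'almonds', 562.56, 2],
        ['foo-02', 'flour', 67.00, 3],
        ['foo-31', 'cereals', 76.09, 2]]

# 'r1'..'r4' are hypothetical row labels chosen only for illustration
labelled = pandas.DataFrame(rows,
                            columns=['id', 'name', 'cost', 'quantity'],
                            index=['r1', 'r2', 'r3', 'r4'])

print(labelled.shape)
print(list(labelled.columns))
print(list(labelled.index))
```

Passing index and columns as keyword arguments avoids depending on their positional order in the constructor.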
Method-1: Using index attribute
Here, we are going to use the index attribute to iterate over the rows and read values by column name. The index attribute returns the row labels of the dataframe.
Syntax:
dataframe.index
We are going to use a for loop over the index to visit every row and read the chosen columns.
Syntax:
for iterator in dataframe.index:
    print(dataframe['column'][iterator], ...)
Where,
- dataframe is the input DataFrame
- iterator refers to the index value (row label) of the current row
- column is the name of the column whose value is read for that row
Example:
In this example, we iterate over the rows and print the values of the id and name columns.
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#iterate over the rows and print the id and name columns using the index
for i in dataframe.index:
    print(dataframe['id'][i], dataframe['name'][i])
Output:
foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
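To see what the loop above is iterating over, it can help to inspect the index directly. The sketch below rebuilds the same DataFrame, lists its row labels, and collects the id and name values into a list of pairs; the names labels and pairs are only illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# With no explicit index, pandas assigns integer labels starting at 0
labels = list(dataframe.index)
print(labels)

# Build one (id, name) pair per row label
pairs = [(dataframe['id'][i], dataframe['name'][i]) for i in dataframe.index]
print(pairs)
```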
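Method-2: Using loc[] indexer
Here we are going to use loc[]. loc is a label-based indexer, accessed with square brackets rather than called like a function, and it returns the value at a given row label and column name. With the default index the row labels are the integers 0, 1, 2, and so on, so they coincide with the row positions.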
Syntax:
dataframe.loc[index, "column"]
where,
- index is the label of the row to be returned
- column is the name of the column to read the value from
To iterate through the entire dataframe, use a for loop and pass the loop variable in place of index. Looping over range(len(dataframe)) works here because the default row labels run from 0 to len(dataframe) - 1.
Syntax:
for iterator in range(len(dataframe)):
    print(dataframe.loc[iterator, "column"], ...)
Example:
In this example, we iterate over the rows and print the values of the id and name columns.
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#iterate over the rows and print the id and name columns using loc[]
#range(len(dataframe)) matches the default row labels 0, 1, 2, 3
for i in range(len(dataframe)):
    print(dataframe.loc[i, "id"], dataframe.loc[i, "name"])
Output:
foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
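Because loc[] looks rows up by label, the range(len(dataframe)) loop only works while the labels are the default integers. The sketch below assumes a variant of the example in which the id column is used as the row index (a choice made here for illustration, not something the original example does) and iterates over the labels instead.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

# Hypothetical variant: use the 'id' column as the row labels
by_id = pandas.DataFrame(food_input).set_index('id')

# loc[] takes row labels, so iterate over the labels themselves
for label in by_id.index:
    print(label, by_id.loc[label, 'name'])

# An integer that is not one of the labels is rejected with KeyError
try:
    by_id.loc[0, 'name']
    integer_lookup_failed = False
except KeyError:
    integer_lookup_failed = True
print(integer_lookup_failed)
```

Iterating over dataframe.index rather than range(len(dataframe)) keeps a loc[] loop correct whatever the row labels are.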
Method-3: Using iloc[] indexer
Here we are going to use iloc[]. iloc stands for integer location: it is a position-based indexer that returns the value at a given row position and column position. Both positions start at 0.
Syntax:
dataframe.iloc[row_index, column_index]
where,
- row_index is the position of the row to be returned
- column_index is the position of the column to read the value from
To iterate through the entire dataframe, use a for loop and pass the loop variable in place of row_index.
Syntax:
for iterator in range(len(dataframe)):
    print(dataframe.iloc[iterator, column_index], ...)
Example:
In this example, we iterate over the rows and print the values of the id and name columns, which are at column positions 0 and 1.
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#iterate over the rows and print the id and name columns using iloc[]
for i in range(len(dataframe)):
    print(dataframe.iloc[i, 0], dataframe.iloc[i, 1])
Output:
foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
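Hard-coded column positions such as 0 and 1 break silently if the column order changes. As a sketch of one way around that, the column position can be looked up by name with columns.get_loc() before the loop. The example again assumes the hypothetical variant where id is the row index, to show that iloc[] ignores row labels entirely.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

# Hypothetical variant: 'id' becomes the row index, leaving three columns
by_id = pandas.DataFrame(food_input).set_index('id')

# Translate column names into integer positions once, before the loop
name_pos = by_id.columns.get_loc('name')
cost_pos = by_id.columns.get_loc('cost')

# iloc[] uses positions only, so range(len(...)) is always valid here
for i in range(len(by_id)):
    print(by_id.iloc[i, name_pos], by_id.iloc[i, cost_pos])
```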
Method-4: Using iterrows() method
We are going to use another method called iterrows(), which iterates over the rows of the dataframe and yields an (index, row) pair for each one, where row is a Series holding that row's values.
Syntax:
dataframe.iterrows()
To iterate over the entire dataframe, call this method in a for loop and unpack each pair into two loop variables.
Syntax:
for index, iterator in dataframe.iterrows():
    print(iterator["column"], ...)
where,
- dataframe is the input dataframe
- index is the label of the current row
- iterator is the current row as a Series
- column is the name of the column whose value is read from that row
Example:
In this example, we iterate over the rows and print the values of the id and name columns.
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#iterate over the rows and print the id and name columns using iterrows()
for index, i in dataframe.iterrows():
    print(i["id"], i["name"])
Output:
foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
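Since iterrows() hands back each row as a single Series, the values in a row share one dtype, and column dtypes are not necessarily preserved. The sketch below uses a cut-down, numeric-only version of the food data (an assumption made to expose the effect) where the integer quantity is converted to a float inside the loop.

```python
import pandas

# Numeric-only subset of the food data, chosen to make the effect visible
numeric = pandas.DataFrame({'cost': [567.00, 562.56],
                            'quantity': [1, 2]})

# In the DataFrame itself, 'quantity' is an integer column
print(numeric['quantity'].dtype)

for index, row in numeric.iterrows():
    # row is one Series holding both values, so it has one common dtype
    print(index, row.dtype, row['quantity'])
```

When the original column types matter, itertuples() is usually the safer choice because it returns each value individually.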
Method-5: Using itertuples() method
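We are going to use another method called itertuples(), which iterates over the rows of the dataframe and yields each row as a namedtuple. By default the first field of the tuple is the row index and the remaining fields are the column values.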
Syntax:
dataframe.itertuples()
To iterate over the entire dataframe, call this method in a for loop. The built-in getattr() function reads a field of the tuple by column name; when the column name is a valid Python identifier, attribute access such as iterator.column works as well.
Syntax:
for iterator in dataframe.itertuples():
    print(getattr(iterator, "column"), ...)
where,
- dataframe is the input dataframe
- iterator is the current row as a namedtuple
- column is the name of the column whose value is read from that row
Example:
In this example, we iterate over the rows and print the values of the id and name columns.
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#iterate over the rows and print the id and name columns using itertuples()
for i in dataframe.itertuples():
    print(getattr(i, "id"), getattr(i, "name"))
Output:
foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
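itertuples() also accepts the index and name parameters, which control whether the row index is included and what kind of tuple is returned. A short sketch of both forms on the same food data:

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# Default: each row is a namedtuple whose first field, Index, is the row label
first = next(dataframe.itertuples())
print(first.Index, first.id, first.name)

# index=False drops the index field; name=None yields plain tuples,
# which can be unpacked directly in the for statement
for item_id, name, cost, quantity in dataframe.itertuples(index=False, name=None):
    print(item_id, name, cost * quantity)
```

Plain tuples are convenient when every column is needed; the namedtuple form reads better when only a few fields are used.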
Summary
In this tutorial, we saw that the Pandas module lets us organize data in a DataFrame, and we discussed how to iterate over its rows using the index attribute, the loc[] and iloc[] indexers, and the iterrows() and itertuples() methods. We also saw that these approaches let us read values from one or many columns at a time. In each case we used a for loop to iterate over all rows of the dataframe.