How to iterate over rows in Pandas DataFrame [5 methods]

Different methods to iterate over rows in Pandas DataFrame

In this tutorial we will discuss how to iterate over rows in the Pandas DataFrame using the following methods:

  • Using index attribute
  • Using loc[] function
  • Using iloc[] function
  • Using iterrows() method
  • Using itertuples() method

 

Define DataFrame & Creation with example data

Pandas is a Python library for data analysis and data processing. It stores data in an organized, labeled format. The name pandas is derived from "panel data", an econometrics term for multidimensional structured data sets.

A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table that stores data in rows and columns: rows represent the records (tuples) and columns represent the attributes.

We can create a DataFrame with the pandas.DataFrame() constructor.

Syntax:

pandas.DataFrame(data=input_data, columns=columns, index=index)

Parameters:

It mainly takes three parameters:

  1. input_data is the data to store, for example a list of rows or a dictionary of columns
  2. columns holds the column names for the data
  3. index holds the row labels
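
As a rough sketch of how the three parameters fit together, the snippet below builds a small DataFrame from a list of rows. The row values and the labels 'a' and 'b' are made up for illustration, and the parameters are passed by keyword so that their order does not matter.

```python
import pandas

# Hypothetical input: each inner list is one row (id, name)
input_data = [['foo-23', 'ground-nut oil'],
              ['foo-13', 'almonds']]

# columns names the two columns, index labels the two rows
dataframe = pandas.DataFrame(input_data,
                             columns=['id', 'name'],
                             index=['a', 'b'])

print(dataframe)
```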

We can also create a DataFrame from a dictionary and skip columns and index: the dictionary keys become the column names, and the rows get the default integer index starting at 0.

Let’s see an example.

 

Example:

Python Program to create a dataframe for market data from a dictionary of food items

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#display the dataframe
print(dataframe)

Output:

       id            name    cost  quantity
0  foo-23  ground-nut oil  567.00         1
1  foo-13         almonds  562.56         2
2  foo-02           flour   67.00         3
3  foo-31         cereals   76.09         2

 

Method-1: Using index attribute

Here, we use the index attribute to iterate over the rows and read values by column name. The index attribute returns the row labels (the index) of the DataFrame.

Syntax:

dataframe.index
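
To make the return value concrete, here is a small sketch that prints the index of two DataFrames: one with the default index and one with custom row labels. The labels 'x' and 'y' and the variable names are assumptions chosen for this illustration.

```python
import pandas

# Default index: pandas numbers the rows 0, 1, 2, ...
default_frame = pandas.DataFrame({'name': ['almonds', 'flour']})
print(list(default_frame.index))    # [0, 1]

# Custom index: the row labels are whatever we pass in (hypothetical labels)
labelled_frame = pandas.DataFrame({'name': ['almonds', 'flour']},
                                  index=['x', 'y'])
print(list(labelled_frame.index))   # ['x', 'y']
```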

We use a for loop over dataframe.index to visit every row and read the columns we need.

Syntax:

for iterator in dataframe.index:
    print(dataframe['column'][iterator], ...)

Where,

  1. dataframe is the input DataFrame
  2. iterator refers to the current index value (row label)
  3. column is the name of the column whose value is read for each row

Example:

In this example, we iterate over the rows and print the id and name columns.

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#iterate over rows id and name column using index
for i in dataframe.index:
     print(dataframe['id'][i], dataframe['name'][i])

Output:

foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals

 

Method-2: Using loc[] function

Here we use loc[]. loc[] stands for location and is label-based: it returns a value using the row label (index value) together with the column name. With the default index, row labels start at 0.

Syntax:

dataframe.loc[index, "column"]

where,

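  1. index is the label of the row to be returned
  2. column is the name of the column whose value is returned for that row

If we want to iterate through the entire DataFrame, we use a for loop and pass the loop variable in place of index. Looping over range(len(dataframe)) works here because the DataFrame has the default integer index (0, 1, 2, ...); with custom row labels, loop over dataframe.index instead.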

Syntax:

for iterator in range(len(dataframe)):
    print(dataframe.loc[iterator, "column"], ...)

Example:

In this example, we iterate over the rows and print the id and name columns.

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#iterate over rows id and name column using loc[] function
for i in range(len(dataframe)) :
  print(dataframe.loc[i, "id"], dataframe.loc[i, "name"])

Output:

foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
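
Because loc[] looks rows up by label, a loop over range(len(dataframe)) only works with the default integer index. The sketch below assumes a DataFrame whose row labels are the item ids (a layout invented for this illustration) and loops over dataframe.index so that every value passed to loc[] is a real label.

```python
import pandas

# Hypothetical layout: the item ids are used as row labels
dataframe = pandas.DataFrame({'name': ['almonds', 'flour'],
                              'quantity': [2, 3]},
                             index=['foo-13', 'foo-02'])

# Loop over the labels themselves, not over positions 0..n-1
for label in dataframe.index:
    print(label, dataframe.loc[label, 'name'], dataframe.loc[label, 'quantity'])
```

With this index, dataframe.loc[0, 'name'] would raise a KeyError, since 0 is not one of the row labels.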

 

Method-3: Using iloc[] function

Here we use iloc[]. iloc[] stands for integer location: it returns a value using the integer position of the row together with the integer position of the column. Both row and column positions start at 0.

Syntax:

dataframe.iloc[row_index,column_index]

where,

  1. row_index is the integer position of the row to be returned
  2. column_index is the integer position of the column whose value is returned for that row

If we want to iterate through the entire DataFrame, we use a for loop and pass the loop variable in place of row_index.

Syntax:

for iterator in range(len(dataframe)):
    print(dataframe.iloc[iterator, column_index], ...)

Example:

In this example, we iterate over the rows and print the id and name columns.

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#iterate over rows id and name column using iloc[] function
for i in range(len(dataframe)) :
  print(dataframe.iloc[i, 0], dataframe.iloc[i, 1])

Output:

foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
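
Hard-coding column positions such as 0 and 1 can be fragile if the column order changes. One possible variation, sketched below, looks the position up from the column name with dataframe.columns.get_loc() before the loop. The variable name name_position is an assumption made for this example.

```python
import pandas

dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'name': ['ground-nut oil', 'almonds'],
                              'cost': [567.00, 562.56]})

# Translate the column name into its integer position (1 for 'name' here)
name_position = dataframe.columns.get_loc('name')

# iloc[] takes integer positions for both the row and the column
for i in range(len(dataframe)):
    print(dataframe.iloc[i, 0], dataframe.iloc[i, name_position])
```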

 

Method-4: Using iterrows() method

Another option is the iterrows() method, which iterates over the DataFrame and yields each row as an (index, row) pair, where row is a pandas Series.

Syntax:

dataframe.iterrows()

To iterate over the entire DataFrame, we call this method in a for loop and unpack each pair into two loop variables.

Syntax:

for index, iterator in dataframe.iterrows():
    print(iterator["column"], ...)

where,

  1. dataframe is the input DataFrame
  2. index refers to the row index (label)
  3. iterator is the current row, returned as a Series
  4. column is the name of the column whose value is read from the row

Example:

In this example, we iterate over the rows and print the id and name columns.

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#iterate over rows id and name column using iterrows() method
for index, i in dataframe.iterrows():
    print (i["id"], i["name"])

Output:

foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
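
The following sketch looks a little closer at what iterrows() hands back on each step. It assumes two small DataFrames made up for this illustration: the first shows the index and the Series row, and the second shows that a row taken from all-numeric columns is converted to a single dtype, so integers come back as floats.

```python
import pandas

dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'quantity': [1, 2]})

# Each step yields the row label and the row itself as a Series
for index, row in dataframe.iterrows():
    print(index, type(row).__name__, row['id'], row['quantity'])

# Assumed all-numeric data: the int column is upcast to float in each row
numbers = pandas.DataFrame({'quantity': [1, 2], 'cost': [567.00, 562.56]})
_, first_row = next(numbers.iterrows())
print(first_row.dtype)
```

If the original dtypes matter, itertuples() is usually the safer choice because it returns the column values as they are.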

 

Method-5: Using itertuples() method

The last method is itertuples(), which iterates over the DataFrame and yields each row as a named tuple. The first field of the tuple is the index, and the remaining fields are the column values.

Syntax:

dataframe.itertuples()

To iterate over the entire DataFrame, we call this method in a for loop. We can read a field with the getattr() function or with plain attribute access such as iterator.column.

Syntax:

for iterator in dataframe.itertuples():
    print(getattr(iterator, "column"), ...)

where,

  1. dataframe is the input DataFrame
  2. iterator is the current row, returned as a named tuple
  3. column is the name of the column whose value is read from the row

Example:

In this example, we iterate over the rows and print the id and name columns.

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#iterate over rows id and name column using itertuples() method
for i in dataframe.itertuples():
    print (getattr(i, "id"), getattr(i, "name"))

Output:

foo-23 ground-nut oil
foo-13 almonds
foo-02 flour
foo-31 cereals
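
As an alternative to getattr(), the fields of each named tuple can be read with plain attribute access. The sketch below assumes a two-row version of the food data and also shows the index and name parameters of itertuples(). The tuple name 'Food' is chosen only for this example.

```python
import pandas

dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'name': ['ground-nut oil', 'almonds']})

# Default: each row is a named tuple whose first field is Index
for row in dataframe.itertuples():
    print(row.Index, row.id, row.name)

# index=False drops the Index field; name sets the tuple's type name
for row in dataframe.itertuples(index=False, name='Food'):
    print(row)
```

getattr() remains useful when the column name is only known at run time, for example when it is stored in a variable.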

 

Summary

In this tutorial, we saw that the Pandas module lets us organize data in a DataFrame, and we discussed how to iterate over its rows using the index attribute, loc[], iloc[], and the iterrows() and itertuples() methods. We also saw that these approaches let us read values from one or many columns at a time. In each case we used a for loop to iterate over all rows of the DataFrame.

 

 
