5 ways to select multiple columns in a pandas DataFrame


Different methods to select multiple columns in pandas DataFrame

In this tutorial we will discuss how to select multiple columns using the following methods:

  • Using column names with []
  • Using the columns attribute
  • Using the loc[] indexer
  • Using the iloc[] indexer
  • Using the drop() method

 

Create pandas DataFrame with example data

A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table, with the data arranged in rows and columns. Rows represent the records (tuples) and columns represent the attributes.

We can create a DataFrame by using the pandas.DataFrame() constructor.

Syntax:

pandas.DataFrame(input_data, index=index, columns=columns)

Parameters:

It mainly takes three parameters:

  1. input_data is the data to store, such as a list or a dictionary
  2. index represents the row labels
  3. columns represents the column names for the data

We can also create a DataFrame from a dictionary without passing columns or index. In that case the dictionary keys become the column names and the rows are numbered from 0.

 

Example: Python program to create a DataFrame for market data from a dictionary of food items by specifying the row labels.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display the dataframe
print(dataframe)

Output:

            id            name    cost  quantity
item-1  foo-23  ground-nut oil  567.00         1
item-2  foo-13         almonds  562.56         2
item-3  foo-02           flour   67.00         3
item-4  foo-31         cereals   76.09         2
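
The example above builds the DataFrame from a dictionary. As a minimal sketch of the list-based form, the same kind of data can be passed as a list of rows, with the column names and row labels given through the columns and index keywords. The variable name rows and the two sample rows are chosen here for illustration only.

```python
# import the module
import pandas

# each inner list is one row: id, name, cost, quantity
rows = [
    ['foo-23', 'ground-nut oil', 567.00, 1],
    ['foo-13', 'almonds', 562.56, 2],
]

# pass column names and row labels by keyword, so their order does not matter
dataframe = pandas.DataFrame(
    rows,
    columns=['id', 'name', 'cost', 'quantity'],
    index=['item-1', 'item-2'],
)

# display the dataframe
print(dataframe)
```

Passing columns and index by keyword avoids mixing them up, since the second positional argument of pandas.DataFrame() is index, not columns.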

 

Method 1 : Select multiple columns using column name with []

In this method we select columns by passing their names inside []. To select multiple columns we have to pass a list of column names, which gives the double brackets [[]].

The result contains the selected columns along with all of their rows.

Syntax:

dataframe[['column', ..., 'column']]

where,

  1. dataframe is the input dataframe
  2. column is the column name

 

Example 1: Python program to select the id and name columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display id and name columns from the dataframe
print(dataframe[['id','name']])

Output:

            id            name
item-1  foo-23  ground-nut oil
item-2  foo-13         almonds
item-3  foo-02           flour
item-4  foo-31         cereals

 

Example 2: Python program to select the id, cost and quantity columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display id ,cost and quantity columns from the dataframe
print(dataframe[['id','cost','quantity']])

Output:

            id    cost  quantity
item-1  foo-23  567.00         1
item-2  foo-13  562.56         2
item-3  foo-02   67.00         3
item-4  foo-31   76.09         2
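
To make the difference between single and double brackets concrete, here is a small sketch that reuses the food data from this tutorial. It shows that a single name returns a Series, a list of names returns a DataFrame, the list decides the column order, and an unknown name raises a KeyError. The column name 'price' is a deliberately missing name used only for illustration.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# single brackets with one name return a Series
single = dataframe['name']
print(type(single).__name__)      # Series

# double brackets (a list of names) return a DataFrame, even for one column
double = dataframe[['name']]
print(type(double).__name__)      # DataFrame

# the order of the list decides the order of the columns in the result
reordered = dataframe[['quantity', 'id']]
print(list(reordered.columns))    # ['quantity', 'id']

# a name that is not a column raises KeyError
try:
    dataframe[['id', 'price']]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)             # True
```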

 

Method 2 : Select multiple columns using the columns attribute

columns is an attribute that returns the column labels of the pandas DataFrame. To get multiple columns we slice it with a range of column positions and pass the result to []. Indexing starts at 0, and the column at the stop position is not included.

Syntax:

dataframe[dataframe.columns[start_index:stop_index]]

where,

  1. dataframe is the input dataframe
  2. columns is the attribute that holds the column labels
  3. start_index refers to the position of the first column to select
  4. stop_index refers to the position just after the last column to select

 

Example 1: Python program to select name, cost and quantity columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost and quantity columns from the dataframe
print(dataframe[dataframe.columns[1:4]])

Output:

                  name    cost  quantity
item-1  ground-nut oil  567.00         1
item-2         almonds  562.56         2
item-3           flour   67.00         3
item-4         cereals   76.09         2

 

Example 2: Python program to select name and cost columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost  columns from the dataframe
print(dataframe[dataframe.columns[1:3]])

Output:

                  name    cost
item-1  ground-nut oil  567.00
item-2         almonds  562.56
item-3           flour   67.00
item-4         cereals   76.09
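
The slices above pick neighbouring columns. As a sketch of how the same approach might handle columns that are not next to each other, dataframe.columns can also be indexed with a list of positions or with a stepped slice. The variable names picked and every_other are illustrative only.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# a list of positions selects the columns at positions 0 and 2 (id and cost)
picked = dataframe[dataframe.columns[[0, 2]]]
print(list(picked.columns))        # ['id', 'cost']

# a stepped slice takes every second column, starting from position 0
every_other = dataframe[dataframe.columns[::2]]
print(list(every_other.columns))   # ['id', 'cost']
```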

 

Method 3 : Select multiple columns using the loc[] indexer

Here we use the loc[] indexer, which selects by label, to select multiple columns.

We need to specify the names of the columns to be selected inside loc[].

Syntax:

dataframe.loc[:,['column',........,'column']]

where,

  1. dataframe is the input dataframe
  2. column refers to the column names
  3. : selects all rows of those columns

 

Example 1: Python program to select name, cost and quantity columns.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost and quantity columns from the dataframe
print(dataframe.loc[:,['name','cost','quantity']])

Output:

                  name    cost  quantity
item-1  ground-nut oil  567.00         1
item-2         almonds  562.56         2
item-3           flour   67.00         3
item-4         cereals   76.09         2

 

Example 2: Python program to select name and cost columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost  columns from the dataframe
print(dataframe.loc[:,['name','cost']])

Output:

                  name    cost
item-1  ground-nut oil  567.00
item-2         almonds  562.56
item-3           flour   67.00
item-4         cereals   76.09
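
Besides a list of names, loc[] also accepts a slice of labels. The sketch below, using the same food data, shows that a label slice includes both its start and its end column, unlike the position slices used with the columns attribute. The variable name label_slice is illustrative only.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# select every column from name up to and including quantity
label_slice = dataframe.loc[:, 'name':'quantity']
print(list(label_slice.columns))   # ['name', 'cost', 'quantity']
```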

 

Method 4 : Select multiple columns using the iloc[] indexer

Here we use the iloc[] indexer, which selects by integer position, to select multiple columns.

We need to specify the positions of the columns to be selected inside iloc[].

Syntax:

dataframe.iloc[:, start_column_index:end_column_index]

where,

  1. dataframe is the input dataframe
  2. start_column_index refers to the position of the first column to select
  3. end_column_index refers to the position just after the last column to select (this column is not included)
  4. : selects all rows of those columns

 

Example 1: Python program to select name, cost and quantity columns.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost and quantity columns from the dataframe
print(dataframe.iloc[:,1:4])

Output:

                  name    cost  quantity
item-1  ground-nut oil  567.00         1
item-2         almonds  562.56         2
item-3           flour   67.00         3
item-4         cereals   76.09         2

 

Example 2: Python program to select name and cost columns

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost  columns from the dataframe
print(dataframe.iloc[:,1:3])

Output:

                  name    cost
item-1  ground-nut oil  567.00
item-2         almonds  562.56
item-3           flour   67.00
item-4         cereals   76.09
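
iloc[] is not limited to a start:end slice. As a short sketch with the same food data, a list of positions selects columns that are not next to each other, and a negative slice counts from the last column. The variable names by_position and last_two are illustrative only.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# a list of positions: column 0 (id) and column 3 (quantity)
by_position = dataframe.iloc[:, [0, 3]]
print(list(by_position.columns))   # ['id', 'quantity']

# a negative slice: the last two columns
last_two = dataframe.iloc[:, -2:]
print(list(last_two.columns))      # ['cost', 'quantity']
```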

 

Method 5 : Select multiple columns using drop() method

Here we remove the unwanted columns by using drop(), so that only the columns we want remain. In this way we can select multiple columns from the dataframe. drop() returns a new DataFrame and leaves the original unchanged.

Syntax:

dataframe.drop(['column'],axis=1)

where,

  1. dataframe is the input dataframe
  2. column refers to the name of a column to be dropped
  3. axis=1 tells drop() to remove columns rather than rows

 

Example 1: Python program to select name, cost and quantity columns by dropping the id column.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name ,cost and quantity columns from the dataframe
print(dataframe.drop(['id'],axis=1))

Output:

                  name    cost  quantity
item-1  ground-nut oil  567.00         1
item-2         almonds  562.56         2
item-3           flour   67.00         3
item-4         cereals   76.09         2

 

Example 2: Python program to select name and cost columns by dropping the id and quantity columns.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display name and cost columns from the dataframe
print(dataframe.drop(['id','quantity'],axis=1))

Output:

                  name    cost
item-1  ground-nut oil  567.00
item-2         almonds  562.56
item-3           flour   67.00
item-4         cereals   76.09
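
drop() also accepts a columns keyword, which can be used instead of passing axis=1. The sketch below uses the same food data to show that form, and checks that the original dataframe still has all four columns afterwards. The variable name kept is illustrative only.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# columns=[...] is equivalent to passing the list with axis=1
kept = dataframe.drop(columns=['id', 'quantity'])
print(list(kept.columns))        # ['name', 'cost']

# drop() returned a new DataFrame; the original is unchanged
print(list(dataframe.columns))   # ['id', 'name', 'cost', 'quantity']
```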

 

Summary

In this tutorial we discussed how to select multiple columns using [], the columns attribute, loc[], iloc[] and drop(). We observed that drop() is convenient when it is easier to name the few columns we do not want than to list every column we want to keep.
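
As a quick check under the same food data, the sketch below selects the name and cost columns with each of the five methods and compares the results with DataFrame.equals(). The dictionary name results is illustrative only.

```python
# import the module
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# select name and cost with each of the five methods
results = {
    '[]': dataframe[['name', 'cost']],
    'columns': dataframe[dataframe.columns[1:3]],
    'loc[]': dataframe.loc[:, ['name', 'cost']],
    'iloc[]': dataframe.iloc[:, 1:3],
    'drop()': dataframe.drop(['id', 'quantity'], axis=1),
}

# compare every result with the first one
reference = results['[]']
for method, selected in results.items():
    print(method, selected.equals(reference))
```

Each line should report True, since all five calls return the same two columns with the same rows.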

 


Deepak Prasad
