4 ways to drop columns in pandas DataFrame



Different methods to drop columns in pandas DataFrame

In this tutorial we will discuss how to drop columns from a pandas DataFrame using the following methods:

  • Drop single/multiple columns using drop()
  • Drop single/multiple columns using drop() with the columns attribute
  • Drop single/multiple columns using drop() with the iloc[] indexer
  • Drop single/multiple columns using drop() with the loc[] indexer

 

Create pandas DataFrame with example data

A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table that stores data in rows and columns. Rows represent the records (tuples) and columns represent the attributes.

We can create a DataFrame by using the pandas.DataFrame() constructor.

Syntax:

pandas.DataFrame(input_data, index=index, columns=columns)

Parameters:

It mainly takes three parameters:

  1. input_data is the data to store, for example a list of rows or a dictionary
  2. columns represents the column names for the data
  3. index represents the row labels
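As a minimal sketch of these three parameters, the snippet below builds a small DataFrame from a list of rows. The row values and the index labels r1 and r2 are illustrative assumptions, not part of the tutorial's data set.

```python
import pandas

# Rows supplied as a list of lists; the values are illustrative only
rows = [['foo-23', 'ground-nut oil', 567.00],
        ['foo-13', 'almonds', 562.56]]

# Pass columns and index by keyword so their order cannot be mixed up
items = pandas.DataFrame(rows,
                         columns=['id', 'name', 'cost'],
                         index=['r1', 'r2'])

print(items)
```

Passing columns and index by keyword is a deliberate choice here: when given positionally, pandas expects index before columns.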

We can also create a DataFrame from a dictionary and omit columns and index. The dictionary keys become the column names and the rows are numbered from 0.

Let’s see an example.

Example:

Python program to create a DataFrame of market data from a dictionary of food items

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#display the dataframe
print(dataframe)

Output:

       id            name    cost  quantity
0  foo-23  ground-nut oil  567.00         1
1  foo-13         almonds  562.56         2
2  foo-02           flour   67.00         3
3  foo-31         cereals   76.09         2

 

Method 1: Drop single/multiple columns using drop()

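DataFrame.drop() in pandas removes columns from a DataFrame. It returns a new DataFrame and leaves the original unchanged unless inplace=True is passed.

We have to provide axis=1, which tells drop() that the labels refer to columns rather than rows.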

Syntax:

dataframe.drop(['column'],axis=1)

where,

  1. dataframe is the input dataframe
  2. column is the name of the column to be dropped/removed

Example:

In this example, we are going to drop the name column

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
print(dataframe.drop(['name'],axis=1))

Output:

       id    cost  quantity
0  foo-23  567.00         1
1  foo-13  562.56         2
2  foo-02   67.00         3
3  foo-31   76.09         2

If we want to drop multiple columns, we pass a list with the column names separated by commas.

Example:

In this example, we are going to remove id, name and quantity

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id , name and quantity
print(dataframe.drop(['name','id','quantity'],axis=1))

Output:

     cost
0  567.00
1  562.56
2   67.00
3   76.09
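The following sketch, which reuses the tutorial's food data, illustrates three details of drop() that the examples above rely on: the original DataFrame is not modified, the columns keyword is an alternative to axis=1, and errors='ignore' skips missing labels. The column name discount is a hypothetical label that does not exist in the data.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# drop() returns a new DataFrame; the original keeps all four columns
without_name = dataframe.drop(['name'], axis=1)
print(list(dataframe.columns))      # ['id', 'name', 'cost', 'quantity']
print(list(without_name.columns))   # ['id', 'cost', 'quantity']

# columns= is an alternative to passing labels together with axis=1
same_result = dataframe.drop(columns=['name'])
print(without_name.equals(same_result))

# errors='ignore' skips labels that do not exist instead of raising KeyError
# ('discount' is a hypothetical column that is not in the data)
tolerant = dataframe.drop(columns=['name', 'discount'], errors='ignore')
print(list(tolerant.columns))
```

To keep the result, assign it to a variable as shown; printing the return value alone does not change dataframe.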

 

Method 2: Drop single/multiple columns using drop() with columns attribute

DataFrame.drop() removes columns from the pandas DataFrame. Here we use the columns attribute to look up column labels by position; positions start at 0.

We have to provide axis=1, which tells drop() that the labels refer to columns.

Syntax:

dataframe.drop(dataframe.columns[[index]],axis=1)

where,

  1. dataframe is the input dataframe
  2. index represents the column position

Example:

In this example, we are going to drop the id column

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop id column
print(dataframe.drop(dataframe.columns[[0]],axis=1))

Output:

             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2

If we want to drop multiple columns, we have to specify the column positions separated by commas.

Example:

In this example, we are going to remove id, name and cost

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop id , name and cost
print(dataframe.drop(dataframe.columns[[0, 1,2]],axis=1))

Output:

   quantity
0         1
1         2
2         3
3         2
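To make the positional lookup more concrete, here is a small sketch using the same food data. It shows that dataframe.columns[[...]] returns the labels at those positions and that negative positions count from the end. The variable names labels and without_last are illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# Indexing columns with a list of positions returns an Index of labels
labels = dataframe.columns[[0, 2]]
print(list(labels))   # ['id', 'cost']

# Those labels are what drop() actually receives
print(dataframe.drop(labels, axis=1))

# Negative positions count from the end, so -1 is the last column
without_last = dataframe.drop(dataframe.columns[[-1]], axis=1)
print(list(without_last.columns))   # ['id', 'name', 'cost']
```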

 

Method 3: Drop single/multiple columns using drop() with iloc[] indexer

DataFrame.drop() removes columns from the pandas DataFrame. Here we use the iloc[] indexer to select columns by position; positions start at 0. Passing the selected columns to drop() works because iterating over a DataFrame yields its column labels.

We have to provide axis=1, which tells drop() that the labels refer to columns.

Syntax:

dataframe.drop(dataframe.iloc[:, index_slice],axis=1)

where,

  1. dataframe is the input dataframe
  2. index_slice represents the column positions from the start index up to, but not including, the end index

Example:

In this example, we are going to drop the id column

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop id column
print(dataframe.drop(dataframe.iloc[:, 0:1],axis=1))

Output:

             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2

If we want to drop multiple columns, we widen the slice so that it covers all the column positions to remove. The end position of an iloc[] slice is excluded, so 0:3 covers positions 0, 1 and 2.

Example:

In this example, we are going to remove id, name and cost

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop  id,name and cost
print(dataframe.drop(dataframe.iloc[:, 0:3],axis=1))

Output:

   quantity
0         1
1         2
2         3
3         2
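The sketch below is a variation on the iloc[] approach rather than the tutorial's exact call: it takes .columns from the slice so that drop() is explicitly given labels, and it shows a stepped slice. The variable names selected, remaining and every_other are illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# iloc slices exclude the end position: 1:3 selects positions 1 and 2
selected = dataframe.iloc[:, 1:3]
print(list(selected.columns))    # ['name', 'cost']

# Taking .columns makes it explicit that drop() is given labels
remaining = dataframe.drop(selected.columns, axis=1)
print(list(remaining.columns))   # ['id', 'quantity']

# A step of 2 selects every other column (positions 0 and 2)
every_other = dataframe.drop(dataframe.iloc[:, ::2].columns, axis=1)
print(list(every_other.columns))   # ['name', 'quantity']
```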

 

Method 4: Drop single/multiple columns using drop() with loc[] indexer

DataFrame.drop() removes columns from the pandas DataFrame. Here we use the loc[] indexer to select columns by name.

We have to provide axis=1, which tells drop() that the labels refer to columns.

Syntax:

dataframe.drop(dataframe.loc[:, column_slice],axis=1)

where,

  1. dataframe is the input dataframe
  2. column_slice represents the column names from the start column to the end column; both ends are included

Example:

In this example, we are going to drop the id column

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop id column
print(dataframe.drop(dataframe.loc[:, :'id'],axis=1))

Output:

             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2

If we want to drop multiple columns, we specify a slice from the first column name to the last column name to remove. Unlike iloc[], a loc[] slice includes the end label.

Example:

In this example, we are going to remove id, name and cost

#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)

#drop  id,name and cost
print(dataframe.drop(dataframe.loc[:, 'id':'cost'],axis=1))

Output:

   quantity
0         1
1         2
2         3
3         2
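As a short sketch of label-based slicing with loc[], the snippet below shows that the end label is included and that a slice can be left open-ended. It again takes .columns from the slice, which is a variation on the tutorial's call, and the variable names are illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# loc slices are label-based and include the end label
selected = dataframe.loc[:, 'name':'cost']
print(list(selected.columns))    # ['name', 'cost']

remaining = dataframe.drop(selected.columns, axis=1)
print(list(remaining.columns))   # ['id', 'quantity']

# Open-ended slice: everything from 'cost' through the last column
from_cost = dataframe.drop(dataframe.loc[:, 'cost':].columns, axis=1)
print(list(from_cost.columns))   # ['id', 'name']
```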

 

Summary

In this tutorial we discussed how to drop columns from a pandas DataFrame using the drop() function. We also combined it with the loc[] and iloc[] indexers and the columns attribute. With each of these approaches we can drop a single column or multiple columns at a time.
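As a quick check, the sketch below drops id, name and cost with each of the four approaches and compares the results with DataFrame.equals(). The iloc[] and loc[] variants pass .columns explicitly, and the variable names are illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# Method 1: column names
by_label = dataframe.drop(['id', 'name', 'cost'], axis=1)
# Method 2: positions looked up through the columns attribute
by_position = dataframe.drop(dataframe.columns[[0, 1, 2]], axis=1)
# Method 3: positional slice (end position excluded)
by_iloc = dataframe.drop(dataframe.iloc[:, 0:3].columns, axis=1)
# Method 4: label slice (end label included)
by_loc = dataframe.drop(dataframe.loc[:, 'id':'cost'].columns, axis=1)

# All four results contain only the quantity column
all_equal = (by_label.equals(by_position)
             and by_label.equals(by_iloc)
             and by_label.equals(by_loc))
print(all_equal)
```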

 


 

 

Deepak Prasad
