Different methods to drop columns in pandas DataFrame
In this tutorial we will discuss how to drop columns in a pandas DataFrame using the following methods:
- Drop single/multiple columns using drop()
- Drop single/multiple columns using drop() with the columns attribute
- Drop single/multiple columns using drop() with iloc[]
- Drop single/multiple columns using drop() with loc[]
Create pandas DataFrame with example data
A DataFrame is a data structure that stores data in a two-dimensional format, similar to a table with rows and columns. Rows represent the records/tuples and columns represent the attributes.
We can create a DataFrame by using the pandas.DataFrame() constructor.
Syntax:
pandas.DataFrame(input_data,columns=columns,index=index)
Parameters:
It mainly takes three parameters:
- input_data is the data to store, such as a list or a dictionary
- columns represents the column names for the data
- index represents the row labels
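As a minimal sketch of how the three parameters fit together, the snippet below builds a DataFrame from a list of rows. The row labels item-1 and item-2 are made up for illustration and do not come from the tutorial's data.

```python
import pandas

# Each inner list is one record: id, name, cost
rows = [['foo-23', 'ground-nut oil', 567.00],
        ['foo-13', 'almonds', 562.56]]

# columns names the attributes, index labels the rows
# ('item-1' and 'item-2' are hypothetical labels)
dataframe = pandas.DataFrame(rows,
                             columns=['id', 'name', 'cost'],
                             index=['item-1', 'item-2'])
print(dataframe)
```

Passing columns and index as keywords avoids depending on the positional order of the constructor's arguments.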
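We can also create a DataFrame from a dictionary without passing columns or index. The dictionary keys become the column names and the rows get a default integer index starting at 0.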
Let’s see an example.
Example:
Python program to create a DataFrame of food items from a dictionary
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#display the dataframe
print(dataframe)
Output:
       id            name    cost  quantity
0  foo-23  ground-nut oil  567.00         1
1  foo-13         almonds  562.56         2
2  foo-02           flour   67.00         3
3  foo-31         cereals   76.09         2
Method 1: Drop single/multiple columns using drop()
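drop() is a pandas DataFrame method that removes the given rows or columns and returns a new DataFrame. The original DataFrame stays unchanged unless the result is assigned back to it.
We have to pass axis=1 so that drop() treats the given labels as column names; the default, axis=0, refers to rows.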
Syntax:
dataframe.drop(['column'],axis=1)
where,
- dataframe is the input dataframe
- column is the name of the column to be dropped/removed
Example:
In this example, we are going to drop the name column
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
print(dataframe.drop(['name'],axis=1))
Output:
       id    cost  quantity
0  foo-23  567.00         1
1  foo-13  562.56         2
2  foo-02   67.00         3
3  foo-31   76.09         2
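A point worth illustrating here, as a small sketch reusing the food data: drop() hands back a new DataFrame rather than changing the one it is called on. The variable name reduced is only illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# drop() returns a new DataFrame without the name column
reduced = dataframe.drop(['name'], axis=1)

# The original still has all four columns
print(list(dataframe.columns))  # ['id', 'name', 'cost', 'quantity']
print(list(reduced.columns))    # ['id', 'cost', 'quantity']
```

To keep the change, assign the result to a variable (or back to dataframe itself), as done with reduced above.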
If we want to drop multiple columns, we put all of their names in the list passed to drop(), separated by commas.
Example:
In this example, we are going to remove id, name and quantity
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id , name and quantity
print(dataframe.drop(['name','id','quantity'],axis=1))
Output:
     cost
0  567.00
1  562.56
2   67.00
3   76.09
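drop() also accepts a columns keyword, which can stand in for the axis=1 form used above, and an errors parameter. The sketch below assumes a column name discount that does not exist in the data, purely to show how errors='ignore' behaves.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# columns=[...] removes the same columns as drop([...], axis=1)
result = dataframe.drop(columns=['name', 'id', 'quantity'])
print(list(result.columns))  # ['cost']

# 'discount' is a hypothetical column that is not in the data.
# With errors='ignore', missing labels are skipped instead of raising KeyError.
safe = dataframe.drop(columns=['name', 'discount'], errors='ignore')
print(list(safe.columns))  # ['id', 'cost', 'quantity']
```

Without errors='ignore', asking drop() to remove a label that is not present raises a KeyError.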
Method 2: Drop single/multiple columns using drop() with the columns attribute
drop() removes the columns from the pandas DataFrame. Here we use the columns attribute of the DataFrame to look up column labels by position; positions start at 0.
We have to pass axis=1, which specifies that columns are to be dropped.
Syntax:
dataframe.drop(dataframe.columns[[index]],axis=1)
where,
- dataframe is the input dataframe
- index represents the column position
Example:
In this example, we are going to drop the id column
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id column
print(dataframe.drop(dataframe.columns[[0]],axis=1))
Output:
             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2
If we want to drop multiple columns, we list all of their positions inside the inner brackets, separated by commas.
Example:
In this example, we are going to remove id, name and cost
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id , name and cost
print(dataframe.drop(dataframe.columns[[0, 1,2]],axis=1))
Output:
   quantity
0         1
1         2
2         3
3         2
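The positions passed to dataframe.columns do not have to be adjacent, and negative positions count from the end. The following sketch reuses the food data to drop the first and last columns; the variable names labels and result are only illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# Position 0 is the first column, position -1 is the last one
labels = dataframe.columns[[0, -1]]
print(list(labels))  # ['id', 'quantity']

# drop() receives the looked-up labels, not the positions themselves
result = dataframe.drop(labels, axis=1)
print(list(result.columns))  # ['name', 'cost']
```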
Method 3: Drop single/multiple columns using drop() with iloc[]
drop() removes the columns from the pandas DataFrame. Here we use the iloc[] indexer to select the columns by position; positions start at 0. The selected columns are passed to drop(), which removes the columns with the same labels.
We have to pass axis=1, which specifies that columns are to be dropped.
Syntax:
dataframe.drop(dataframe.iloc[:, index_slice],axis=1)
where,
- dataframe is the input dataframe
- index_slice represents the column positions from the start index (included) to the end index (excluded)
Example:
In this example, we are going to drop the id column
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id column
print(dataframe.drop(dataframe.iloc[:, 0:1],axis=1))
Output:
             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2
If we want to drop multiple adjacent columns, we widen the slice so that it covers all of their positions. For example, 0:3 covers positions 0, 1 and 2.
Example:
In this example, we are going to remove id, name and cost
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id,name and cost
print(dataframe.drop(dataframe.iloc[:, 0:3],axis=1))
Output:
   quantity
0         1
1         2
2         3
3         2
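Iterating over a DataFrame yields its column labels, which is presumably why passing an iloc[] selection to drop() behaves as shown above. The sketch below makes that step explicit by converting the selection to a list of labels first, and compares it with slicing the columns attribute directly. The variable names are only illustrative.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# 0:3 selects positions 0, 1 and 2; the end position 3 is excluded
selected = dataframe.iloc[:, 0:3]

# Iterating over a DataFrame gives its column labels
labels = list(selected)
print(labels)  # ['id', 'name', 'cost']

# Dropping by the explicit labels, and by slicing the columns attribute
via_labels = dataframe.drop(labels, axis=1)
via_columns = dataframe.drop(dataframe.columns[0:3], axis=1)
print(list(via_labels.columns))   # ['quantity']
print(list(via_columns.columns))  # ['quantity']
```

Slicing dataframe.columns avoids building the intermediate selection of data when only the labels are needed.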
Method 4: Drop single/multiple columns using drop() with loc[]
drop() removes the columns from the pandas DataFrame. Here we use the loc[] indexer to select the columns by name. The selected columns are passed to drop(), which removes the columns with the same labels.
We have to pass axis=1, which specifies that columns are to be dropped.
Syntax:
dataframe.drop(dataframe.loc[:, column_slice],axis=1)
where,
- dataframe is the input dataframe
- column_slice represents the column names from the start column to the end column, both included
Example:
In this example, we are going to drop the id column
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id column
print(dataframe.drop(dataframe.loc[:, :'id'],axis=1))
Output:
             name    cost  quantity
0  ground-nut oil  567.00         1
1         almonds  562.56         2
2           flour   67.00         3
3         cereals   76.09         2
If we want to drop multiple adjacent columns, we give the first and last column names as the two ends of the slice. Both ends are included.
Example:
In this example, we are going to remove id, name and cost
#import the module
import pandas
#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
#pass this food to the dataframe
dataframe=pandas.DataFrame(food_input)
#drop id,name and cost
print(dataframe.drop(dataframe.loc[:, 'id':'cost'],axis=1))
Output:
   quantity
0         1
1         2
2         3
3         2
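One difference between the two indexers is easy to miss: a loc[] slice includes its end label, while an iloc[] slice excludes its end position. The short sketch below, again reusing the food data with illustrative variable names, shows the labels each slice selects.

```python
import pandas

food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}
dataframe = pandas.DataFrame(food_input)

# Label slice: both 'id' and 'cost' are included
by_label = list(dataframe.loc[:, 'id':'cost'])
print(by_label)  # ['id', 'name', 'cost']

# Position slice: position 2 ('cost') is excluded
by_position = list(dataframe.iloc[:, 0:2])
print(by_position)  # ['id', 'name']

# Dropping the label-selected columns leaves only quantity
result = dataframe.drop(columns=by_label)
print(list(result.columns))  # ['quantity']
```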
Summary
In this tutorial we discussed how to drop columns from a pandas DataFrame using the drop() method. We also combined drop() with the loc[] and iloc[] indexers and the columns attribute. With any of these approaches we can drop multiple columns at a time.