Different methods to add a column to an existing DataFrame in pandas
In this tutorial we will discuss how to add a column to an existing pandas DataFrame using the following methods:
- Using [] with None value
- Using [] with Constant value
- Using [] with values
- Using insert() method
- Using assign() method
- Using [] with NaN value
Create pandas DataFrame with example data
A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table that stores data in rows and columns. Rows represent the records (tuples) and columns represent the attributes.
We can create a DataFrame by using the pandas.DataFrame() constructor.
Syntax:
pandas.DataFrame(input_data, columns=columns, index=index)
Parameters:
It mainly takes three parameters:
- input_data is the input data, such as a list or a dictionary
- columns represents the column names for the data
- index represents the row labels
We can also create a DataFrame from a dictionary and skip the columns and index parameters. In that case the dictionary keys become the column names.
Example: Python program to create a dataframe for market data from a dictionary of food items by specifying the row labels.
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# display the dataframe
print(dataframe)
Output:
id name cost quantity
item-1 foo-23 ground-nut oil 567.00 1
item-2 foo-13 almonds 562.56 2
item-3 foo-02 flour 67.00 3
item-4 foo-31 cereals 76.09 2
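The constructor parameters described above can be sketched side by side as follows. This is a minimal illustration, not part of the original example: the names rows, from_list and from_dict are made up here, and the values are a subset of the food data.

```python
import pandas

# rows given as a list of lists (illustrative subset of the food data)
rows = [['foo-23', 'ground-nut oil', 567.00, 1],
        ['foo-13', 'almonds', 562.56, 2]]

# columns names the attributes, index labels the rows
from_list = pandas.DataFrame(rows,
                             columns=['id', 'name', 'cost', 'quantity'],
                             index=['item-1', 'item-2'])

# a dictionary needs neither parameter: the keys become the column names
# and the rows get the default labels 0, 1, ...
from_dict = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'cost': [567.00, 562.56]})

print(from_list)
print(from_dict)
```

Passing columns and index as keyword arguments avoids depending on their positional order in the constructor.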
Method 1 : Using [] with None value
In this method we add a column and fill it with None values using [].
Syntax:
dataframe['new_column']=None
where,
- dataframe is the input dataframe
- new_column is the new column name
- None is the value assigned to every row of the new column
Example: In this example we are going to add a column named stock and pass None values
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - empty column
dataframe['stock']=None
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 None
item-2 foo-13 almonds 562.56 2 None
item-3 foo-02 flour 67.00 3 None
item-4 foo-31 cereals 76.09 2 None
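A column of None values is usually a placeholder that gets filled later. The following sketch, which assumes a smaller dataframe than the one above and uses the made-up name all_missing, shows that pandas treats every entry as missing and that the column can be overwritten once the real values are known.

```python
import pandas

# small illustrative dataframe (subset of the food data)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'quantity': [1, 2]})

# add the placeholder column
dataframe['stock'] = None

# isna() reports None entries as missing
all_missing = bool(dataframe['stock'].isna().all())
print(all_missing)

# overwrite the placeholder column with real values
dataframe['stock'] = ['yes', 'no']
print(dataframe)
```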
Method 2 : Using [] with Constant value
In this method we add a column and fill it with a constant value using [].
Syntax:
dataframe['new_column']=value
where,
- dataframe is the input dataframe
- new_column is the new column name
- value is the constant value, which is the same for every row of the new column
Example: In this example we are going to add a column named stock and pass value - 45
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - 45 value
dataframe['stock']=45
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 45
item-2 foo-13 almonds 562.56 2 45
item-3 foo-02 flour 67.00 3 45
item-4 foo-31 cereals 76.09 2 45
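As a small sketch of how the constant is applied, the snippet below uses a reduced dataframe and the hypothetical names first_values and second_values. It shows that the single value is repeated for every row, and that assigning to a column name that already exists overwrites that column instead of adding another one.

```python
import pandas

# small illustrative dataframe (subset of the food data)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13', 'foo-02'],
                              'quantity': [1, 2, 3]})

# the constant is repeated for every row
dataframe['stock'] = 45
first_values = dataframe['stock'].tolist()
print(first_values)

# assigning to an existing column name overwrites that column
dataframe['stock'] = 'unknown'
second_values = dataframe['stock'].tolist()
print(second_values)
```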
Method 3 : Using [] with values
In this method we add a column and fill it with values from a list using []. The list must contain one value for each row of the dataframe.
Syntax:
dataframe['new_column']=[value, ..., value]
where,
- dataframe is the input dataframe
- new_column is the new column name
- each value in the list of values is assigned to the corresponding row of the column
Example: In this example we are going to add a column named stock and pass the list of values
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - stock
dataframe['stock']=['yes','no','no','yes']
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 yes
item-2 foo-13 almonds 562.56 2 no
item-3 foo-02 flour 67.00 3 no
item-4 foo-31 cereals 76.09 2 yes
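The list has to match the number of rows. The sketch below, which uses a one-column dataframe and the made-up flag failed, shows the ValueError that pandas raises when the list is too short, followed by a successful assignment with a list of the right length.

```python
import pandas

# four rows, so the new column needs four values
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31']})

try:
    # only two values for four rows
    dataframe['stock'] = ['yes', 'no']
    failed = False
except ValueError as error:
    failed = True
    print('could not add column:', error)

# a list with one value per row works
dataframe['stock'] = ['yes', 'no', 'no', 'yes']
print(dataframe)
```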
Method 4 : Using insert() method
Here, we use the insert() method to insert a new column at a particular location. insert() modifies the dataframe in place.
Syntax:
dataframe.insert(location, "new_column", [value, ..., value])
where,
- dataframe is the input dataframe
- location is an integer that gives the position of the new column, from 0 up to the current number of columns
- new_column is the name of the new column
- the last parameter is the list of values to be assigned to the new column
Example: In this example, we are going to add stock column and add values in last position
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - stock at last position
dataframe.insert(4,"stock", ['yes','no','no','yes'])
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 yes
item-2 foo-13 almonds 562.56 2 no
item-3 foo-02 flour 67.00 3 no
item-4 foo-31 cereals 76.09 2 yes
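The example above places the column last, which [] can do as well. The sketch below, using a reduced dataframe and the illustrative names result and duplicate_rejected, inserts the column at position 1 instead. It also shows that insert() returns None and that inserting a column name that already exists raises a ValueError by default.

```python
import pandas

# small illustrative dataframe (subset of the food data)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'name': ['ground-nut oil', 'almonds'],
                              'cost': [567.00, 562.56]})

# insert() changes the dataframe in place and returns None
result = dataframe.insert(1, 'stock', ['yes', 'no'])
print(result)
print(list(dataframe.columns))

# a second column with the same name is rejected by default
try:
    dataframe.insert(0, 'stock', ['no', 'no'])
    duplicate_rejected = False
except ValueError as error:
    duplicate_rejected = True
    print('insert failed:', error)
```

Because insert() returns None, writing dataframe = dataframe.insert(...) would replace the dataframe with None.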
Method 5 : Using assign() method
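assign() adds a new column by taking the column name as a keyword argument and the values as its value. It returns a new dataframe and leaves the original unchanged, so the result has to be assigned back to a variable.
Syntax:
dataframe.assign(new_column=[value, ..., value])
where,
- dataframe is the input dataframe
- new_column is the new column name, which takes a list of values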
Example: In this example, we are going to add stock column and add values.
# import the module
import pandas
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - stock
dataframe= dataframe.assign(stock= ['yes','no','no','yes'])
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 yes
item-2 foo-13 almonds 562.56 2 no
item-3 foo-02 flour 67.00 3 no
item-4 foo-31 cereals 76.09 2 yes
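The following sketch illustrates that assign() does not touch the original dataframe. It also passes a function instead of a list, which assign() accepts and calls with the dataframe; the total column and the names with_stock and with_total are hypothetical and only serve the illustration.

```python
import pandas

# small illustrative dataframe (subset of the food data)
dataframe = pandas.DataFrame({'cost': [567.00, 67.00],
                              'quantity': [1, 3]})

# assign() returns a new dataframe; the original keeps its columns
with_stock = dataframe.assign(stock=['yes', 'no'])
print(list(dataframe.columns))
print(list(with_stock.columns))

# a function is called with the dataframe and its result becomes the column
with_total = dataframe.assign(total=lambda frame: frame['cost'] * frame['quantity'])
print(with_total['total'].tolist())
```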
Method 6 : Using [] with NaN value
In this method we add a column and fill it with NaN values using []. NaN stands for Not a Number. It is available in the numpy package as numpy.nan, so we have to import the numpy module.
Syntax:
dataframe['new_column']=numpy.nan
where,
- dataframe is the input dataframe
- new_column is the new column name
- numpy.nan is the value assigned to every row of the new column
Example: In this example we are going to add a column named stock and pass NaN values.
# import the module
import pandas
import numpy
# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
'name':['ground-nut oil','almonds','flour','cereals'],
'cost':[567.00,562.56,67.00,76.09],
'quantity':[1,2,3,2]}
# pass this food to the dataframe by specifying rows
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])
# add column - stock
dataframe['stock']=numpy.nan
# display dataframe
print(dataframe)
Output:
id name cost quantity stock
item-1 foo-23 ground-nut oil 567.00 1 NaN
item-2 foo-13 almonds 562.56 2 NaN
item-3 foo-02 flour 67.00 3 NaN
item-4 foo-31 cereals 76.09 2 NaN
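A NaN column differs from a None column in that it is numeric. As a minimal sketch on a reduced dataframe, with the made-up names stock_dtype and missing_count, the snippet below checks the column type, counts the missing entries and then replaces them with fillna().

```python
import numpy
import pandas

# small illustrative dataframe (subset of the food data)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'quantity': [1, 2]})

# add the NaN column
dataframe['stock'] = numpy.nan

# NaN is a floating point value, so the column is numeric
stock_dtype = str(dataframe['stock'].dtype)
print(stock_dtype)

# count the missing entries
missing_count = int(dataframe['stock'].isna().sum())
print(missing_count)

# replace the placeholders with a number later on
dataframe['stock'] = dataframe['stock'].fillna(0)
print(dataframe['stock'].tolist())
```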
Summary
In this article, we discussed how to add a new column to an existing dataframe using [], insert() and assign(), filling it with a constant, NaN or None value or with a list of values. We have seen that insert() makes it possible to add the column at any position.