Use Pandas DataFrame read_csv() as a Pro [Practical Examples]

Different methods to read_csv into pandas DataFrame

In this tutorial we will discuss the pandas read_csv() function, which loads the contents of a CSV file into a pandas DataFrame. We will look at how different read_csv() parameters change the resulting DataFrame:

  • Import csv to pandas DataFrame using read_csv()
  • read_csv() with first row as header
  • read_csv() with custom index
  • read_csv() with new column names
  • read_csv() with skip rows
  • Read first N rows from csv to pandas DataFrame
  • Import specific columns from csv to pandas DataFrame using read_csv()
  • read_csv() from absolute path
  • read_csv() from relative path

 

Sample CSV content

All the examples in this tutorial use the sample.csv file shown below. Note that the first field of the header row is empty, so the first column has no name:

,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2

Scenario-1 : Import csv to pandas DataFrame using read_csv()

Here we import the data from the sample.csv file shown above into a pandas DataFrame by calling read_csv() with only the file name and no other parameters.

Syntax:

pandas.read_csv("file_name.csv")

where, file_name is the name of the input csv file.

Example: In this example, we are going to import csv to pandas dataframe

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv")

#display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2
2     item-3  foo-02           flour   67.00         3
3     item-4  foo-31         cereals   76.09         2
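To reproduce the result above without creating a file on disk, the same CSV text can be passed to read_csv() through an in-memory buffer. This is a minimal sketch: the CSV_TEXT variable and the use of io.StringIO are assumptions made for illustration and are not part of the original example.

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# read_csv() accepts a file-like object as well as a file name
dataframe = pandas.read_csv(io.StringIO(CSV_TEXT))

# inspect the column names, the shape and the inferred column types
print(dataframe.columns.tolist())
print(dataframe.shape)
print(dataframe.dtypes)
```

The empty first header field is why pandas labels the first column Unnamed: 0.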

 

Scenario-2 : read_csv() with first row as header

Here we again call read_csv() with only the file name. By default, read_csv() treats the first row of the file as the header and uses its values as the column names. Because the first header field in sample.csv is empty, pandas names that column Unnamed: 0.


Syntax:

pandas.read_csv("file_name.csv")

where, file_name is the name of the input csv file.

 

Example: In this example, we are going to import csv to pandas dataframe

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv")

#display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2
2     item-3  foo-02           flour   67.00         3
3     item-4  foo-31         cereals   76.09         2
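To see what the default header handling does, it can help to compare it with header=None, which tells read_csv() that the file has no header row. The sketch below assumes the sample data is held in a string (CSV_TEXT, a name introduced here for illustration) so that it runs without a file on disk.

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# default behaviour: the first row supplies the column names
with_header = pandas.read_csv(io.StringIO(CSV_TEXT))

# header=None: columns are numbered and the first row is kept as data
without_header = pandas.read_csv(io.StringIO(CSV_TEXT), header=None)

print(with_header.shape)
print(without_header.shape)
print(without_header.columns.tolist())
```

With header=None the DataFrame has one extra row, because the line holding the column names is treated as ordinary data.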

 

Scenario-3 : read_csv() with  custom index

Here we use the index_col parameter to make one of the existing columns in the CSV file the index of the DataFrame.

Syntax:

pandas.read_csv("file_name.csv",index_col='column')

where,

  1. file_name is the name of the input csv file.
  2. index_col takes the name (or the integer position) of the column to use as the index

 

Example: In this example, we are going to import csv to pandas dataframe with quantity as custom index

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",index_col='quantity')

#display the dataframe
print(dataframe)

Output:

         Unnamed: 0      id            name    cost
quantity                                           
1            item-1  foo-23  ground-nut oil  567.00
2            item-2  foo-13         almonds  562.56
3            item-3  foo-02           flour   67.00
2            item-4  foo-31         cereals   76.09
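Since index_col also accepts a column position, the unnamed first column of the sample data can be used as the index by passing index_col=0. The following is a small sketch under the assumption that the data is read from a string (CSV_TEXT is a name introduced here for illustration).

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# use the first column (position 0) as the index
by_item = pandas.read_csv(io.StringIO(CSV_TEXT), index_col=0)

# use the quantity column as the index, as in the example above
by_quantity = pandas.read_csv(io.StringIO(CSV_TEXT), index_col="quantity")

print(by_item)

# rows can now be looked up by their label
print(by_item.loc["item-3", "name"])

# the quantity column contains the value 2 twice, so that index is not unique
print(by_item.index.is_unique, by_quantity.index.is_unique)
```

An index with repeated values is allowed, but a label lookup on it can return more than one row, so a column with unique values is usually a more convenient choice.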

 

Scenario-4 : read_csv() with new column names

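Here we use the names parameter to assign new column names to the DataFrame. It takes a list of column names. Note that when names is passed on its own, the original header row is not discarded and is read as the first data row. Because sample.csv has five columns and we pass only four names, pandas uses the first column as the index.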

Syntax:

pandas.read_csv("file_name.csv",names=[columns])

where,

  1. file_name is the name of the input csv file.
  2. columns is the list of new column names

 

Example: In this example, we are going to import csv to pandas dataframe with new columns

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",names=['ID','NAME','COST','QUANTITY'])

#display the dataframe
print(dataframe)

Output:

            ID            NAME    COST  QUANTITY
NaN         id            name    cost  quantity
item-1  foo-23  ground-nut oil   567.0         1
item-2  foo-13         almonds  562.56         2
item-3  foo-02           flour    67.0         3
item-4  foo-31         cereals   76.09         2
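In the output above the original header line ends up as a data row, which also stops the cost and quantity columns from being read as numbers. One way to replace the header instead of keeping it is to combine names with header=0. This is a sketch only: the new names (including ITEM for the first column) and the CSV_TEXT variable are assumptions made for illustration.

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# header=0 marks the first line as the header to be replaced by names,
# so it is not read as a data row; one name is given per column
dataframe = pandas.read_csv(
    io.StringIO(CSV_TEXT),
    names=["ITEM", "ID", "NAME", "COST", "QUANTITY"],
    header=0,
)

print(dataframe)

# the numeric columns are parsed as numbers again
print(dataframe["COST"].sum())
```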

 

Scenario-5 : read_csv() with skip rows

In this scenario, we use the skiprows parameter to skip the first n lines of the file. The header line counts as a line, so the first line that is not skipped becomes the header of the DataFrame.

Syntax:

pandas.read_csv("file_name.csv",skiprows=n)

where,

  1. file_name is the name of the input csv file.
  2. n is the number of lines to skip from the start of the file.

 

Example 1: In this example, we are going to import csv to pandas dataframe by skipping the first 2 lines (the header and the first data row), so the item-2 row is used as the header

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",skiprows=2)

#display the dataframe
print(dataframe)

Output:

   item-2  foo-13  almonds  562.56  2
0  item-3  foo-02    flour   67.00  3
1  item-4  foo-31  cereals   76.09  2

 

Example 2: In this example, we are going to import csv to pandas dataframe by skipping the first 4 lines. Only the last line remains and it is used as the header, so the DataFrame is empty

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",skiprows=4)

#display the dataframe
print(dataframe)

Output:

Empty DataFrame
Columns: [item-4, foo-31, cereals, 76.09, 2]
Index: []
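Both examples above lose the original header because an integer skiprows counts lines from the top of the file. skiprows also accepts a list of 0-based line numbers, which makes it possible to keep the header (line 0) and skip only selected data rows. A minimal sketch, assuming the sample data is read from a string (CSV_TEXT is a name introduced here for illustration):

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# skip lines 1 and 2 (the item-1 and item-2 rows) but keep line 0, the header
dataframe = pandas.read_csv(io.StringIO(CSV_TEXT), skiprows=[1, 2])

print(dataframe)
```

The resulting DataFrame keeps the id, name, cost and quantity column names and contains only the item-3 and item-4 rows.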

 

Scenario-6 : Read first N rows from csv to pandas DataFrame

nrows is the parameter used to read only the first n data rows of the CSV file into the pandas DataFrame. The header row is not counted.

Syntax:

pandas.read_csv("file_name.csv",nrows=n)

where,

  1. file_name is the name of the input csv file.
  2. n is the number of data rows to read from the top of the csv file.

 

Example 1: In this example, we are going to import csv to pandas dataframe by returning top 2 rows

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",nrows=2)

#display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2

 

Example 2: In this example, we are going to import csv to pandas dataframe by returning top 4 rows

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv",nrows=4)

#display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2
2     item-3  foo-02           flour   67.00         3
3     item-4  foo-31         cereals   76.09         2
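Two further uses of nrows are sketched below: nrows=0 reads only the header, which is a cheap way to check the column names, and nrows combined with a skiprows list reads a window of rows from the middle of the file. The CSV_TEXT variable is an assumption made for illustration so that the code runs without a file on disk.

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# nrows=0 returns an empty DataFrame that still carries the column names
header_only = pandas.read_csv(io.StringIO(CSV_TEXT), nrows=0)
print(header_only.columns.tolist())

# skip the first data row (line 1), then read the next two rows
window = pandas.read_csv(io.StringIO(CSV_TEXT), skiprows=[1], nrows=2)
print(window)
```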

 

Scenario-7 : Import specific columns from csv to pandas DataFrame using read_csv()

Here we keep only particular columns from the CSV file. After reading the file with read_csv(), we pass the resulting DataFrame to pandas.DataFrame() with the columns parameter, which takes the list of columns to keep.

Syntax:

pandas.DataFrame(dataframe,columns=[columns])

where, columns takes the list of column names to keep and dataframe is the DataFrame returned by read_csv()

 

Example 1: In this example, we are going to import csv to pandas dataframe by taking id and name

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv")


#display the dataframe
print(pandas.DataFrame(dataframe,columns=['id','name']))

Output:

       id            name
0  foo-23  ground-nut oil
1  foo-13         almonds
2  foo-02           flour
3  foo-31         cereals

 

Example 2: In this example, we are going to import csv to pandas dataframe by taking only cost.

# import pandas
import pandas 

#read the csv 
dataframe=pandas.read_csv("sample.csv")


#display the dataframe
print(pandas.DataFrame(dataframe,columns=['cost']))

Output:

     cost
0  567.00
1  562.56
2   67.00
3   76.09
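The examples above read the whole file and then select columns. read_csv() also has a usecols parameter that restricts the columns while the file is being parsed, so the unwanted columns are never loaded. The sketch below is an alternative to the approach shown above, not a replacement for it, and reads the sample data from a string (CSV_TEXT is a name introduced here for illustration).

```python
import io

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

# load only the id and name columns
id_and_name = pandas.read_csv(io.StringIO(CSV_TEXT), usecols=["id", "name"])
print(id_and_name)

# load only the cost column
cost_only = pandas.read_csv(io.StringIO(CSV_TEXT), usecols=["cost"])
print(cost_only)
```

For a small file the difference is negligible, but on large files usecols can reduce both the parsing time and the memory used.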

 

Scenario-8 : read_csv() using absolute path

Instead of just a file name, we can pass read_csv() the absolute path of the CSV file on the local system. All the parameters shown above work in the same way.

Syntax:

pandas.read_csv("path//......file_name.csv")

where, file_name is the name of the input csv file.

 

Example: In this example, we are going to import csv to pandas dataframe taken file from following path:

# ls -l /root/sample.csv
-rw-r--r-- 1 root root 148 Jan 26 23:42 /root/sample.csv

Our script is in the same directory:

# ls -l /root/eg-1.py
-rw-r--r-- 1 root root 136 Jan 27 14:50 /root/eg-1.py

So this will be our script with absolute path of CSV file:

# import pandas
import pandas 

# read the csv 
dataframe=pandas.read_csv("/root/sample.csv")

# display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2
2     item-3  foo-02           flour   67.00         3
3     item-4  foo-31         cereals   76.09         2
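read_csv() also accepts pathlib.Path objects, which can be easier to build and check than plain strings. The sketch below does not rely on /root/sample.csv existing: it writes the sample data to a temporary directory first, so the location and the CSV_TEXT variable are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # build an absolute path to a temporary copy of the sample file
    csv_path = Path(tmp_dir).resolve() / "sample.csv"
    csv_path.write_text(CSV_TEXT)

    # confirm the path is absolute before reading from it
    is_absolute = csv_path.is_absolute()
    dataframe = pandas.read_csv(csv_path)

print(is_absolute)
print(dataframe)
```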

 

Scenario-9 : read_csv() from relative path

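We can also pass a relative path instead of just the file name. A relative path is resolved against the current working directory of the Python process, not against the location of the script. All the parameters shown above work in the same way. The r prefix in the syntax below marks a raw string literal, which stops Python from interpreting backslashes as escape sequences. It is optional and has nothing to do with the path being relative.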

Syntax:

pandas.read_csv(r"path//......file_name.csv")

where, file_name is the name of the input csv file.

 

Example: In this example, we are going to import csv to pandas dataframe using relative path.

Now we will move our script to /tmp

# ls -l /tmp/eg-1.py
-rw-r--r-- 1 root root 136 Jan 27 14:50 /tmp/eg-1.py

Then we run the script from /tmp and access /root/sample.csv using a relative path:

# import pandas
import pandas 

# read the csv 
dataframe=pandas.read_csv("../root/sample.csv")

# display the dataframe
print(dataframe)

Output:

  Unnamed: 0      id            name    cost  quantity
0     item-1  foo-23  ground-nut oil  567.00         1
1     item-2  foo-13         almonds  562.56         2
2     item-3  foo-02           flour   67.00         3
3     item-4  foo-31         cereals   76.09         2
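The example above works only when the script is started from /tmp, because the relative path is resolved against the current working directory. The sketch below makes that visible by changing the working directory explicitly and printing the resolved path. The data and scripts directory names, the temporary location and the CSV_TEXT variable are all assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path

import pandas

# Same content as sample.csv, kept in a string (illustrative stand-in for the file)
CSV_TEXT = """,id,name,cost,quantity
item-1,foo-23,ground-nut oil,567.0,1
item-2,foo-13,almonds,562.56,2
item-3,foo-02,flour,67.0,3
item-4,foo-31,cereals,76.09,2
"""

original_cwd = Path.cwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    # hypothetical layout: <tmp>/data/sample.csv and <tmp>/scripts/
    base = Path(tmp_dir).resolve()
    (base / "data").mkdir()
    (base / "scripts").mkdir()
    (base / "data" / "sample.csv").write_text(CSV_TEXT)

    try:
        # pretend the script is run from the scripts directory
        os.chdir(base / "scripts")

        relative_path = Path("../data/sample.csv")

        # show which absolute path the relative path points to
        resolved = relative_path.resolve()
        dataframe = pandas.read_csv(relative_path)
    finally:
        # restore the working directory before the temporary directory is removed
        os.chdir(original_cwd)

print(resolved)
print(dataframe)
```

If the same relative path were used from a different working directory, it would point to a different location and read_csv() would raise FileNotFoundError unless a file happened to exist there.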

 

Summary

In this article, we discussed how to read a CSV file into a pandas DataFrame using the read_csv() function in the following scenarios:

  • read_csv() with first row as header
  • read_csv() with custom index
  • read_csv() with new column names
  • read_csv() with skip rows
  • Read first N rows from csv to pandas DataFrame
  • Import specific columns from csv to pandas DataFrame using read_csv()
  • read_csv() from absolute path
  • read_csv() from relative path

 


 
