Exploring pandas melt() function [Practical Examples]

Introduction to Pandas melt() function

Pandas melt() function is used to unpivot a DataFrame from wide to long format, optionally leaving identifiers set. A pivot table aggregates the values in a data set. In this tutorial, we’ll learn how to do the opposite: break an aggregated collection of data into an unaggregated one.

This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.

Advertisement

Syntax:

pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

Parameters

  • id_vars - tuple, list, or array, optional. Column(s) to use as identifier variables.
  • value_vars - tuple, list, or array, optional. Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
  • var_name - scalar. Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.
  • value_name - scalar, default ‘value’. Name to use for the ‘value’ column.
  • col_level - int or str, optional. If columns are a MultiIndex then use this level to melt.
  • ignore_index - bool, default True. If True, original index is ignored. If False, the original index is retained. Index labels will be repeated as necessary.

 

What we will cover?

We will discuss the following:

  • Set one column as identifier variable and one column as value variable
  • Set one column as identifier variable and two columns as value variable

 

Create pandas DataFrame with example data

DataFrame is a data structure used to store the data in two dimensional format. It is similar to table that stores the data in rows and columns. Rows represents the records/tuples and columns refers to the attributes.

We can create the DataFrame by using pandas.DataFrame() method.

Syntax:

Advertisement
pandas.DataFrame(input_data,columns,index)

Parameters:

It will take mainly three parameters

  1. input_data is represents a list of data
  2. columns represent the columns names for the data
  3. index represent the row numbers/values

We can also create a DataFrame using dictionary by skipping columns and indices.

Example: Python Program to create a dataframe for college data from a dictionary.

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
				'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
				'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# display dataframe
print(dataframe)

Output:

     Name Subjects  Marks
0  sravan      PHP     89
1   bobby      PHP     90
2  deepak     dbms     93
3  prasad     java     57

 

Scenario-1 : Set one column as identifier variable and one column as value variable

Here we are defining one variable as identifier and also one variable is value variable.

Advertisement

Example 1: Python program to set Name as identifier variable and Subjects as value variable

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
				'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
				'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and Subjects is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Subjects']))

Output:

     Name  variable value
0  sravan  Subjects   PHP
1   bobby  Subjects   PHP
2  deepak  Subjects  dbms
3  prasad  Subjects  java

 

Example 2: Python program to set Name as identifier variable and Marks as value variable

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
				'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
				'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and Marks is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Marks']))

Output:

     Name variable  value
0  sravan    Marks     89
1   bobby    Marks     90
2  deepak    Marks     93
3  prasad    Marks     57

 

Scenario-2 : Set one column as identifier variable and two columns as value variable

Here we are defining one variable as identifier and two variables  as value variable.

Example : Python program to set Name as identifier variable and Subjects, Marks  as value variable

Advertisement
# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
				'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
				'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and subjects , Marks is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Subjects','Marks']))

Output:

     Name  variable value
0  sravan  Subjects   PHP
1   bobby  Subjects   PHP
2  deepak  Subjects  dbms
3  prasad  Subjects  java
4  sravan     Marks    89
5   bobby     Marks    90
6  deepak     Marks    93
7  prasad     Marks    57

 

Summary

In this article we learned about Pandas and how it melts a DataFrame with the melt method. (Melting is the process of converting a wide data set to a narrow one.) The method accepts two primary parameters:

  • The id_vars parameter sets the identifier column, the column for which the wide data set aggregates data.
  • The value_vars parameter accepts the column(s) whose values pandas will melt and store in a new column.

 

References

Pandas - melt()

 

Didn't find what you were looking for? Perform a quick search across GoLinuxCloud

If my articles on GoLinuxCloud has helped you, kindly consider buying me a coffee as a token of appreciation.

Buy GoLinuxCloud a Coffee

For any other feedbacks or questions you can either use the comments section or contact me form.

Thank You for your support!!

Leave a Comment

X