Introduction to pandas.Series.map()
Pandas supports element-wise operations just like NumPy (after all, a pd.Series stores its data in a NumPy array). For example, it is easy to apply a transformation to both a pd.Series and a pd.DataFrame:
np.log(df.sys_initial) # Logarithm of a series
df.sys_initial ** 2 # Square a series
np.log(df) # Logarithm of a dataframe
df ** 2 # Square of a dataframe
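The snippets above assume an existing DataFrame df with a sys_initial column. A minimal self-contained sketch, using made-up values and a hypothetical second column named sys_final, might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the values and the sys_final column are invented for illustration.
df = pd.DataFrame({
    "sys_initial": [100.0, 120.0, 140.0],
    "sys_final": [90.0, 110.0, 150.0],
})

log_series = np.log(df["sys_initial"])   # element-wise logarithm of one column
squared_series = df["sys_initial"] ** 2  # element-wise square of one column
log_df = np.log(df)                      # element-wise logarithm of every cell
squared_df = df ** 2                     # element-wise square of every cell

print(type(log_series).__name__, type(log_df).__name__)
print(squared_df)
```

Each operation returns a new object of the same shape and leaves df unchanged.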
The pd.Series.map method applies a function to each value and returns a pd.Series containing the results.
The map() method is similar to the apply method in that it makes element-wise changes defined by a function. In addition, map() also accepts a Series or a dictionary that defines these element-wise changes.
We can pass a Series to Python's built-in dict function to create such a dictionary. Pandas maps the Series' index labels and values to the dictionary's keys and values.
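As a rough sketch of the three kinds of argument described above (the variable names here are illustrative only), the same Series can be mapped with a function, with a lookup Series, and with the dictionary produced by dict():

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["one", "two", "three"])

# 1. A function is called once for each value of s.
squared = s.map(lambda v: v ** 2)

# 2. A lookup Series: the values of s are matched against its index labels.
lookup_series = pd.Series({1: "a", 2: "b", 3: "c"})
by_series = s.map(lookup_series)

# 3. dict() turns the lookup Series into {index label: value}.
lookup_dict = dict(lookup_series)
by_dict = s.map(lookup_dict)

print(squared.tolist())
print(lookup_dict)
print(by_series.tolist(), by_dict.tolist())
```

The Series and dictionary forms give the same result here, because both express the same label-to-value lookup.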
Syntax:
Series.map(arg, na_action=None)
Parameters:
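- arg - the mapping to apply: a function, a dictionary, or a Series.
- na_action - how to handle NaN values: None (the default) passes them to the mapping, while 'ignore' propagates them to the result without passing them to the mapping.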
Example-1: map() two pandas Series
We are going to map() one Series onto another without any additional parameters.
Syntax:
new_series=old_series.map(values)
Example:
This method performs the mapping by first matching the values of the outer Series with the index labels of the inner Series. It then returns a new Series, with the index labels of the outer Series but the values from the inner Series.
# import the module
import pandas as pd
# Create Series x
x = pd.Series({"one": 1, "two": 2, "three": 3})
# Create Series y
y = pd.Series({1: "a", 2: "b", 3: "c"})
# Print x series
print(x)
print("====================")
# Print y series
print(y)
print("====================")
# map the values of x to the values of y by matching them against y's index labels
print(x.map(y))
Output:
one      1
two      2
three    3
dtype: int64
====================
1    a
2    b
3    c
dtype: object
====================
one      a
two      b
three    c
dtype: object
Example-2: map() pandas Series with Dictionary
As mentioned earlier, we can also map() a Series with a dictionary. So I will update my previous code and convert the pandas Series into a dictionary using the to_dict() function:
# import the module
import pandas as pd
# Create Series x
x = pd.Series({"one": 1, "two": 2, "three": 3})
# Create Series y and convert it to dictionary
y = pd.Series({1: "a", 2: "b", 3: "c"}).to_dict()
# Print x series
print(x)
print("====================")
# Print the dictionary y
print(y)
print("====================")
# map the values of x to the values of y by matching them against y's keys
print(x.map(y))
Output:
one      1
two      2
three    3
dtype: int64
====================
{1: 'a', 2: 'b', 3: 'c'}
====================
one      a
two      b
three    c
dtype: object
Example-3: Handle missing values during map() with na_action
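Passing na_action='ignore' tells map() to skip missing values in the Series being mapped and keep them as NaN. With the default na_action=None, missing values are passed to the mapping like any other value. Separately, any value that has no matching label in the mapping also becomes NaN, whatever na_action is set to. Let's update our code, remove one of the key-value pairs from the Series y, and perform the map operation: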
# import the module
import pandas as pd
# Create Series x
x = pd.Series({"one": 1, "two": 2, "three": 3})
# Create Series y
y = pd.Series({1: "a", 2: "b"})
# Print x series
print(x)
print("====================")
# Print y series
print(y)
print("====================")
# map the values of x to the values of y by matching them against y's index labels
print(x.map(y, na_action='ignore'))
Output:
As you can see, y has only two elements, so map() returns NaN for the third value of x, which has no matching label in y:
one      1
two      2
three    3
dtype: int64
====================
1    a
2    b
dtype: object
====================
one        a
two        b
three    NaN
dtype: object
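In the output above, the NaN comes from the missing label in y rather than from na_action. To see what na_action itself changes, here is a minimal sketch that assumes a Series which already contains a NaN and maps it with a function (the sample values are invented for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series(["cat", "dog", np.nan])

# na_action=None (the default): the function is also called on the NaN value.
default_result = s.map(lambda v: f"I am a {v}")

# na_action='ignore': the NaN is propagated without calling the function on it.
ignore_result = s.map(lambda v: f"I am a {v}", na_action="ignore")

print(default_result)
print(ignore_result)
```

With the default setting the last element becomes the string "I am a nan", whereas with 'ignore' it stays missing.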
Can we use map() with a pandas DataFrame?
map() accepts not only a function, but also a dictionary or another Series. In pandas versions before 2.1, this method doesn't exist on pandas.DataFrame objects; pandas 2.1 added DataFrame.map, which accepts only a function.
We will update our code to call map() on a pandas DataFrame instead of a Series:
# import the module
import pandas as pd
# Create Series x
x = pd.Series({"one": 1, "two": 2, "three": 3})
# Create pandas DataFrame
y = pd.DataFrame([1, 2, 3])
# try to call map() on the DataFrame y, passing the Series x as the mapping
print(y.map(x, na_action='ignore'))
As you can see from the output, we get an exception:
Traceback (most recent call last):
File "<string>", line 11, in <module>
File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 5487, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'map'
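One possible workaround, sketched here under the assumption that the goal is a Series-style lookup on DataFrame columns, is to call map() on each column, since every column of a DataFrame is itself a Series. The column names and values below are hypothetical:

```python
import pandas as pd

# Lookup Series: index labels are the keys, values are the replacements.
lookup = pd.Series({1: "a", 2: "b", 3: "c"})

# Hypothetical DataFrame with two integer columns.
df = pd.DataFrame({"left": [1, 2, 3], "right": [3, 3, 1]})

# Map a single column: selecting it yields a Series, which has map().
df["left_label"] = df["left"].map(lookup)

# Map several columns: apply() passes each column to the lambda as a Series.
labels = df[["left", "right"]].apply(lambda col: col.map(lookup))

print(df)
print(labels)
```

This approach relies only on Series.map and DataFrame.apply, so it does not depend on DataFrame.map being available.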
Summary
One of the basic tasks in data transformation is mapping one set of values to another. Pandas provides a generic way to map values through a lookup table (a Python dictionary or a pandas Series) using the .map() method.
This method performs the mapping by first matching the values of the outer Series with the index labels of the inner Series. It then returns a new Series, with the index labels of the outer Series but the values from the inner Series.
As with other alignment operations, if pandas does not find a map between the value of the outer Series and an index label of the inner Series, it fills the value with NaN.
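When NaN is not a useful result for unmatched values, one option is to fill it afterwards. The following is a small sketch, assuming a default label of "unknown" chosen purely for illustration:

```python
import pandas as pd

x = pd.Series({"one": 1, "two": 2, "three": 3})
lookup = {1: "a", 2: "b"}  # deliberately has no entry for 3

mapped = x.map(lookup)             # the value 3 has no match, so 'three' becomes NaN
filled = mapped.fillna("unknown")  # replace the NaN with a default label

print(mapped)
print(filled)
```

Filling after the mapping keeps the lookup table itself unchanged and makes the chosen default explicit.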