Understanding the Question - Count number of friends
The task is to count the number of friends each person has using a MapReduce framework. The input consists of multiple lines, each containing a pair of friend names separated by a space. The goal is to output the number of friends for each person in a specific format.
The input is a number of lines with pairs of name of friends, in the form:
[Friend1] [Friend2]
The output should list the number of friends each person has, in lexicographical order, in the following format:
{"key":"[Person name]","value":"[Number of friends]"}
The Hacker Rank code template includes the MapReduce
class and parts related to input/output handling. However, the mapper
and reducer
functions are incomplete. Your task is to fill in these functions so that the program correctly counts the number of friends for each person and outputs the result in the required format.
The input format contains a list of single space-separated pairs of friend names. The input handling code to read this data has already been provided. Sample Input:
Joe Sue
Sue Phi
Phi Joe
Phi Alice
The output handling part has also been provided. The output should be a list where each entry has a key (person name) and a value (number of friends), sorted in lexicographical order. Sample Output:
{"key":"Alice","value":"1"}
{"key":"Joe","value":"2"}
{"key":"Phi","value":"3"}
{"key":"Sue","value":"2"}
Solution
Here's how you can implement the mapper
and reducer
functions:
- Mapper Function: The
mapper
function will read each friendship pair and emit each user as a key with a value of 1. - Reducer Function: The
reducer
function will sum up the values for each user to get the total number of friends.
import sys
from collections import OrderedDict
class MapReduce:
def __init__(self):
self.intermediate = OrderedDict()
self.result = []
def emitIntermediate(self, key, value):
self.intermediate.setdefault(key, [])
self.intermediate[key].append(value)
def emit(self, value):
self.result.append(value)
def execute(self, data, mapper, reducer):
for record in data:
mapper(record)
for key in self.intermediate:
reducer(key, self.intermediate[key])
self.result.sort()
for item in self.result:
print("{\"key\":\""+item[0]+"\",\"value\":\"" + str(item[1]) + "\"}")
mapReducer = MapReduce()
def mapper(record):
# Split the record to get the two users in the pair
user1, user2 = record.strip().split()
# Emit each user with a value of 1
mapReducer.emitIntermediate(user1, 1)
mapReducer.emitIntermediate(user2, 1)
def reducer(key, list_of_values):
# Sum up the values to get the total number of friends
total_friends = sum(list_of_values)
mapReducer.emit((key, total_friends))
if __name__ == '__main__':
inputData = []
for line in sys.stdin:
inputData.append(line)
mapReducer.execute(inputData, mapper, reducer)
Understanding the solution:
- The
MapReduce
class initializes with an ordered dictionaryintermediate
to store intermediate results and a listresult
to store the final results. - The
emitIntermediate
method appends values to the list of a specific key in the intermediate dictionary. - The
emit
method appends the final key-value pairs to theresult
list. - The
execute
Method runs themapper
on each input record and then runs thereducer
on the intermediate results. It sorts the final results and prints them in the specified format. Mapper
function splits each input record into two users (assuming the input format is a pair of user IDs separated by a space). It then emits each user with a value of 1 to signify a friendship.Reducer
function sums the list of values (which are all 1s) for each user to get the total number of friends. It then emits the user ID and the total count.
Pass the input :
1 2
2 3
1 3
4 5
When we run the code on Hacker Rank, the sample Test case is shown as successful:
