Question: Validating Postal Codes [Python Regex and Parsing]
A valid postal code P has to fulfil both of the requirements below:
- P must be a number in the range from 100000 to 999999 inclusive.
- P must not contain more than one alternating repetitive digit pair.
Alternating repetitive digits are digits which repeat immediately after the next digit. In other words, an alternating repetitive digit pair is formed by two equal digits that have just a single digit between them.
For example:
121426 # Here, 1 is an alternating repetitive digit.
523563 # Here, NO digit is an alternating repetitive digit.
552523 # Here, both 2 and 5 are alternating repetitive digits.
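To make the definition concrete, here is a minimal sketch in plain Python (no regex yet) that counts alternating repetitive digit pairs by comparing each character with the one two positions later. The function name count_alternating_pairs is made up for this illustration and is not part of the HackerRank template.

```python
def count_alternating_pairs(code: str) -> int:
    """Count positions i where code[i] equals code[i + 2]."""
    return sum(1 for i in range(len(code) - 2) if code[i] == code[i + 2])


# The three examples from the problem statement.
for sample in ("121426", "523563", "552523"):
    print(sample, count_alternating_pairs(sample))
```

A postal code passes the second requirement when this count is 0 or 1.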
Your task is to provide two regular expressions, regex_integer_in_range and regex_alternating_repetitive_digit_pair, where:
regex_integer_in_range should match only integers in the range from 100000 to 999999 inclusive.
regex_alternating_repetitive_digit_pair should find alternating repetitive digit pairs in a given string.
Both these regular expressions will be used by the provided code template to check if the input string P is a valid postal code using the following expression:
(bool(re.match(regex_integer_in_range, P))
and len(re.findall(regex_alternating_repetitive_digit_pair, P)) < 2)
Input Format:
Locked stub code in the editor reads a single string denoting P from stdin and uses the provided expression together with your regular expressions to check whether P is a valid postal code.
Output Format:
You are not responsible for printing anything to stdout. Locked stub code in the editor does that.
Sample Input:
110000
Sample Output:
False
Explanation:
110000: the third and fifth digits (0, 0) and the fourth and sixth digits (0, 0) form two alternating repetitive digit pairs. Hence, it is an invalid postal code.
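The two pairs in the sample overlap, which is easy to miss. As a small sketch, the hypothetical helper below (not part of the HackerRank template) lists the zero-based index pairs at which equal digits sit two positions apart.

```python
def alternating_pair_positions(code: str) -> list:
    """Return (i, i + 2) index pairs whose characters are equal."""
    return [(i, i + 2) for i in range(len(code) - 2) if code[i] == code[i + 2]]


print(alternating_pair_positions("110000"))  # [(2, 4), (3, 5)]
```

Because two pairs are found, the "less than 2" condition fails and the result is False.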
NOTE:
You have to pass all the test cases to get a positive score.
Possible solutions
Now we will discuss possible solutions to this HackerRank problem.
The following code is already given in the HackerRank editor:
regex_integer_in_range = r"_________" # Do not delete 'r'.
regex_alternating_repetitive_digit_pair = r"_________" # Do not delete 'r'.
import re
P = input()
print (bool(re.match(regex_integer_in_range, P))
and len(re.findall(regex_alternating_repetitive_digit_pair, P)) < 2)
Now we have to use the above code to find the solution:
Solution-1: Using range
Here is a first, simple solution to the problem. Note that "range" here refers to regex character ranges such as [1-9], not Python's range() function:
regex_integer_in_range = r"^([1-9][0-9]{5})$"
regex_alternating_repetitive_digit_pair = r"(?=(\d)\d\1)"
import re
P = input()
print (bool(re.match(regex_integer_in_range, P))
and len(re.findall(regex_alternating_repetitive_digit_pair, P)) < 2)
This code defines two regular expressions, regex_integer_in_range and regex_alternating_repetitive_digit_pair, and then imports the re module to apply them. The first pattern, ^([1-9][0-9]{5})$, matches exactly six digits whose first digit is not 0, which is the range 100000 to 999999. The second pattern, (?=(\d)\d\1), is a lookahead that succeeds at every position where a digit is followed by any digit and then by the first digit again. Because a lookahead consumes no characters, overlapping pairs are all counted.
It then reads a line from stdin and assigns it to the variable P. The code uses the re.match function to check whether P matches the regex_integer_in_range pattern, and the re.findall function to find all occurrences of the regex_alternating_repetitive_digit_pair pattern in P.
Finally, it checks that the number of alternating repetitive digit pairs is less than 2, and prints the boolean value obtained by combining that check with the check for a match against regex_integer_in_range.
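The lookahead matters because pairs can overlap. The sketch below compares the pattern from this solution with a variant that consumes its characters. The variable names consuming and lookahead are chosen here for illustration only.

```python
import re

consuming = r"(\d)\d\1"      # consumes three characters per match
lookahead = r"(?=(\d)\d\1)"  # zero-width: consumes nothing

for code in ("110000", "552523"):
    print(code, re.findall(consuming, code), re.findall(lookahead, code))
```

For both inputs the consuming pattern reports only one pair, since the characters of the first match are no longer available to the second. The lookahead pattern reports two, which is what the problem requires.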
Solution-2: Alternative method
We can also solve the problem by changing the previous code a little bit as shown below:
regex_integer_in_range = r"^([1-9][0-9]{5})$"
regex_alternating_repetitive_digit_pair = r"(?=(.)(.)(\1))"
import re
P = input()
print (bool(re.match(regex_integer_in_range, P))
and len(re.findall(regex_alternating_repetitive_digit_pair, P)) < 2)
This code is similar to the previous one, but there are a few differences.
The first difference is in the definition of regex_alternating_repetitive_digit_pair, which still uses a lookahead assertion but replaces \d with . and wraps every element in its own capturing group. The lookahead (?=(.)(.)(\1)) checks whether there is a character (captured by the first group) followed by another character (captured by the second group) and then the same character as the first group again (captured by the third group). Because . matches any character except a newline rather than only digits, this pattern relies on regex_integer_in_range to reject non-digit input.
The second difference is in what the re.findall function returns. The call itself is unchanged, but because the pattern now has three capturing groups, re.findall returns a list of tuples instead of a list of strings. Each tuple contains the three captured groups of one match. The code then checks whether the length of this list is less than 2 and prints the boolean value obtained by combining this check with the check for a match against regex_integer_in_range.
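As a quick illustration of the tuple output, the sketch below runs the pattern from this solution on one of the examples from the problem statement, and on a non-digit string to show that . is not limited to digits. The variable names are chosen for this example.

```python
import re

pattern = r"(?=(.)(.)(\1))"

matches = re.findall(pattern, "552523")
print(matches)       # [('5', '2', '5'), ('2', '5', '2')]
print(len(matches))  # 2

# "." accepts any character, so non-digit input also produces a match.
non_digit = re.findall(pattern, "a-a")
print(non_digit)     # [('a', '-', 'a')]
```

Only the length of the list is used by the template, so the extra groups do not change the result.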
Solution-3: Alternative method
Now we will use another method by changing the above code as shown below:
regex_integer_in_range = r"^[1-1][0-9][0-9][0-9][0-9][0-9]$|^[2-9][0-9][0-9][0-9][0-9][0-9]$"
regex_alternating_repetitive_digit_pair = r"(\d)(?=\d\1)"
import re
P = input()
print (bool(re.match(regex_integer_in_range, P))
and len(re.findall(regex_alternating_repetitive_digit_pair, P)) < 2)
This code differs from the previous ones in two ways.
The first difference is in the definition of regex_integer_in_range, which now consists of two alternatives separated by a vertical bar (|). The first alternative matches a 6-digit integer whose first digit is 1 and whose remaining digits are any digit from 0 to 9. The second alternative matches a 6-digit integer whose first digit is any digit from 2 to 9 and whose remaining digits are any digit from 0 to 9. Together they accept the same strings as ^[1-9][0-9]{5}$.
The second difference is in the definition of regex_alternating_repetitive_digit_pair, which now places a capturing group around the first digit and a lookahead assertion after it. The capturing group consumes and captures the first digit, and the lookahead checks that the digit two positions later (after one intervening digit) is the same as the captured digit. Since only one character is consumed per match, overlapping pairs are still found. The code then uses the re.findall function to find all occurrences of the regex_alternating_repetitive_digit_pair pattern in P and checks whether the length of the resulting list is less than 2. It then prints the boolean value obtained by combining this check with the check for a match against regex_integer_in_range.
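To check that this variant behaves like Solution-1, the sketch below compares both pairs of patterns on a handful of inputs. The sample list and the helper name agree are assumptions made for this example. They are not an exhaustive test and are not part of the HackerRank template.

```python
import re

range_v1 = r"^([1-9][0-9]{5})$"
range_v3 = r"^[1-1][0-9][0-9][0-9][0-9][0-9]$|^[2-9][0-9][0-9][0-9][0-9][0-9]$"
pair_v1 = r"(?=(\d)\d\1)"
pair_v3 = r"(\d)(?=\d\1)"

samples = ["100000", "999999", "099999", "1000000", "12345",
           "12a456", "110000", "552523", "121426"]


def agree(s: str) -> bool:
    """True if both solutions give the same range result and the same pairs."""
    same_range = bool(re.match(range_v1, s)) == bool(re.match(range_v3, s))
    same_pairs = re.findall(pair_v1, s) == re.findall(pair_v3, s)
    return same_range and same_pairs


for s in samples:
    print(s, agree(s))
```

On these inputs the two solutions agree, so the choice between them is mostly a matter of readability.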
Summary
In this short article, we learned how to solve the Validating Postal Codes problem on HackerRank. We solved the problem using three different methods and explained each method.
Further reading
Question on HackerRank: Python Validating Postal Codes [Regex and Parsing]