Checking whether a string contains a given substring is one of the most common tasks in text processing. Whether you’re parsing logs, searching through large datasets, or validating user input, you need a reliable way to do it. This article takes an in-depth look at the methods Python offers for checking whether a string contains a substring, from the simple in operator to regular expressions, and explains when each one is the right choice.
Different Methods for Checking if a Python String Contains Substring
The following methods can be used to check whether a Python string contains a substring:
- The in operator: A simple, readable way to check whether a string contains another string.
- str.find() method: Searches for a substring and returns the index of the first occurrence, or -1 if not found.
- str.index() method: Similar to str.find(), but raises a ValueError if the substring is not found.
- str.count() method: Counts the number of non-overlapping occurrences of a substring within a string.
- Regular expressions (re library): Provide the flexibility to search for complex patterns within the string.
1. Using in Operator
The in operator in Python tests membership in a container such as a list, tuple, dictionary (where it checks the keys), or string. When both operands are strings, it checks whether the left-hand string occurs as a contiguous substring of the right-hand string.
>>> "hello" in "hello world"
True
>>> "world" in "hello world"
True
>>> "Python" in "hello world"
False
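Because in behaves differently depending on the container, it is worth seeing the cases side by side. The short sketch below uses made-up sample data (fruits and ages are illustrative names only) to show that only a string on the right-hand side gives you a substring search:

```python
# Membership semantics differ by container type.
fruits = ["apple", "banana"]
ages = {"alice": 30, "bob": 25}

print("apple" in fruits)      # True: exact element of the list
print("app" in fruits)        # False: a list needs an exact element match
print("alice" in ages)        # True: a dict is searched by key
print(30 in ages)             # False: values are not searched
print("app" in "apple pie")   # True: a string is searched for a contiguous run of characters
```

If you have a list of strings and want a substring search, check each element individually or join them first.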
1. Using Variables for the check:
substring = "hello"
main_string = "hello world"
if substring in main_string:
    print(f"'{substring}' exists in '{main_string}'")
2. Using with if Statements
if "hello" in "hello world":
    print("Substring exists.")
3. Using with for Loop
This is not the most efficient way to do it but could be helpful in customized scenarios.
main_string = "hello world"
substring = "hello"
found = False
for i in range(len(main_string) - len(substring) + 1):
    if main_string[i:i + len(substring)] == substring:
        found = True
        break
if found:
    print("Substring exists.")
4. Using with while Loop
main_string = "hello world"
substring = "hello"
found = False
i = 0
while i <= len(main_string) - len(substring):
    if main_string[i:i + len(substring)] == substring:
        found = True
        break
    i += 1
if found:
    print("Substring exists.")
5. Creating a Function to perform the check:
def substring_exists(substring, main_string):
    return substring in main_string
# Usage
print(substring_exists("hello", "hello world"))
6. Case-Sensitive Handling
By default, the in operator is case-sensitive. For a case-insensitive search, convert both strings to the same case before comparing:
if "HELLO".lower() in "hello world".lower():
    print("Substring exists.")
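For text that may contain non-English characters, lower() is not always enough. Python also provides str.casefold(), which is intended for caseless matching. The sketch below uses a German sample sentence (chosen only for illustration) to show one case where the two approaches disagree:

```python
# str.casefold() is a more aggressive form of lower() intended for caseless matching.
main_string = "Die Straße ist lang"
substring = "STRASSE"

with_lower = substring.lower() in main_string.lower()
with_casefold = substring.casefold() in main_string.casefold()

print(with_lower)     # False: "ß" is unchanged by lower()
print(with_casefold)  # True: casefold() maps "ß" to "ss"
```

For plain ASCII text the two give the same result, so lower() remains fine there.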
7. No Support for Whole-Word Matching
The in operator in Python will simply check for the existence of a
sequence of characters within another string, without any concern for
word boundaries. That means if you search for “Hell” in “Hello”, it will
return True, as “Hell” is a contiguous substring in “Hello”.
print("Hell" in "Hello") # Output: True
This might not be what you want if you’re looking for complete word matches. If word boundaries are important for your use case, you may need to tokenize the string or use regular expressions to enforce word boundaries.
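words = "Hello, Hell, Hello".split()
if "Hell" in words:
    print("Whole word 'Hell' found.")
else:
    print("Whole word 'Hell' not found.")
# Output: Whole word 'Hell' not found.
This simplistic tokenization splits only on whitespace, so punctuation stays attached to each word: the list is ['Hello,', 'Hell,', 'Hello'], and “Hell,” (with a comma) is not equal to “Hell”. Strip the punctuation before comparing, or use regular expressions, when the text is not cleanly space-separated.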
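One way around the punctuation problem is to tokenize with a regular expression instead of split(). The following is a minimal sketch, not the only approach; contains_word is a hypothetical helper name, and it assumes that a “word” means a run of letters, digits, or underscores:

```python
import re

def contains_word(word, text):
    """Return True if `word` appears in `text` as a whole word."""
    # \w+ extracts runs of letters, digits, and underscores, dropping punctuation
    tokens = re.findall(r"\w+", text)
    return word in tokens

print(contains_word("Hell", "Hello, Hell, Hello"))   # True
print(contains_word("Hell", "Hello, Hello"))         # False
```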
2. Using str.find() Method
The str.find() method in Python is used to find the index of the first
occurrence of a substring within a given string. If the substring is not
found, the method returns -1.
Syntax:
string.find(substring[, start[, end]])
- substring: The substring to search for.
- start and end (optional): Specify the start and end positions within the string to search.
1. Basic Usage
>>> "hello world".find("hello")
0
>>> "hello world".find("world")
6
>>> "hello world".find("Python")
-1
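The optional start and end arguments restrict the search to a slice of the string, while the returned index still refers to the full string. Here is a small sketch using a sample sentence; it also shows str.rfind(), the built-in counterpart that searches from the right:

```python
text = "hello world, hello again"

first = text.find("hello")            # search the whole string
second = text.find("hello", 1)        # start searching at index 1
bounded = text.find("hello", 1, 10)   # search only within text[1:10]
last = text.rfind("hello")            # rfind() searches from the right

print(first, second, bounded, last)   # 0 13 -1 13
```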
2. Using Variables
substring = "hello"
main_string = "hello world"
index = main_string.find(substring)
if index != -1:
    print(f"'{substring}' found at index {index}")
3. Using with if Statements
index = "hello world".find("hello")
if index != -1:
    print(f"Substring found at index {index}")
4. Using with for Loop
A single call to str.find() returns only the first occurrence. To find every occurrence, call it repeatedly with an updated start position. Here, str.count() tells the for loop how many non-overlapping occurrences to expect:
main_string = "hello world, hello again"
substring = "hello"
start = 0
for _ in range(main_string.count(substring)):
    index = main_string.find(substring, start)
    print(f"Substring found at index {index}")
    start = index + len(substring)
5. Using with while Loop
A while loop can do the same job without counting first; it stops as soon as str.find() returns -1:
main_string = "hello world"
substring = "hello"
index = 0
while index != -1:
    index = main_string.find(substring, index)
    if index != -1:
        print(f"Substring found at index {index}")
        index += len(substring)
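If you need all positions in more than one place, the loop can be wrapped in a reusable generator. This is only a sketch: find_all is a hypothetical helper name, and treating an empty substring as “no matches” is an assumption you may want to change.

```python
def find_all(text, substring):
    """Yield the start index of each non-overlapping occurrence."""
    if not substring:
        return  # an empty substring would otherwise match at every position
    index = text.find(substring)
    while index != -1:
        yield index
        index = text.find(substring, index + len(substring))

print(list(find_all("hello world, hello again", "hello")))  # [0, 13]
```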
6. Creating a Function to perform this check:
def find_substring(substring, main_string):
    return main_string.find(substring)
# Usage
index = find_substring("hello", "hello world")
if index != -1:
    print(f"Substring found at index {index}")
7. Error Handling
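Like the in operator, str.find() does not raise an exception when the substring is missing or when either string is empty; it raises a TypeError only if the argument is not a string. Note that an empty substring is always found, at index 0: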
>>> "".find("hello")
-1
>>> "hello".find("")
0
8. Case-Sensitive Handling
By default, str.find() is case-sensitive. For a case-insensitive
search, you can convert both the string and the substring to lowercase.
main_string = "Hello World"
substring = "HELLO"
index = main_string.lower().find(substring.lower())
if index != -1:
    print(f"Substring found at index {index}")
9. No Support for Whole-Word Matching
Just like the in operator, Python’s str.find() method does not
support whole-word matching or word boundary recognition by default. It
simply checks for the existence of a sequence of characters in a string,
without any regard for whether those characters constitute a whole word
or part of another word.
>>> "Hello World".find("Hell")
0
10. Limitations
- Lack of Word Boundary Recognition: str.find() does not consider word boundaries by default. It will find substrings even if they are part of other words.
- Case Sensitivity: The method is inherently case-sensitive. You have to manually convert both the string and the substring to the same case for a case-insensitive search.
- No Regular Expression Support: str.find() doesn’t support regular expressions, so if you need more complex pattern matching, you’ll have to use the re module.
- Easy-to-Misuse ‘Not Found’ Value: Because it returns -1 when the substring is not found, and -1 is also a valid index (the last character), forgetting to check for -1 fails silently instead of raising an error. An empty substring is always “found” and returns 0, so validate it separately if an empty search term should count as no match.
3. Using str.index() Method
The str.index() method in Python is similar to str.find(). It’s used
to find the index of the first occurrence of a substring in a given
string. However, there’s one key difference: if the substring is not
found, str.index() raises a ValueError instead of returning -1.
The syntax for the str.index() method is as follows:
string.index(substring[, start[, end]])
- substring: The substring you’re searching for.
- start and end (optional): Indicate where to start and end the search.
1. Basic Usage
Here’s how you would generally use the str.index() method:
>>> "hello world".index("hello")
0
>>> "hello world".index("world")
6
>>> "hello world".index("Python")
ValueError: substring not found
You can store the value that str.index() returns in a variable:
index = "hello world".index("hello")
2. Using with if statement
You could use an if statement to check whether a substring exists
before taking an action:
text = "hello world"
if "hello" in text:
    index = text.index("hello")
    print(f"'hello' found at index {index}")
3. Using with while loop
Though less common for this specific operation, you can use a while
loop to repeatedly search for a substring:
text = "hello world, hello again"
start = 0
while "hello" in text[start:]:
    index = text.index("hello", start)
    print(f"'hello' found at index {index}")
    start = index + 1
4. Using with for loop
A for loop can be used in a similar way:
text = "hello world, hello again"
start = 0
for word in text.split():
    if "hello" in word:
        index = text.index("hello", start)
        print(f"'hello' found at index {index}")
        start = index + 1
5. Case-Sensitive Handling
The str.index() method is case-sensitive. For case-insensitive
searching, you could convert both the string and substring to lower
case:
>>> "Hello World".index("hello")
ValueError: substring not found
>>> "Hello World".lower().index("hello".lower())
0
6. Error/Exception Handling
Unlike str.find(), the str.index() method raises a ValueError if
the substring is not found. You could catch this with a try-except
block:
try:
    index = "hello world".index("Python")
except ValueError:
    print("Substring not found.")
7. Limitations
- Lack of Word Boundary Recognition: Like str.find(), str.index() doesn’t consider word boundaries and will find substrings even if they are part of other words.
- Case Sensitivity: The method is case-sensitive, requiring you to convert both the string and the substring to the same case if you need a case-insensitive search.
- No Regular Expression Support: str.index() doesn’t support regular expressions, so more complex pattern matching would require using the re module.
- Raises an Exception for ‘Not Found’: This method will raise a ValueError if the substring is not found, which may require additional exception handling logic.
4. Using str.count() Method
The str.count() method in Python is used to count the occurrences of a
substring in a given string. The method is case-sensitive and does not
consider word boundaries by default.
Syntax:
string.count(substring[, start[, end]])
- substring: The substring you want to search for.
- start and end (optional): Specify where to start and end the search within the string.
1. Basic Usage
Here’s how you would generally use the str.count() method:
>>> "hello world, hello again".count("hello")
2
>>> "hello world, hello again".count("world")
1
>>> "hello world, hello again".count("Python")
0
You can store the return value of str.count() in a variable:
count = "hello world, hello again".count("hello")
2. Using with if statement
You can use an if statement to take action based on the count:
text = "hello world, hello again"
count = text.count("hello")
if count > 0:
    print(f"'hello' found {count} times.")
3. Using with while loop
Using a while loop with str.count() may not be the most typical
scenario, but it could be done, especially if the string content is
dynamically changing:
text = "hello world, hello again"
while "hello" in text:
    count = text.count("hello")
    print(f"'hello' found {count} times.")
    # Remove one occurrence of "hello" to change the string
    text = text.replace("hello", "", 1)
4. Using with for loop
Here is how you can use a for loop:
text = "hello world, hello again"
words = text.split()
for word in words:
    if word.count("hello") > 0:
        print(f"'hello' found in {word}")
5. Case-Sensitive Handling
By default, str.count() is case-sensitive. If you need a
case-insensitive count, you can convert both the string and the
substring to lowercase (or uppercase).
>>> "Hello World".count("hello")
0
>>> "Hello World".lower().count("hello".lower())
1
6. Error Handling
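The str.count() method does not raise an exception when the substring is missing or when either string is empty (a non-string argument still raises a TypeError). If the substring is not found, it simply returns 0. An empty substring, however, is counted once between every pair of characters and once at each end, so the result is the length of the string plus one:
>>> "".count("hello")
0
>>> "hello".count("")
6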
7. Limitations
- Lack of Word Boundary Recognition: By default, str.count() does not consider word boundaries. It will count occurrences of substrings even if they are part of other words.
- Case Sensitivity: The method is case-sensitive, requiring additional steps for case-insensitive counting.
- No Regular Expression Support: Unlike the functions in the re module, str.count() does not support regular expressions for more complex pattern matching.
- No Error Output: Because it doesn’t raise exceptions, a return value of 0 tells you only that there was no match, not why. An empty substring does not return 0 (it returns the length of the string plus one), so check for it explicitly if an empty search term should be treated as invalid.
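Since str.count() counts only non-overlapping occurrences, here is a minimal sketch of how overlapping ones could be counted with str.find(). The helper name count_overlapping is hypothetical, and returning 0 for an empty substring is an assumption made for this example:

```python
def count_overlapping(text, substring):
    """Count occurrences of `substring`, allowing them to overlap."""
    if not substring:
        return 0  # treat an empty search term as "no match" (an assumption)
    count = 0
    index = text.find(substring)
    while index != -1:
        count += 1
        index = text.find(substring, index + 1)  # step by 1, not len(substring)
    return count

print("aaaa".count("aa"))               # 2 (non-overlapping)
print(count_overlapping("aaaa", "aa"))  # 3 (positions 0, 1, and 2)
```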
5. Using Regular Expressions (re library)
The re library in Python provides several methods for working with
regular expressions. Below is a table that outlines some of the most
commonly used methods for various substring and pattern matching tasks:
| Method | Description | Example Usage | Example String | Result |
|---|---|---|---|---|
| re.match() | Determines if the regular expression matches at the beginning of the string | re.match('Hi', 'Hi Hello') | Hi Hello | Match |
| re.search() | Searches the string for a match, and returns the first occurrence | re.search('Hello', 'Hi Hello') | Hi Hello | Match |
| re.findall() | Returns all occurrences of the pattern in the string as a list | re.findall('l', 'Hello') | Hello | ['l', 'l'] |
| re.finditer() | Returns an iterator yielding match objects for all pattern occurrences | re.finditer('l', 'Hello') | Hello | Iterator over two match objects |
| re.fullmatch() | Checks if the whole string matches the pattern | re.fullmatch('Hi Hello', 'Hi Hello') | Hi Hello | Match |
| re.sub() | Replaces occurrences of the pattern in the string with a specified string | re.sub('Hello', 'Hi', 'Hello World') | Hello World | Hi World |
| re.split() | Splits the string by the occurrences of the pattern | re.split(r'\s', 'Hi Hello') | Hi Hello | ['Hi', 'Hello'] |
| re.compile() | Compiles a regular expression pattern into a regex object, which can be used for matching using its methods | pattern = re.compile('Hello') | - | Regex object |
This table provides an overview of the re functions you can use to find substrings in Python. Note that the result column is a simplified summary; several of these functions, such as re.match() and re.search(), actually return match objects (or None when there is no match).
Regular expressions provide a flexible way to search or match complex string patterns in text.
import re
text = "Hello, world!"
result = re.search("world", text)
print(bool(result)) # Output: True
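The difference between re.match(), re.search(), and re.fullmatch() is easy to mix up, so here is a short side-by-side sketch on the same sample string. It also shows how to read the matched text and its position from the match object:

```python
import re

text = "Hi Hello"

print(re.match("Hello", text))      # None: "Hello" is not at the start
print(re.search("Hello", text))     # a match object
print(re.fullmatch("Hello", text))  # None: the whole string is not "Hello"

match = re.search("Hello", text)
if match:
    print(match.group(), match.start(), match.end())  # Hello 3 8
```

For a plain “does it contain this?” check, re.search() is usually the one you want.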
1. Storing in variable
You can store the result of re.findall() in a variable:
matches = re.findall("hello", "hello world, hello again")
2. Using with if statement
You can use an if statement to take action based on the number of
matches:
import re
text = "hello world, hello again"
matches = re.findall("hello", text)
if len(matches) > 0:
    print(f"'hello' found {len(matches)} times.")
3. Using with while loop
You could use a while loop, especially if the string or pattern
changes dynamically:
import re
text = "hello world, hello again"
pattern = "hello"
while re.search(pattern, text):
    matches = re.findall(pattern, text)
    print(f"'{pattern}' found {len(matches)} times.")
    # Remove one occurrence of the pattern
    text = re.sub(pattern, "", text, count=1)
4. Using with for loop
A for loop can iterate through the matches:
import re
text = "hello world, hello again"
matches = re.findall("hello", text)
for match in matches:
    print(f"Found: {match}")
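re.findall() returns only the matched text. When you also need to know where each match is, re.finditer() yields match objects that carry the positions. A small sketch with the same sample text:

```python
import re

text = "hello world, hello again"
for match in re.finditer("hello", text):
    print(f"Found {match.group()!r} at {match.start()}-{match.end()}")

positions = [m.start() for m in re.finditer("hello", text)]
print(positions)  # [0, 13]
```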
5. Case-sensitive Handling
To perform a case-insensitive search, you can use the re.IGNORECASE
flag:
matches = re.findall("hello", "Hello World, hello again", re.IGNORECASE)
6. Word Boundary Match
Python’s re library can be used to enforce word boundaries.
import re
text = "Hello, Hell, Hello"
pattern = r"\bHell\b"
match_count = len(re.findall(pattern, text))
print(f"Whole word 'Hell' found {match_count} times.")
7. Error/Exception Handling
Errors in the regular expression pattern will raise a re.error. You
can catch this with a try-except block:
import re
try:
    re.findall("hello[", "hello world") # This is an invalid pattern
except re.error:
    print("Invalid regular expression pattern.")
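When the text to search for comes from user input, you usually want it treated literally rather than as a pattern. re.escape() does that, which avoids both the re.error shown here and unintended matches from characters such as the dot. The helper below is a sketch; contains_literal and its ignore_case parameter are hypothetical names:

```python
import re

def contains_literal(substring, text, ignore_case=False):
    """Regex-based substring check that treats `substring` literally."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(re.escape(substring), text, flags) is not None

print(bool(re.search("a.b", "axb")))                          # True: "." matches any character
print(contains_literal("a.b", "axb"))                         # False: the dot is now literal
print(contains_literal("a.b", "see A.B", ignore_case=True))   # True
print(contains_literal("hello[", "hello[world]"))             # True, and no re.error
```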
8. Complex Matching
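With regular expressions, you can perform complex string pattern matching. For example, the following pattern finds every standalone word of exactly five letters:
import re
text = "Hello, world!"
result = re.findall(r"\b[a-zA-Z]{5}\b", text)
print(result) # Output: ['Hello', 'world']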
9. Different Regex Patterns
Here’s a table that outlines some commonly used regular expression patterns for various needs:
| Pattern | Description | Example Pattern | Example String | Match? |
|---|---|---|---|---|
| ^... | Checks if the string starts with the given pattern | ^Hello | Hello, world | Yes |
| ...$ | Checks if the string ends with the given pattern | world$ | Hello, world | Yes |
| . | Matches any character except a newline | H.llo | Hallo | Yes |
| [...] | Matches any character inside the brackets | [aeiou] | Hello | Yes |
| [^...] | Matches any character NOT inside the brackets | [^aeiou] | Hello | Yes |
| * | Matches 0 or more repetitions of the preceding character | He*llo | Hello | Yes |
| + | Matches 1 or more repetitions of the preceding character | He+llo | Hello | Yes |
| ? | Matches 0 or 1 repetition of the preceding character | He?llo | Hello | Yes |
| {m,n} | Matches between m and n repetitions of the preceding character | He{1,2}llo | Hello | Yes |
| \w | Matches any alphanumeric character or underscore | Hello\w | Hello1 | Yes |
| \W | Matches any character that is not alphanumeric or an underscore | Hello\W | Hello@ | Yes |
| \d | Matches any digit | Hello\d | Hello2 | Yes |
| \D | Matches any non-digit | Hello\D | HelloA | Yes |
| \s | Matches any whitespace character | Hello\sWorld | Hello World | Yes |
| \S | Matches any non-whitespace character | Hello\SWorld | Hello-World | Yes |
This table is not exhaustive, but it should give a good starting point for understanding how to use regular expressions for substring checks. For more detailed explanations and advanced patterns, you can consult the Python official documentation on regular expressions.
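When the same pattern is applied to many strings, it can be compiled once with re.compile() and reused. The sketch below combines a word boundary with the re.IGNORECASE flag; the sample lines and the name error_pattern are made up for illustration:

```python
import re

# Compile once, then reuse the pattern object for every line
error_pattern = re.compile(r"\berror\b", re.IGNORECASE)

lines = ["INFO - started", "ERROR - File not found", "INFO - no errors reported"]
matching = [line for line in lines if error_pattern.search(line)]
print(matching)  # ['ERROR - File not found']
```

The last line is skipped because “errors” is a different word, which a plain substring check would not distinguish.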
Examples of Special Cases
1. Checking for Multiple Substrings Simultaneously
You can check for the presence of multiple substrings by using a loop or a comprehension.
text = "Hello, world! How are you?"
substrings = ["Hello", "world", "you"]
result = all(sub in text for sub in substrings)
print(result) # Output: True
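all() answers “are all of them present?”. If you instead need to know whether at least one is present, or exactly which ones were found, the same idea works with any() and a list comprehension. A short sketch with a modified substring list:

```python
text = "Hello, world! How are you?"
substrings = ["Hello", "Python", "you"]

has_any = any(sub in text for sub in substrings)
has_all = all(sub in text for sub in substrings)
found = [sub for sub in substrings if sub in text]
missing = [sub for sub in substrings if sub not in text]

print(has_any, has_all)  # True False
print(found, missing)    # ['Hello', 'you'] ['Python']
```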
2. Finding Overlapping Substrings
The built-in methods do not account for overlapping substrings. However, you can find overlapping substrings by manipulating the index.
text = "abababa"
substring = "aba"
start = 0
while start < len(text):
    start = text.find(substring, start)
    if start == -1:
        break
    print(f"Found at index: {start}")
    start += 1  # Advance by 1, not len(substring), so overlapping matches are found
This will output:
Found at index: 0
Found at index: 2
Found at index: 4
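Regular expressions offer another route to the same result. A lookahead assertion matches at a position without consuming any characters, so the next search can begin one character later and overlapping occurrences are all reported. This is a sketch of that idea rather than a recommendation over the loop:

```python
import re

text = "abababa"
substring = "aba"

# (?=...) matches at a position without consuming characters
pattern = re.compile(f"(?={re.escape(substring)})")
positions = [m.start() for m in pattern.finditer(text)]
print(positions)  # [0, 2, 4]
```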
3. Non-contiguous Substring Match (Subsequence Match)
A non-contiguous substring is also known as a subsequence. Finding a subsequence involves searching for the characters of the substring, not necessarily adjacent to each other, but appearing in the same order.
def is_subsequence(sub, text):
    it = iter(text)
    return all(c in it for c in sub)
text = "Hello, world!"
sub = "Hlo"
result = is_subsequence(sub, text)
print(result) # Output: True
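Here, sub is a non-contiguous substring (or a subsequence) of text, and the function returns True. The trick is that c in it consumes the iterator up to and including the first matching character, so each subsequent character of sub is searched for only in the remainder of text.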
Use-Cases for Experienced Programmers
1. Text Parsing
Example: Imagine you have a log file that contains lines like:
INFO - User logged in
ERROR - File not found
INFO - User logged out
You might want to extract only the lines that contain “ERROR” to diagnose issues.
with open("logfile.log", "r") as file:
    error_lines = [line.strip() for line in file if "ERROR" in line]
print(error_lines)
This would give you a list of lines that contain the substring “ERROR”, allowing for quick diagnostics.
2. Web Scraping
Example: Let’s say you’re scraping a webpage and want to extract all
URLs. URLs are often contained within href attributes of anchor tags.
You might use Beautiful Soup and Python like so:
from bs4 import BeautifulSoup
import requests
response = requests.get("https://example.com")
soup = BeautifulSoup(response.text, 'html.parser')
urls = [a['href'] for a in soup.find_all('a', href=True) if "http" in a['href']]
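Here, we’re looking for the substring “http” within each href attribute to skip relative links and page fragments. Note that this is a loose filter: it would also keep a relative path that merely contains “http” somewhere in it.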
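A stricter filter checks how the link starts, or resolves it against the page URL first. The sketch below works on a hard-coded list of href values instead of a live page, so it runs without network access; base_url and the sample links are assumptions made for illustration:

```python
from urllib.parse import urljoin, urlparse

base_url = "https://example.com/docs/"
hrefs = ["https://example.com/a", "/about", "#top",
         "mailto:me@example.com", "guide/http-basics"]

# Option 1: keep only links that are already absolute http(s) URLs
absolute = [h for h in hrefs if h.startswith(("http://", "https://"))]

# Option 2: resolve every link against the page URL, then keep the http(s) ones
resolved = [urljoin(base_url, h) for h in hrefs]
web_links = [u for u in resolved if urlparse(u).scheme in ("http", "https")]

print(absolute)
print(web_links)
```

Notice that "guide/http-basics" contains “http” but is a relative path; the startswith() check excludes it, while the second option turns it into a full URL.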
3. Data Transformation
Example: You have a CSV file with a column named “Full Name” and you want to split it into two separate columns, “First Name” and “Last Name”.
import csv
new_rows = []
with open('names.csv', 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    header = next(csvreader)
    header.extend(["First Name", "Last Name"])
    new_rows.append(header)
    for row in csvreader:
        full_name = row[0]
        # Split at the first space only, so "Mary Ann Smith" still yields two parts
        first_name, last_name = full_name.split(' ', 1)
        row.extend([first_name, last_name])
        new_rows.append(row)
with open('new_names.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerows(new_rows)
In this example, we used the split() method, which searches for the space substring and splits the name at its first occurrence. The code assumes that “Full Name” is the first column and that every name contains at least one space.
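Real name data is rarely that tidy. If some rows may hold a single word, str.partition() is a more forgiving alternative because it always returns three parts and never raises. A minimal sketch, with split_full_name as a hypothetical helper name:

```python
def split_full_name(full_name):
    """Split a name into (first, last) at the first space."""
    first, _, last = full_name.strip().partition(" ")
    return first, last.strip()

print(split_full_name("Ada Lovelace"))     # ('Ada', 'Lovelace')
print(split_full_name("Mary Ann Smith"))   # ('Mary', 'Ann Smith')
print(split_full_name("Cher"))             # ('Cher', '')
```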
4. String Localization
Example: Assume you’re localizing a video game and you need to replace all occurrences of “Health” with its Spanish equivalent “Salud”.
text = "Health is the most important asset."
localized_text = text.replace("Health", "Salud")
Here, you used the replace() method, which internally checks for the
substring “Health” and replaces it with “Salud”.
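Because replace() works on raw substrings, it also rewrites longer words that happen to contain the search term. If that is a concern, a word-boundary pattern with re.sub() limits the replacement to whole words. A short sketch with an extended sample sentence:

```python
import re

text = "Health is the most important asset. Stay Healthy!"

naive = text.replace("Health", "Salud")
whole_word = re.sub(r"\bHealth\b", "Salud", text)

print(naive)       # Salud is the most important asset. Stay Saludy!
print(whole_word)  # Salud is the most important asset. Stay Healthy!
```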
Common Pitfalls and How to Avoid Them
1. Off-by-One Errors
What it is: Off-by-one errors occur when an index or range boundary is wrong by exactly one position, for example by including one character too many or too few in a slice.
How to Avoid: Always double-check the start and end indices. Python uses zero-based indexing, and slices exclude the end index, both of which can be a source of confusion.
Example: A related indexing mistake involves str.find(), which returns -1 if the substring is not found. Because -1 is a valid index in Python (it refers to the last character), using the result directly as an index fails silently.
text = "Hello"
index = text.find("World")
new_text = text[:index] # index is -1, so this silently drops the last character: "Hell"
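The fix is to check for -1 before slicing. Here is one way to package that check; text_before is a hypothetical helper name, and returning the whole text when the marker is absent is a design choice for this sketch:

```python
def text_before(text, marker):
    """Return the part of `text` before `marker`, or all of `text` if absent."""
    index = text.find(marker)
    if index == -1:
        return text
    return text[:index]

print(text_before("Hello World", " World"))  # Hello
print(text_before("Hello", "World"))         # Hello
```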
2. Character Encoding Issues
What it is: Python 3 strings are sequences of Unicode characters, but the bytes you read from files or the network are encoded as ASCII, UTF-8, UTF-16, and so on. Decoding them with the wrong encoding produces garbled text, and substring checks against it give incorrect results.
How to Avoid: Always specify encoding where applicable, especially when reading from or writing to files.
with open("file.txt", "r", encoding="utf-8") as file:
    text = file.read()
if "special_character" in text:
    print("Found!")
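A related issue that correct decoding alone does not solve is that Unicode can represent the same visible character in more than one way. The sketch below shows two spellings of “café” that look identical but do not match until both are normalized with unicodedata.normalize(); nfc is a hypothetical helper name:

```python
import unicodedata

composed = "caf\u00e9"       # "é" as one code point
decomposed = "cafe\u0301"    # "e" followed by a combining acute accent

print("\u00e9" in decomposed)  # False, although both render as "café"

def nfc(s):
    """Normalize a string to the composed (NFC) form."""
    return unicodedata.normalize("NFC", s)

print(nfc("\u00e9") in nfc(decomposed))  # True
print(nfc(composed) == nfc(decomposed))  # True
```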
3. Null and Empty Strings
What it is: An empty string ("") or a None value can slip into a substring check. An empty string is contained in every string, so "" in text is always True, while None in text raises a TypeError.
How to Avoid: Always validate the string and the substring before performing checks.
text = "Hello"
substring = None # Or it could be an empty string ""
if substring:
    if substring in text:
        print("Found!")
else:
    print("Substring is empty or None!")
Tips and Best Practices
1. Pre-Check for String Lengths
What it is: If the substring you’re searching for is longer than the string you’re searching in, you can avoid running the search operation entirely.
Why it’s useful: Python’s built-in search already handles this case quickly, so the pre-check rarely saves measurable time on its own. It is mainly useful for skipping more expensive work, such as a regular expression or a custom search routine, and for reporting a clearer message.
text = "Hello"
substring = "Hello World"
if len(substring) <= len(text):
    if substring in text:
        print("Found!")
else:
    print("Cannot find, substring is longer than text.")
2. Use Built-in Functions When Possible for Better Performance
What it is: Python’s built-in in operator and string methods such as find() and index() are generally optimized and faster than custom search code for common cases.
Why it’s useful: Built-in functions are well-tested, optimized, and lead to cleaner, more maintainable code.
# Using built-in `in` for simplicity and performance
if "ell" in "Hello":
    print("Found!")
3. When to Use Advanced Algorithms
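What it is: For specialized use-cases, consider more advanced string-matching algorithms such as KMP or Boyer-Moore. These algorithms offer better performance characteristics for specific types of string matching problems.
Why it’s useful: If you’re working on a performance-critical application like a search engine or text editor, a more advanced algorithm can make a significant difference. Keep in mind that the built-in search is implemented in C, so a pure-Python implementation is usually slower on ordinary strings; measure before switching.
# Boyer-Moore-Horspool: a simplified Boyer-Moore variant (bad-character rule only)
def horspool_find(text, pattern):
    m = len(pattern)
    # For each character except the last: distance from its last occurrence to the pattern end
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= len(text) - m:
        if text[i:i + m] == pattern:
            return i  # index of the first match
        i += shift.get(text[i + m - 1], m)  # skip ahead based on the window's last character
    return -1
# Use only for large texts and when the standard methods are too slow
if horspool_find("a long haystack with a needle in it", "needle") != -1:
    print("Found using Boyer-Moore-Horspool!")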
Troubleshooting Common Errors
1. TypeError When Using Incompatible Types
What it is: Attempting to search for a non-string type within a
string will raise a TypeError.
Why it happens: The in operator, as well as methods like
str.find() and str.index(), are designed to work with string data
types. Providing an incompatible data type like an integer or a list
will throw an error.
# This will raise a TypeError
try:
    result = 42 in "Hello, World!"
except TypeError:
    print("TypeError: You cannot search for an integer in a string.")
# This will also raise a TypeError
try:
    result = ['H', 'e'] in "Hello, World!"
except TypeError:
    print("TypeError: You cannot search for a list in a string.")
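If the value you are looking for is not a string, such as a number read from another source, the usual fix is to convert it with str() before searching. A minimal sketch; the contains helper and the sample order text are made up for illustration:

```python
text = "Order 42 shipped"
order_id = 42

# Convert the value to a string before searching
print(str(order_id) in text)  # True

def contains(text, value):
    """Substring check that accepts any value by converting it with str()."""
    return str(value) in text

print(contains(text, 42))    # True
print(contains(text, 4.5))   # False
```

Be aware that this is still a plain substring check, so searching for 4 would also succeed here because “42” contains “4”.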
2. ValueError from str.index()
What it is: Using the str.index() method will raise a ValueError
if the substring is not found in the string.
Why it happens: Unlike str.find(), which returns -1 when it
doesn’t find the substring, str.index() throws a ValueError to
indicate the absence of the substring.
# This will raise a ValueError
try:
    position = "Hello, World!".index("Python")
except ValueError:
    print("ValueError: substring not found.")
Frequently Asked Questions (FAQ)
What’s the difference between str.find() and str.index()?
str.find() returns -1 when the substring is not found, while
str.index() throws a ValueError.
Is the in operator case-sensitive?
Yes, the in operator is case-sensitive when checking if a Python
string contains a substring.
How can I make my substring search case-insensitive?
You can convert both the string and the substring to lower (or upper) case before performing the search.
Can I search for multiple substrings at once?
Not directly, but you can use a loop or list comprehension along with
the in operator to check for multiple substrings.
Is it possible to find overlapping substrings?
Yes, but not with a single built-in call. You can advance the start position of str.find() by one after each match, or use a regular expression with a lookahead.
What is the time complexity of the in operator?
For substring search, the worst-case time complexity is O(N*M), where N and M are the lengths of the string and substring, respectively. In practice, Python’s optimized search usually runs much closer to O(N).
How do I find the starting index of a substring?
You can use str.find() or str.index() methods to find the starting
index.
How do I count the occurrences of a substring?
Use the str.count() method to count the occurrences of a substring in
a string.
Can I use wildcards in my substring search?
Not with the built-in methods, but you can achieve this using regular expressions.
What should I do if I get a TypeError or ValueError?
A TypeError usually indicates that you’re trying to search for an
incompatible data type. A ValueError from str.index() means the
substring was not found. Validate your inputs and handle exceptions
accordingly.
Summary
In this article, we’ve explored multiple methods to determine whether a
Python string contains a specific substring. From the straightforward
in operator to str.find(), str.index(), and str.count(), as well as
regular expressions, there’s a range of techniques suited to different
scenarios. Each method has its own trade-offs, and which one fits best
depends on what your use case needs: a simple yes-or-no answer, a
position, a count, or flexible pattern matching.
Key Takeaways
- For simple substring checks, the in operator is the most straightforward approach.
- Use .lower() or .upper() for case-insensitive checks.
- The str.find() method returns the start index of the substring, or -1 if not found.
- The str.index() method is like str.find(), but raises a ValueError if the substring is not found.
- The str.count() method can be used to count the occurrences of a substring.
- For more advanced substring matching, including word boundaries and pattern recognition, regular expressions offer a powerful alternative.
- Be aware of common pitfalls, like off-by-one errors and character encoding issues.

![Check if Python String contains Substring [5 Methods]](/python-string-contains-substring/python_strings.jpg)
