How to match text at the start or end of a string in Python?


Problem

Assume you need to check the start or end of a string for specific text patterns. Filename extensions are a common case, but the pattern can be anything. I will show you a few ways to do this with the startswith() and endswith() string methods.

startswith() method

A simple way to check the beginning of a string is to use the startswith() method.

Example

text = "Is USA colder than Australia?"
print(f"output \n {text.startswith('Is')}")

Output

True

Example

filename = "Hello_world.txt"
print(f"output \n {filename.startswith('Hello')}")

Output

True

Example

site_url = 'https://www.something.com'
print(f"output \n {site_url.startswith('http:')}")

Output

False

Example

print(f"output \n {site_url.startswith('https:')}")

Output

True

endswith() method

A simple way to check the ending of a string is to use the endswith() method.

Example

text = "Is USA colder than Australia?"
print(f"output \n {text.endswith('?')}")

Output

True

Example

filename = "Hello_world.txt"
print(f"output \n {filename.endswith('.txt')}")

Output

True
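
Both methods compare characters exactly, so the match is case-sensitive. The following is a minimal sketch, assuming you want to ignore case: it lowercases the string before checking. The file name is made up for illustration.

```python
filename = "REPORT.TXT"  # hypothetical file name for illustration

# The comparison is case-sensitive, so the lowercase suffix does not match
exact_match = filename.endswith('.txt')

# Lowercasing the string first makes the check case-insensitive
relaxed_match = filename.lower().endswith('.txt')

print(exact_match)    # False
print(relaxed_match)  # True
```

The same approach works with startswith().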

To check for multiple choices with the above methods, pass the choices as a tuple. A common use is checking file extensions. Say we need to find ".txt" and ".csv" files in a directory.

import os
filenames = os.listdir('.')
# Let us first check whether any .csv or .txt files exist
print(f"output \n {any(name.endswith(('.csv', '.txt')) for name in filenames)}")

Output

True

Example

[name for name in filenames if name.endswith(('.csv', '.txt'))]

Output

['file1.csv',
'HRDataset.csv',
'Input.csv',
'input.txt',
'input_copy.txt',
'movies_data.csv',
'my_html_data_to_csv.csv',
'temporary_file1_for_zip.csv',
'temporary_file2_for_zip.csv',
'test.csv',
'test1.txt',
'test2.txt',
'tmdb_5000_movies.csv']
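
The listing above depends on the contents of the current directory, so your result will differ. As a reproducible sketch of the same tuple check, the example below uses a hard-coded list of made-up file names instead of os.listdir('.').

```python
# Hypothetical file names, used instead of os.listdir('.') so the result is reproducible
filenames = ['notes.txt', 'data.csv', 'photo.png', 'script.py']

# A tuple of suffixes matches if the name ends with any one of them
has_match = any(name.endswith(('.csv', '.txt')) for name in filenames)
matches = [name for name in filenames if name.endswith(('.csv', '.txt'))]

print(has_match)  # True
print(matches)    # ['notes.txt', 'data.csv']
```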

Remember that these methods accept only a string or a tuple of strings. If your choices are in a list, you need to convert the list to a tuple first.

import os

# list with choices
patters = ['.csv','.txt']

# get the file names
filenames = os.listdir('.')

# Let us first check if there are files
any(name.endswith(patters) for name in filenames)

Output

Traceback (most recent call last):
  ...
    any(name.endswith(patters) for name in filenames)
TypeError: endswith first arg must be str or a tuple of str, not list

The above command raised a TypeError because endswith() does not accept a list, so we need to convert the list into a tuple.

Example

# Let us first check if there are files
any(name.endswith(tuple(patters)) for name in filenames)

Output

True

Similarly, we need to convert the list to a tuple to get the file names.

Example

[name for name in filenames if name.endswith(tuple(patters)) ]

Output

['file1.csv',
'HRDataset.csv',
'Input.csv',
'input.txt',
'input_copy.txt',
'movies_data.csv',
'my_html_data_to_csv.csv',
'temporary_file1_for_zip.csv',
'temporary_file2_for_zip.csv',
'test.csv',
'test1.txt',
'test2.txt',
'tmdb_5000_movies.csv']
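
If you convert lists to tuples in many places, one option is to wrap the conversion in a small helper. The function below is only a sketch. The name ends_with_any is hypothetical, and it assumes that patterns is a collection of strings, not a single string.

```python
def ends_with_any(name, patterns):
    """Return True if name ends with any of the given patterns.

    patterns may be a list, set, or any other iterable of strings;
    it is converted to a tuple because endswith() rejects other containers.
    """
    return name.endswith(tuple(patterns))


suffixes = ['.csv', '.txt']  # a list of choices, as in the example above
print(ends_with_any('input.txt', suffixes))  # True
print(ends_with_any('photo.png', suffixes))  # False
```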

Finally, the startswith() and endswith() methods look nice when combined with other operations, such as common data reductions. For example:

Example

if any(name.endswith(tuple(patters)) for name in filenames):
    pass  # perform the logic here
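
As a further sketch of combining these checks with reductions, the example below uses any() to test whether at least one file matches and sum() to count the matches. The file names are made up for illustration.

```python
# Hypothetical file names for illustration
filenames = ['file1.csv', 'input.txt', 'photo.png', 'test.csv']
suffixes = ('.csv', '.txt')

# any(): is there at least one matching file?
found = any(name.endswith(suffixes) for name in filenames)

# sum(): how many files match? (True counts as 1, False as 0)
count = sum(name.endswith(suffixes) for name in filenames)

if found:
    print(f"{count} matching file(s) found")  # 3 matching file(s) found
```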

Updated on: 10-Nov-2020
