Regular Expressions in Python

Regular expressions let us find content inside strings matching a particular format.

By formulating a regular expression with a special syntax, you can search a string for text and extract the parts of it you need.

The re Python standard library module gives us a set of tools to work with regular expressions.

In particular, among others, it offers us the re.match() and re.search() functions.

Both take 3 parameters: the pattern, the string to search, and optional flags.
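
To make the difference between the two concrete, here is a minimal sketch. It relies on the documented behavior of the re module: re.match() only matches at the beginning of the string, while re.search() scans the whole string and returns the first match. The sample sentence is the one used in the examples of this post.

```python
import re

text = 'My dog name is Roger'

# re.match() only looks at the start of the string, so this finds nothing
at_start = re.match(r'Roger', text)

# re.search() scans the string and stops at the first match
anywhere = re.search(r'Roger', text)

print(at_start)          # None
print(anywhere.group())  # Roger
```

Both functions return None when nothing matches, so it is worth checking the result before calling methods on it.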

Before talking about how to use them, let’s introduce the basics of a regular expression pattern.

The pattern is a string, usually written as a raw string literal with the r'' prefix so that backslashes reach the regular expression engine unchanged. Inside it, we can use special combinations of characters to capture the values we want.
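
As a quick sketch of why the r prefix helps: in a raw string Python leaves backslashes alone, so sequences like \d do not need to be doubled. The sample text 'Roger is 8' is made up for this illustration.

```python
import re

# In a normal string, '\n' is a single newline character
normal_length = len('\n')
# In a raw string, r'\n' is two characters: a backslash and an "n"
raw_length = len(r'\n')

print(normal_length)  # 1
print(raw_length)     # 2

# Without the r prefix the backslash has to be doubled;
# with it, the pattern can be written as is
doubled = re.search('\\d+', 'Roger is 8').group()
raw = re.search(r'\d+', 'Roger is 8').group()

print(doubled)  # 8
print(raw)      # 8
```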

For example:

Square brackets define a set of characters, any one of which can match: [\d\sa] matches a digit, a whitespace character, or the character a. [a-z] matches any character from a to z.
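
Here is a small sketch of the two sets described above. It uses re.findall(), which is not covered in this post, only as a convenient way to list every match; the input strings are made up.

```python
import re

# [\d\sa] matches a digit, a whitespace character, or the letter "a"
mixed = re.findall(r'[\d\sa]', 'a1 b2')

# [a-z] matches one lowercase letter from a to z
lowercase = re.findall(r'[a-z]', 'Roger 42')

print(mixed)      # ['a', '1', ' ', '2']
print(lowercase)  # ['o', 'g', 'e', 'r']
```

Note that the uppercase R is not in the a-z range, so it is skipped.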

\ escapes a special character. For example, to match a literal dot ., use \. in your pattern.

| means or: a|b matches either a or b.
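
A short sketch of escaping and alternation, with made-up input strings. It assumes the standard meaning of an unescaped dot, which matches any character except a newline.

```python
import re

# An unescaped dot matches any character, so "3x14" matches too
unescaped = bool(re.search(r'3.14', '3x14'))

# An escaped dot only matches a literal "."
escaped_wrong = bool(re.search(r'3\.14', '3x14'))
escaped_right = bool(re.search(r'3\.14', '3.14'))

print(unescaped)      # True
print(escaped_wrong)  # False
print(escaped_right)  # True

# | matches either alternative
pet = re.search(r'cat|dog', 'My dog name is Roger').group()
print(pet)  # dog
```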

Then we have anchors:
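
As a minimal sketch, here are the two most common anchors in the re module: ^ ties the pattern to the start of the string, and $ ties it to the end. This is an illustration of those two only, not a complete list.

```python
import re

text = 'My dog name is Roger'

# ^ anchors the pattern to the start of the string
starts_with_my = bool(re.search(r'^My', text))
starts_with_dog = bool(re.search(r'^dog', text))

# $ anchors the pattern to the end of the string
ends_with_roger = bool(re.search(r'Roger$', text))
ends_with_dog = bool(re.search(r'dog$', text))

print(starts_with_my)   # True
print(starts_with_dog)  # False
print(ends_with_roger)  # True
print(ends_with_dog)    # False
```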

Then we have quantity modifiers:
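
The sketch below shows the standard quantity modifiers of the re module: ? (zero or one), * (zero or more), + (one or more), {n} (exactly n) and {n,m} (between n and m). It uses re.findall(), which is not covered in this post, to list every match, and the input strings are made up.

```python
import re

letters = 'a ab abb'

# ? : the "b" may appear zero or one time
zero_or_one = re.findall(r'ab?', letters)
# * : the "b" may appear zero or more times
zero_or_more = re.findall(r'ab*', letters)
# + : the "b" must appear one or more times
one_or_more = re.findall(r'ab+', letters)

print(zero_or_one)   # ['a', 'ab', 'ab']
print(zero_or_more)  # ['a', 'ab', 'abb']
print(one_or_more)   # ['ab', 'abb']

digits = 'a1 b22 c333'

# {2} : exactly two digits
exactly_two = re.findall(r'\d{2}', digits)
# {2,3} : between two and three digits
two_to_three = re.findall(r'\d{2,3}', digits)

print(exactly_two)   # ['22', '33']
print(two_to_three)  # ['22', '333']
```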

Parentheses, (<expression>), create a group. Groups are useful because we can capture the content they match.
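
As a small sketch of capturing, the pattern below has two groups. The sentence and the idea of splitting a time into hours and minutes are made up for the illustration; group() and groups() are methods of the match object returned by the re module.

```python
import re

# Two groups: one for the hours, one for the minutes
result = re.search(r'(\d+):(\d+)', 'The walk starts at 10:30')

print(result.group())   # 10:30
print(result.group(1))  # 10
print(result.group(2))  # 30
print(result.groups())  # ('10', '30')
```

Groups are numbered from 1, in the order of their opening parenthesis; group() without arguments returns the whole match.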

These 2 examples match the whole string:

import re
re.match(r'^.*Roger', 'My dog name is Roger')
re.match(r'.*', 'My dog name is Roger')

Printing the result of either statement gives output like this:

<re.Match object; span=(0, 20), match='My dog name is Roger'>

If you assign the result to a result variable and call group() on it, you will see the match:

result = re.match('^.*Roger', 'My dog name is Roger')
print(result.group())
# My dog name is Roger

Let’s try to get the name of the dog. If you don’t know the name in advance, you can look for “name is ” and then add a group, like this:

result = re.search('name is (.*)', 'My dog name is Roger')

result.group() returns “name is Roger”, and result.group(1) returns the content of the group, “Roger”:

print(result.group())  # name is Roger
print(result.group(1)) # Roger
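
One thing to keep in mind: re.search() and re.match() return None when the pattern is not found, so calling group() on the result would raise an AttributeError. Here is a minimal sketch of a guard; dog_name is a hypothetical helper written for this example, not part of the re module.

```python
import re

def dog_name(sentence):
    """Return the captured name, or None when the pattern is not found."""
    result = re.search(r'name is (.*)', sentence)
    if result is None:
        return None
    return result.group(1)

found = dog_name('My dog name is Roger')
missing = dog_name('I have no pets')

print(found)    # Roger
print(missing)  # None
```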

I mentioned that re.search() and re.match() take flags as the 3rd parameter. There are a few possible flags; the most used is re.I, which performs a case-insensitive match.
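
A quick sketch of the re.I flag in action, using the same sentence as before. re.IGNORECASE is the longer name for the same flag.

```python
import re

text = 'My dog name is Roger'

# Without the flag, lowercase "roger" does not match "Roger"
case_sensitive = re.search(r'roger', text)

# With re.I the match ignores case
case_insensitive = re.search(r'roger', text, re.I)

print(case_sensitive)            # None
print(case_insensitive.group())  # Roger
```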

This is just an introduction to regular expressions. Starting from here, there are a lot of rabbit holes you can go into.

I recommend trying your regular expressions on https://regex101.com for correctness. Make sure you choose the Python flavor in the sidebar.
