Working With Regex Expressions In Python

This post will use the regular expressions module from the standard library to ... .

Prerequisites

  1. Familiarity with Pipenv. See here for my post on Pipenv.
  2. Familiarity with JupyterLab. See here for my post on JupyterLab.
  3. Familiarity with Regular Expressions.

Getting started

Let's create the hello-python-regex directory and install JupyterLab.

    # Make the `hello-python-regex` directory
    $ mkdir hello-python-regex
    $ cd hello-python-regex

    # Create a folder to hold the notebook
    $ mkdir docs

    # Init the virtual environment
    $ pipenv --three
    $ pipenv install --dev jupyterlab

At this stage, we can start up the notebook server.

    # Startup the notebook server
    $ pipenv run jupyter-lab
    # ... Server is now running on http://localhost:8888/lab

The server will now be up and running.

Creating the notebook

Once on http://localhost:8888/lab, create a new Python 3 notebook from the launcher.

Ensure that this notebook is saved in hello-python-regex/docs/regex.ipynb.

We will explore the following in each cell of the notebook:

  1. Importing the Regex module.
  2. A basic usage of the Regex module.
  3. String replacement with the Regex module.

Importing the module

The following imports the re module from the standard library and runs a quick search to confirm that it works.

    import re

    m = re.search("Hello, (.+)", "Hello, world!")
    m.group(1)
    # 'world!'
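Note that re.search returns None when nothing matches, so calling m.group(1) directly raises an AttributeError on a miss. A minimal sketch of guarding against that follows; the helper name greet_target is hypothetical and used only for illustration.

```python
import re


def greet_target(text):
    """Return the text captured after "Hello, ", or None if there is no match.

    `greet_target` is a hypothetical helper name used only for illustration.
    """
    m = re.search("Hello, (.+)", text)
    # re.search returns None on a miss, so check before calling .group()
    if m is None:
        return None
    return m.group(1)


print(greet_target("Hello, world!"))  # world!
print(greet_target("Goodbye, world!"))  # None
```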

Basic usage of the module

The re module provides several useful functions, which we will demonstrate:

  1. Searching strings.
  2. Matching strings.
  3. Usage without compile.
  4. Splitting a string into a list.
  5. Replacing matches.

Searching strings

    import re

    pattern = re.compile("ello, (.+)")

    # search() scans the string and returns the first match it finds
    m = pattern.search("Hello, world!")
    print(m) # <re.Match object; span=(1, 13), match='ello, world!'>
    print(m.group(1)) # world!

    # The optional second argument is the index to start searching from
    n = pattern.search("Hello, world!", 0)
    print(n) # <re.Match object; span=(1, 13), match='ello, world!'>
    print(n.group(1)) # world!
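The second argument to a compiled pattern's search method is the position to start scanning from. As a small sketch with a made-up input string, moving that position past the first match returns the next one:

```python
import re

# (.+?) is lazy, so each match stops at the first "!" it reaches
pattern = re.compile("ello, (.+?)!")
text = "Hello, world! Hello, Python!"

first = pattern.search(text)
print(first.span())  # (1, 13)
print(first.group(1))  # world

# Start scanning where the first match ended to find the next match
second = pattern.search(text, first.end())
print(second.span())  # (15, 28)
print(second.group(1))  # Python
```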

Matching strings

    pattern = re.compile("ello, (.+)")
    m = pattern.match("Hello, world!")
    # No match: match() only matches at the start of the string,
    # and "e" is the 2nd character of "Hello, world!".
    print(m) # None

    pattern = re.compile("Hello, (.+)")
    # Does match
    n = pattern.match("Hello, world!")
    print(n) # <re.Match object; span=(0, 13), match='Hello, world!'>
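To put the difference side by side, here is a short sketch comparing match, search and fullmatch on the same input. fullmatch is not covered in this post; it is included only to show the strictest of the three.

```python
import re

text = "Hello, world!"

# match() anchors at the start of the string
anchored = re.match("world", text)
print(anchored)  # None

# search() finds the pattern anywhere in the string
anywhere = re.search("world", text)
print(anywhere.span())  # (7, 12)

# fullmatch() requires the pattern to cover the entire string
partial = re.fullmatch("Hello", text)
print(partial)  # None
whole = re.fullmatch("Hello, .+", text)
print(whole.group(0))  # Hello, world!
```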

Usage without compile

When you call re.match and re.search as module-level functions, you pass the pattern as the first argument and skip the explicit compile step:

    m = re.match("Hello, (.+)", "Hello, world!")
    print(m) # <re.Match object; span=(0, 13), match='Hello, world!'>

    n = re.search("Hello, (.+)", "Hello, world!")
    print(n) # <re.Match object; span=(0, 13), match='Hello, world!'>
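Both styles give the same result. Compiling tends to be worthwhile mainly when the same pattern is reused many times. A small sketch, with a made-up list of greetings as input:

```python
import re

greetings = ["Hello, world!", "Hello, Python!", "Goodbye, world!"]

# Compile once, then reuse the pattern object across many strings
pattern = re.compile("Hello, (.+)")
compiled = []
for s in greetings:
    m = pattern.match(s)
    if m:
        compiled.append(m.group(1))

# Module-level function: the pattern is passed on every call
direct = []
for s in greetings:
    m = re.match("Hello, (.+)", s)
    if m:
        direct.append(m.group(1))

print(compiled)  # ['world!', 'Python!']
print(compiled == direct)  # True
```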

Splitting a string into a list

    m = re.split(",", "Hello, world!")
    print(m) # ['Hello', ' world!']

    # Use a raw string so that "\s" reaches the regex engine unchanged
    n = re.split(r"\s", "Hello beautiful world!")
    print(n) # ['Hello', 'beautiful', 'world!']
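The \s pattern splits on each individual whitespace character. A short sketch, using a made-up input with a double space, of how \s+ and the maxsplit argument change the result:

```python
import re

text = "Hello  beautiful world!"  # note the double space

# A single \s splits on every whitespace character, so a run of
# whitespace leaves an empty string in the result
single = re.split(r"\s", text)
print(single)  # ['Hello', '', 'beautiful', 'world!']

# \s+ treats a run of whitespace as one separator
runs = re.split(r"\s+", text)
print(runs)  # ['Hello', 'beautiful', 'world!']

# maxsplit limits the number of splits performed
limited = re.split(r"\s+", text, maxsplit=1)
print(limited)  # ['Hello', 'beautiful world!']
```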

Replacing matches

We can make use of the search and sub methods to replace matches.

    # Simple example
    target = "Photo credit by [@thomas](https://site.com/@thomas)"

    m = re.search(r"Photo credit by \[@(.+)\]\(https://site.com/@(.+)\)", target)
    # re.escape makes sub treat the captured text as a literal, not a pattern
    res = re.sub(re.escape(m.group(1)), "dennis", target)
    print(res) # Photo credit by [@dennis](https://site.com/@dennis)

    # Iterating over multiple captured groups
    target = "Photo credit by [@thomas](https://site.com/@user)"

    m = re.search(r"Photo credit by \[@(.+)\]\(https://site.com/@(.+)\)", target)
    res = target
    for val in m.groups():
        res = re.sub(re.escape(val), "dennis", res)
    print(res) # Photo credit by [@dennis](https://site.com/@dennis)
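One caveat with this approach: re.sub replaces the captured text wherever it appears, not only inside the credit. A brief sketch with a made-up sentence that mentions the name twice:

```python
import re

target = "thomas took this. Photo credit by [@thomas](https://site.com/@thomas)"

m = re.search(r"Photo credit by \[@(.+)\]\(https://site.com/@(.+)\)", target)
# The captured name is replaced everywhere, including outside the credit
res = re.sub(re.escape(m.group(1)), "dennis", target)
print(res)
# dennis took this. Photo credit by [@dennis](https://site.com/@dennis)
```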

For a more specific replacement (particularly in a large set of text), we can be more explicit with the string to replace:

    target = """
    Other words thomas and user we don't want to replace.
    Photo credit by [@thomas](https://site.com/@user)
    """

    new_name = "dennis"

    # Match the whole credit so that only that span is rewritten
    pattern = re.compile(r"Photo credit by \[@(.+)\]\(https://site.com/@(.+)\)")
    res = pattern.sub(f"Photo credit by [@{new_name}](https://site.com/@{new_name})", target)
    print(res)
    # Other words thomas and user we don't want to replace.
    # Photo credit by [@dennis](https://site.com/@dennis)
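If only part of the credit should change, the replacement string can refer back to a captured group. The sketch below is one possible variation rather than the approach used in this post: the group names name and handle are made up for illustration, and the character classes are a stricter alternative to (.+).

```python
import re

target = "Photo credit by [@thomas](https://site.com/@user)"

# Named groups (hypothetical names) make the replacement template easier to read.
# [^\]]+ and [^)]+ stop at the closing bracket instead of relying on backtracking.
pattern = re.compile(
    r"Photo credit by \[@(?P<name>[^\]]+)\]\(https://site\.com/@(?P<handle>[^)]+)\)"
)

# \g<handle> re-inserts the captured handle, so only the display name changes
res = pattern.sub(r"Photo credit by [@dennis](https://site.com/@\g<handle>)", target)
print(res)  # Photo credit by [@dennis](https://site.com/@user)
```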

Summary

Today's post demonstrated how to use the re module from the standard library to search, match, split and replace text in Python strings.

This can be very useful when working with text files.

Resources and further reading

Photo credit: pawel_czerwinski

Dennis O'Keeffe

@dennisokeeffe92
  • Melbourne, Australia

Hi, I am a professional Software Engineer. Formerly of Culture Amp, UsabilityHub, Present Company and NightGuru.
I am currently working on workingoutloud.dev, Den Dribbles and LandPad .
