Skip to main content

RegEx(Regular Expression) in Python

Hey there! In this guide, we'll explore RegEx or Regular Expression in Python. A Regular Expression or RegEx is a special sequence of characters that uses a search pattern to find a string or set of strings. Let's dive in!


Regular Expression​

Regular expressions (regex) are a powerful tool for working with text by allowing you to define patterns that match specific character sequences.Regex is commonly used for tasks like:

  • Searching for specific patterns within a string
  • Validating inputs (e.g., email addresses, phone numbers)
  • Extracting or replacing parts of text

1. RegEx Module in Python​

Python has a built-in module named re that is used for regular expressions in Python. We can import this module by using the import statement.

Example:

# importing re module
import re

After importing the module you can start using the various function provided .

2. Metacharaters​

Metacharacters are the characters with special meaning.These are used for pattern making. Below is a list of few Metacharacters with their descriptions

MetacharacterDescription
.Matches any character except a newline.
^Matches the start of a string.
$Matches the end of a string.
*Matches 0 or more repetitions of the preceding character or group.
+Matches 1 or more repetitions of the preceding character or group.
?Matches 0 or 1 occurrence of the preceding character or group (makes it optional).
{}Specifies the exact number of repetitions {n}, a range {n,m}, or a minimum {n,}.
[]Matches any single character inside the brackets.
|Acts as an OR operator between expressions (e.g., `a
()Groups multiple expressions together and captures the matched text.
\Escapes special characters (e.g., \. matches a literal .) or introduces special sequences like \d(We will learn Special Sequences in the next section).

3. Special Sequences​

Special Sequence in regex are shorthand notations that match specific types of characters, like digits, whitespace, or word characters, making patterns more concise and readable. Table below shows various examples Special Sequence, their description and examples

Special SequenceDescriptionExample
\dMatches any digit (equivalent to [0-9]).\d{3} matches 123 in abc123
\DMatches any non-digit character.\D+ matches abc in abc123
\wMatches any alphanumeric character or underscore.\w+ matches abc123 in abc123
\WMatches any non-alphanumeric character.\W+ matches !@# in abc!@#
\sMatches any whitespace character (space, tab, newline).\s+ matches spaces in a b c
\SMatches any non-whitespace character.\S+ matches abc123 in abc123
\bMatches a word boundary.\bword\b matches word in word! but not in sword
\BMatches a non-word boundary.\Bword\B matches word in swordplay but not word alone
\AMatches the start of the string.\Aabc matches abc in abc123
\ZMatches the end of the string.123\Z matches 123 in abc123
\Escapes special characters.\. matches a literal . in 3.14
\tMatches a tab character.a\tb matches a b (tab)
\nMatches a newline character.a\nb matches a b with newline

3. RegEx Functions​

Regex functions in Python, available through the re module, provide powerful tools to search, match, split, replace, and manipulate text based on specified patterns, making it easy to handle complex text processing tasks.

FunctionDescriptionExample Usage
re.search()Searches for the first occurrence of the pattern within the string and returns a match object if found.re.search(r"\d+", "Age: 25") finds 25.
re.match()Checks if the pattern matches only at the beginning of the string. Returns a match object if found.re.match(r"Hello", "Hello World") matches Hello.
re.findall()Finds all non-overlapping matches of the pattern in the string and returns them as a list.re.findall(r"\w+", "Hello World") returns ['Hello', 'World'].
re.split()Splits the string by occurrences of the pattern and returns a list of substrings.re.split(r"\s", "Split this sentence") returns ['Split', 'this', 'sentence'].
re.sub()Replaces occurrences of the pattern in the string with a specified replacement.re.sub(r"\d", "#", "Phone: 123-456") returns Phone: ###-###.
re.compile()Compiles a regex pattern into a regex object for reuse, which can then use search(), match(), etc.pattern = re.compile(r"\d+"); pattern.search("ID: 789") finds 789.

Example:

import re

pattern = r"\b[A-Za-z]+\b" # matches any word consisting of letters only
text = "Hello, Algo world!"
matches = re.findall(pattern, text)
print(matches) # Output: ['Hello', 'Algo', 'world']

4. Practical Use Case: Password Validator​

Here’s an example of a Python Program to check the validity of password input by the user.

import re

def validate_password(password):
# Define the regex pattern for password validation
pattern = r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"

# Check if the password matches the pattern
if re.match(pattern, password):
return "Password is valid"
else:
return "Password is invalid"

# Test the function
print(validate_password("Password123!")) # Output: Password is valid
print(validate_password("Pass123")) # Output: Password is invalid (too short, no special character)