RegEx(Regular Expression) in Python

Hey there! In this guide, we'll explore RegEx or Regular Expression in Python. A Regular Expression or RegEx is a special sequence of characters that uses a search pattern to find a string or set of strings. Let's dive in!

Regular Expression

Regular expressions (regex) are a powerful tool for working with text by allowing you to define patterns that match specific character sequences.Regex is commonly used for tasks like:

Searching for specific patterns within a string
Validating inputs (e.g., email addresses, phone numbers)
Extracting or replacing parts of text

1. RegEx Module in Python

Python has a built-in module named re that is used for regular expressions in Python. We can import this module by using the import statement.

Example:

# importing re module
import re

After importing the module you can start using the various function provided .

2. Metacharaters

Metacharacters are the characters with special meaning.These are used for pattern making. Below is a list of few Metacharacters with their descriptions

Metacharacter	Description
`.`	Matches any character except a newline.
`^`	Matches the start of a string.
`$`	Matches the end of a string.
`*`	Matches 0 or more repetitions of the preceding character or group.
`+`	Matches 1 or more repetitions of the preceding character or group.
`?`	Matches 0 or 1 occurrence of the preceding character or group (makes it optional).
`{}`	Specifies the exact number of repetitions `{n}`, a range `{n,m}`, or a minimum `{n,}`.
`[]`	Matches any single character inside the brackets.
`\|`	Acts as an OR operator between expressions (e.g., `a
`()`	Groups multiple expressions together and captures the matched text.
`\`	Escapes special characters (e.g., `\.` matches a literal `.`) or introduces special sequences like `\d`(We will learn `Special Sequences` in the next section).

3. Special Sequences

Special Sequence in regex are shorthand notations that match specific types of characters, like digits, whitespace, or word characters, making patterns more concise and readable. Table below shows various examples Special Sequence, their description and examples

Special Sequence	Description	Example
`\d`	Matches any digit (equivalent to `[0-9]`).	`\d{3}` matches `123` in `abc123`
`\D`	Matches any non-digit character.	`\D+` matches `abc` in `abc123`
`\w`	Matches any alphanumeric character or underscore.	`\w+` matches `abc123` in `abc123`
`\W`	Matches any non-alphanumeric character.	`\W+` matches `!@#` in `abc!@#`
`\s`	Matches any whitespace character (space, tab, newline).	`\s+` matches spaces in `a b c`
`\S`	Matches any non-whitespace character.	`\S+` matches `abc123` in `abc123`
`\b`	Matches a word boundary.	`\bword\b` matches `word` in `word!` but not in `sword`
`\B`	Matches a non-word boundary.	`\Bword\B` matches `word` in `swordplay` but not `word` alone
`\A`	Matches the start of the string.	`\Aabc` matches `abc` in `abc123`
`\Z`	Matches the end of the string.	`123\Z` matches `123` in `abc123`
`\`	Escapes special characters.	`\.` matches a literal `.` in `3.14`
`\t`	Matches a tab character.	`a\tb` matches `a b` (tab)
`\n`	Matches a newline character.	`a\nb` matches `a b` with newline

3. RegEx Functions

Regex functions in Python, available through the re module, provide powerful tools to search, match, split, replace, and manipulate text based on specified patterns, making it easy to handle complex text processing tasks.

Function	Description	Example Usage
`re.search()`	Searches for the first occurrence of the pattern within the string and returns a match object if found.	`re.search(r"\d+", "Age: 25")` finds `25`.
`re.match()`	Checks if the pattern matches only at the beginning of the string. Returns a match object if found.	`re.match(r"Hello", "Hello World")` matches `Hello`.
`re.findall()`	Finds all non-overlapping matches of the pattern in the string and returns them as a list.	`re.findall(r"\w+", "Hello World")` returns `['Hello', 'World']`.
`re.split()`	Splits the string by occurrences of the pattern and returns a list of substrings.	`re.split(r"\s", "Split this sentence")` returns `['Split', 'this', 'sentence']`.
`re.sub()`	Replaces occurrences of the pattern in the string with a specified replacement.	`re.sub(r"\d", "#", "Phone: 123-456")` returns `Phone: ###-###`.
`re.compile()`	Compiles a regex pattern into a regex object for reuse, which can then use `search()`, `match()`, etc.	`pattern = re.compile(r"\d+"); pattern.search("ID: 789")` finds `789`.

Example:

import re

pattern = r"\b[A-Za-z]+\b"  # matches any word consisting of letters only
text = "Hello, Algo world!"
matches = re.findall(pattern, text)
print(matches)  # Output: ['Hello', 'Algo', 'world']

4. Practical Use Case: Password Validator

Here’s an example of a Python Program to check the validity of password input by the user.

import re

def validate_password(password):
    # Define the regex pattern for password validation
    pattern = r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
    
    # Check if the password matches the pattern
    if re.match(pattern, password):
        return "Password is valid"
    else:
        return "Password is invalid"

# Test the function
print(validate_password("Password123!"))  # Output: Password is valid
print(validate_password("Pass123"))       # Output: Password is invalid (too short, no special character)

Regular Expression​

1. RegEx Module in Python​

2. Metacharaters​

3. Special Sequences​

3. RegEx Functions​

4. Practical Use Case: Password Validator​