Groups, Anchors & Lookarounds

Intermediate ⭐ 80 XP ⏱ 18 min #regex #groups #capture

Capture parts of a match, reuse them, and assert context with lookarounds.

📖Theory

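Groups ( ) capture part of a match so you can extract or reuse it. Captures are numbered from 1 in the order of their opening parentheses and are referenced as \1, \2 inside a pattern. Replacement strings use \1 in Python and $1 in flavors such as JavaScript. A group can also be named, e.g. (?P<year>\d{4}) in Python. A non-capturing group (?: ) groups without saving, which is useful when you only need to apply a quantifier or an alternation.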
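A minimal sketch of the difference between capturing and non-capturing groups, using Python's re module. The sample strings and variable names are made up for illustration.

```python
import re

date = "2026-06-15"

# Capturing groups: each ( ) is saved, numbered by its opening parenthesis.
captured = re.search(r"(\d{4})-(\d{2})-(\d{2})", date).groups()
print(captured)        # ('2026', '06', '15')

# Non-capturing group: (?: ) lets {2} apply to "-NN" but saves nothing,
# so only the year ends up in groups().
year_only = re.search(r"(\d{4})(?:-\d{2}){2}", date).groups()
print(year_only)       # ('2026',)

# The choice also changes what re.findall returns:
# with a capturing group it returns the group, otherwise the whole match.
with_capture = re.findall(r"(ab)+", "ababab ab")
without_capture = re.findall(r"(?:ab)+", "ababab ab")
print(with_capture)    # ['ab', 'ab']
print(without_capture) # ['ababab', 'ab']
```

The findall case is a common surprise: adding parentheses purely for grouping silently changes the result, which is one reason to prefer (?: ) when the captured text is not needed.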

Lookarounds assert context without consuming characters:

  • Lookahead (?=…) / negative (?!…) — “followed by” / “not followed by”
  • Lookbehind (?<=…) / negative (?<!…) — “preceded by” / “not preceded by”

These let you match something only when its surroundings qualify — e.g. a number followed by “USD” without including “USD” in the match.
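The lookbehind and negative forms listed above can be sketched as follows. This is an illustrative example with invented sample strings, not a complete treatment.

```python
import re

prices = "cost $40, fee €15, tip $5, id 77"

# Positive lookbehind: digits preceded by "$"; the "$" is not part of the match.
usd = re.findall(r"(?<=\$)\d+", prices)
print(usd)        # ['40', '5']

# Negative lookbehind: digits NOT preceded by a currency sign.
# \d is in the class too, so a match cannot start in the middle of a number.
plain = re.findall(r"(?<![$€\d])\d+", prices)
print(plain)      # ['77']

# Negative lookahead: file names whose extension is not "bak".
files = "file.txt file.bak notes.txt"
kept = re.findall(r"\b\w+\.(?!bak\b)\w+", files)
print(kept)       # ['file.txt', 'notes.txt']
```

One caveat worth knowing: Python's re module only accepts lookbehind patterns that match a fixed number of characters, so something like (?<=\$+) is rejected.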

🌍Real-World Example
import re

m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2026-06")
print(m.group("year"), m.group("month"))   # 2026 06

# Replace using captured groups: swap "First Last" -> "Last, First"
print(re.sub(r"(\w+)\s+(\w+)", r"\2, \1", "Ada Lovelace"))  # Lovelace, Ada

# Lookahead: amounts followed by USD (without matching USD)
print(re.findall(r"\d+(?= USD)", "10 USD, 5 EUR, 20 USD"))  # ['10', '20']
✍️Hands-On Exercise
  1. Capture the year and month from a date string into named groups.
  2. Use backreferences to reformat “First Last” into “Last, First”.
  3. Write a lookahead that matches a price only when followed by a currency code.
  4. Explain the difference between (abc) and (?:abc).
🧾Cheat Sheet
Token              Meaning
(…)                capturing group
(?:…)              non-capturing group
(?P<name>…)        named group
\1 / $1            backreference
(?=…) / (?!…)      lookahead / negative
(?<=…) / (?<!…)    lookbehind / negative
💬Common Interview Questions
What is a capturing group used for?

To extract or reuse part of a match: by group number or name when extracting, or through backreferences, written \1 within the pattern and \1 or $1 (depending on the regex flavor) in replacement strings.
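As a small illustration of a backreference used inside the pattern itself, here is a sketch that finds and collapses doubled words in Python. The sample sentence is made up.

```python
import re

text = "this is is a test of the the parser"

# \1 inside the pattern must match the same text group 1 just captured,
# so this only matches a word immediately repeated.
doubled = re.findall(r"\b(\w+)\s+\1\b", text)
print(doubled)    # ['is', 'the']

# In the replacement string, \1 re-inserts the captured word once.
fixed = re.sub(r"\b(\w+)\s+\1\b", r"\1", text)
print(fixed)      # this is a test of the parser
```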

What does a lookahead do?

It asserts that a position is (or isn’t) followed by a pattern without consuming those characters, so they stay out of the match itself.
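A short sketch of what "without consuming" means in practice, assuming Python's re module. The strings and the password rule are hypothetical examples.

```python
import re

s = "ran 100km"

# Without a lookahead, "km" is consumed and becomes part of the match.
consumed = re.search(r"\d+km", s)
print(consumed.group(), consumed.span())   # 100km (4, 9)

# With a lookahead, "km" is only checked; the match ends before it.
asserted = re.search(r"\d+(?=km)", s)
print(asserted.group(), asserted.span())   # 100 (4, 7)

# Because lookaheads consume nothing, several can be stacked at one
# position as independent conditions (here: a digit AND an uppercase letter).
rule = r"(?=.*\d)(?=.*[A-Z]).{8,}"
ok = re.fullmatch(rule, "Passw0rdX") is not None
bad = re.fullmatch(rule, "password1") is not None
print(ok, bad)                             # True False
```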
