Practical guide to regex data cleaning: formulas, workflow, implementation pitfalls, and a direct execution playbook with Regex Tester.
Dirty CRM data costs real money: bounced emails, duplicate contacts, botched mail merges. Here are 5 battle-tested regex patterns that clean the most common issues. Test each one in the Regex Tester before running on production data.
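Strip everything except digits and + signs:
Pattern: [^\d+]
Replace: (empty)
Flags: g
Input: (555) 123-4567 → Output: 5551234567
Input: +1-800-555-0199 → Output: +18005550199
The character class keeps a + anywhere in the string, not only a leading one, so a stray + in the middle of a number survives this step.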
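A minimal Python sketch of the same cleanup, assuming the standard `re` module. The function name `normalize_phone` is hypothetical, and dropping any + that is not at the start is an added assumption about the desired behavior, not part of the pattern above.

```python
import re

def normalize_phone(raw: str) -> str:
    """Strip separators, keeping digits and at most one leading '+'."""
    # The article's pattern: remove anything that is not a digit or '+'.
    stripped = re.sub(r'[^\d+]', '', raw)
    # Assumption: only a '+' at the very start is meaningful, so drop the rest.
    has_leading_plus = stripped.startswith('+')
    digits = stripped.replace('+', '')
    return '+' + digits if has_leading_plus else digits

print(normalize_phone('(555) 123-4567'))   # 5551234567
print(normalize_phone('+1-800-555-0199'))  # +18005550199
```

`re.sub` replaces every match by default, so no equivalent of the `g` flag is needed in Python.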
Then format programmatically:
const digits = phone.replace(/[^\d+]/g, '');
// Format US: +1 (XXX) XXX-XXXX
Catch obviously invalid emails before they enter your CRM:
Pattern: ^[^\s@]+@[^\s@]+\.[^\s@]+$
Matches: [email protected], [email protected]
Rejects: user @example, @missing.com, no-at-sign.com
This is a practical filter, not RFC 5322 compliant. For production, pair with MX record lookup.
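As a rough illustration, the filter could be wrapped in a small Python helper. The name `is_plausible_email` and the sample address `user@example.com` are illustrative assumptions. The sketch covers only the syntax check, not the MX lookup.

```python
import re

# The article's pattern, compiled once for reuse.
EMAIL_RE = re.compile(r'^[^\s@]+@[^\s@]+\.[^\s@]+$')

def is_plausible_email(value: str) -> bool:
    """Return True if value looks like local@domain.tld (not an RFC 5322 check)."""
    # Strip surrounding whitespace first; fullmatch requires the whole string to match.
    return EMAIL_RE.fullmatch(value.strip()) is not None

for candidate in ['user@example.com', 'user @example', '@missing.com', 'no-at-sign.com']:
    print(candidate, is_plausible_email(candidate))
```

Only the first candidate passes. The other three are the rejected examples listed above.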
CSV exports and copy-paste often introduce extra whitespace:
Pattern: \s{2,}
Replace: (single space)
Flags: g
Input: John   Smith  Jr → Output: John Smith Jr
Also trim leading/trailing whitespace with ^\s+|\s+$ (flags: gm for multiline).
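Both steps can be combined in one helper. This is a sketch assuming single-line field values; the name `collapse_whitespace` is hypothetical.

```python
import re

def collapse_whitespace(value: str) -> str:
    """Collapse runs of 2+ whitespace characters, then trim both ends."""
    # Step 1: any run of two or more whitespace characters becomes one space.
    collapsed = re.sub(r'\s{2,}', ' ', value)
    # Step 2: remove leading and trailing whitespace (no MULTILINE flag here,
    # so ^ and $ anchor to the start and end of the whole string).
    return re.sub(r'^\s+|\s+$', '', collapsed)

print(collapse_whitespace('  John   Smith  Jr '))  # John Smith Jr
```

A single tab or newline between words is left untouched, because `\s{2,}` only matches runs of two or more characters.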
Pull company domains from a list of website URLs:
Pattern: https?:\/\/([^\/]+)
Capture group 1 = domain
Input: https://www.acme-corp.com/about → Group 1: www.acme-corp.com
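In Python, reading the capture group might look like the sketch below. The helper name `extract_host` is hypothetical. Forward slashes need no escaping in Python regex strings, so the pattern is written without the backslashes.

```python
import re
from typing import Optional

# Same pattern as above; group 1 is everything between "://" and the next "/".
URL_HOST_RE = re.compile(r'https?://([^/]+)')

def extract_host(url: str) -> Optional[str]:
    """Return capture group 1 of the URL pattern, or None if there is no match."""
    match = URL_HOST_RE.search(url)
    return match.group(1) if match else None

print(extract_host('https://www.acme-corp.com/about'))  # www.acme-corp.com
print(extract_host('not a url'))                        # None
```

The group captures whatever sits before the first slash, so a port such as `:8080` stays in the result. For stricter parsing, the standard library's `urllib.parse.urlsplit` may be a better fit.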
To strip www.:
Pattern: https?:\/\/(?:www\.)?([^\/]+)
CSV parsers sometimes leave stray quotes and spaces:
Pattern: ^[\s"]+|[\s"]+$
Replace: (empty)
Flags: g (apply per field)
Input: " Acme Corp " → Output: Acme Corp
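A per-field version in Python could look like this. It is a sketch, and `trim_quotes` is a hypothetical name.

```python
import re

def trim_quotes(field: str) -> str:
    """Remove stray double quotes and whitespace from both ends of one field."""
    # Either alternative may match: a run at the start, or a run at the end.
    return re.sub(r'^[\s"]+|[\s"]+$', '', field)

print(trim_quotes('" Acme Corp "'))       # Acme Corp
print(trim_quotes('Acme "Best" Corp'))    # Acme "Best" Corp
```

Quotes inside the field are left alone, but a quote that legitimately ends the value is also stripped. It is worth checking sample rows for that case.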
For escaped double-quotes inside fields:
Pattern: ""
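Replace: "
import re
def clean_crm_row(row: dict) -> dict:
    """Apply the phone, email, and whitespace cleanups to one CRM row (in place)."""
    if 'phone' in row:
        # Keep only digits and + signs.
        row['phone'] = re.sub(r'[^\d+]', '', row['phone'])
    if 'email' in row:
        # Trim and lowercase; this does not validate the address.
        row['email'] = row['email'].strip().lower()
    if 'company' in row:
        # Collapse repeated whitespace, then trim the ends.
        row['company'] = re.sub(r'\s{2,}', ' ', row['company']).strip()
    return row
Paste each pattern into the Regex Tester with sample data to verify matches and captures before applying at scale.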
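The same sample checks can also be kept as a small script and rerun whenever a pattern changes. This is a minimal sketch: the `CASES` table and `run_cases` helper are hypothetical names, and the multi-space and doubled-quote samples are illustrative inputs rather than data from a real CRM.

```python
import re

# (name, pattern, replacement, sample input, expected output)
CASES = [
    ('phone', r'[^\d+]', '', '(555) 123-4567', '5551234567'),
    ('phone with +', r'[^\d+]', '', '+1-800-555-0199', '+18005550199'),
    ('whitespace', r'\s{2,}', ' ', 'John   Smith  Jr', 'John Smith Jr'),
    ('stray quotes', r'^[\s"]+|[\s"]+$', '', '" Acme Corp "', 'Acme Corp'),
    ('escaped quotes', r'""', '"', 'Acme ""Best"" Corp', 'Acme "Best" Corp'),
]

def run_cases(cases):
    """Apply each pattern to its sample and collect any mismatches."""
    failures = []
    for name, pattern, replacement, sample, expected in cases:
        actual = re.sub(pattern, replacement, sample)
        if actual != expected:
            failures.append((name, sample, actual, expected))
    return failures

failures = run_cases(CASES)
print('all patterns behave as expected' if not failures else failures)
```

An empty failure list means every pattern produced its expected output on the samples. It says nothing about inputs the table does not cover.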
This article is reviewed by the Tools Hub editorial team for factual accuracy, practical relevance, and consistency with current product workflows.