The Regex Replace function is missing in your Excel? This tutorial shows how to quickly add it to your workbooks, so you could use regular expressions to replace text strings in Excel.
When it comes to changing one piece of text to another, Microsoft Excel provides a number of options to choose from including the Find and Replace tool and a couple of substitution functions. Why would one want to complicate things with regexes? Because Excel's standard features can only process an exact string that you specify. To find a string that matches some pattern and replace it with something else, regular expressions are indispensable.
Excel VBA Regex Replace function
As it's generally known, built-in Excel functions do not support regular expressions. To be able to use regexes in your formulas, you need to create your own function. Luckily, the RegExp object already exists in VBA, and we are using this object in the below code:
Tips:
- If you have little experience with VBA, this guide will walk you through the process: How to insert VBA code in Excel.
- After adding the code, remember to save your file as a macro-enabled workbook (.xlsm).
RegExpReplace syntax
The RegExpReplace function searches an input string for values that match a regular expression and replaces the found matches with the text you specify.
The function accepts 5 arguments, but only the first three are required.
Where:
- Text (required) - the text string to search in.
- Pattern (required) - the regular expression to match.
- Text_replace (required) - the text to replace the matching substrings with.
- Instance_num (optional) - a serial number indicating which instance to replace. If omitted, the function will replace all found matches (default).
- Match_case (optional) - controls whether to match or ignore text case. If TRUE or omitted (default), the search is case-sensitive; if FALSE - case-insensitive.
The function works in all versions of Excel 365, Excel 2021, Excel 2019, Excel 2016, Excel 2013 and Excel 2010.
Usage notes
To ensure that your results will meet your expectations, let's take a closer look at the inner mechanics:
- By default, the function works in the Replace all mode. To substitute a specific occurrence, put a corresponding number in the instance_num argument.
- By default, the function is case-sensitive. For case-insensitive search, set the match_case argument to FALSE. Because of the VBA RegExp limitations, the classic case-insensitive pattern (?i) is not supported.
- When you supply a regex directly in a formula, remember to enclose it in double quotation marks.
- If a valid pattern is not found, the function will return the original string with no changes.
- If the regex is invalid, a #VALUE! error will occur.
Excel Regex replace examples
Assuming you've already inserted the RegExpReplace function in your workbook, let's get to more fascinating things - using regular expressions for advanced find and replace in Excel.
Regex to replace string matching a pattern
In the sample dataset below, supposing you want to hide some personal data such as social security numbers. Given that SSN is a nine-digit number in the format "000-00-0000", we are using the following regular expression to find it.
Pattern: \d{3}-\d{2}-\d{4}
For replacement, this string is used:
Replacement text: XXX-XX-XXXX
With the original string in A5, the complete formula takes this form:
=RegExpReplace(A5, "\d{3}-\d{2}-\d{4}", "XXX-XX-XXXX")
For convenience, you can enter the pattern and replacement text in separate cells and refer to those cells in your formula. If you plan to use the formula for multiple cells, remember to lock the cell addresses with the $ sign:
=RegExpReplace(A5, $A$2, $B$2)
Regex to replace number in a string
To find any single digit from 0 to 9, use \d in your regular expression. To find specific digits, use an appropriate quantifier or construct a more sophisticated regex like shown in the below examples.
Replace all numbers
To replace absolutely all numbers in a string with some character or text, use the + quantifier, which says to search for numbers containing 1 or more digits.
Pattern: \d+
For example, to replace all numbers in cell A5 with an asterisk, use this formula:
=RegExpReplace(A5, "\d+", "*")
Replace amounts of money
In the same dataset, suppose you wish to replace only the amounts of money and not all the numbers. To have it done, you search for the dollar sign followed by one or more digits \$\d+ - this part matches the dollar unit. After the main unit, there may or may not be a fractional unit. To match it, you look for zero or one period after which come from 0 to 2 digits \.?\d{0,2}. The word boundary \b in the end ensures that the matching value is not part of a bigger number.
Pattern: \$\d+\.?\d{0,2}\b
Serve this regular expression to our custom function and you'll get the follwoing result:
=RegExpReplace(A5, "\$\d+\.?\d{0,2}\b", "*")
Regex to find and replace all matches
In classic regular expressions, there is the global search flag /g that forces a regex to find all possible matches in a string. In VBA, this flag is not supported. Instead, the VBA RegExp object provides the Global property that defines whether to search for all occurrences or only the first one. In the code of our function, the Global property is set to True, meaning the pattern should be tested against all possible matches in a string.
So, what do you do to replace all occurrences matching a pattern? Nothing special. This behavior is implemented by default.
Let's say you manage an Excel file with personal data. Some information is confidential, so before sharing this file with your colleagues, you want to replace sensitive info such as social security numbers (SSN) and individual taxpayer identification numbers (ITIN) with "CONF".
Given that both SSN and ITIN have the same format, you can find them using the below regular expression.
Pattern: \d{3}-\d{2}-\d{4}
Since the 4th instance_num argument of our function defaults to TRUE, you can safely omit it:
=RegExpReplace(A5, "\d{3}-\d{2}-\d{4}", "CONF")
As the result, all substrings matching the regex are replaced with the specified text:
Regex to replace a specific instance
To replace just one occurrence matching a given pattern, define the corresponding number in the instance_num argument.
In the above example, supposing you wish to replace only social security numbers. In all cells, SSN is listed first, so we are replacing the 1st instance:
=RegExpReplace(A5, $A$2, $B$2, 1)
Where $A$2 is the pattern and $B$2 is the replacement text.
Search and replace in Excel with Regex Tools
Have you heard about the recent addition to our Ultimate Suite? I mean four powerful Regex Tools introduced in the latest version. Compared to their VBA counterparts, the Ablebits Regex functions have two main advantages:
- You can use regular expressions in normal .xlsx workbooks without having to insert any code or enable macros in your Excel.
- Processed by the standard .NET RegEx engine, these functions support full-featured classic regular expressions and overcome many limitations of VBA RegExp.
For more information, please see AblebitsRegexReplace function.
Below, you will find an example of Regex Replace that cannot be done with VBA.
How to replace strings using regular expressions
Let's say you aim to replace text in square brackets with some character or string. The task can be accomplished with either a capturing group or positive look-arounds. The former is an easier way, so we'll stick to it.
Generally, to find a character that is "special" in regular expressions, you need to put a backslash right before it. In regexes, \ acts as an escape character that cancels the special meaning of the following character and turns it into a literal character. So, to find a bracket, you prefix it with a backslash: \[ to match an opening bracket and \] to match a closing bracket. Between the brackets, place a (capturing group). Depending on which quantifier you use in the group, different replacements will be made.
Greedy pattern: \[(.*)\]
This pattern performs a greedy search - consumes as much as possible. The .* expression matches as many characters as possible, and then tries to match ]. So, this pattern captures everything from the first opening bracket to the last closing bracket.
Lazy pattern: \[(.*?)\]
This pattern performs a lazy search - consumes as little as possible. The question mark ? forces .* to match as few characters as possible until it finds the next match in the pattern, which is ]. So, this pattern captures everything from the first opening bracket to the first closing bracket.
Negated class: \[([^\]]*)\]
One more way to match a substring in square brackets is to use a negation operator (^) inside the capturing group. From the first opening bracket, this pattern captures any characters other than a closing bracket, until it finds the first closing bracket. The result will be the same as with the lazy search.
Once you've decided on the regex, go "feed" it to the Ablebits Regex Replace function. Here's how:
- On the Ablebits Data tab, in the Text group, click Regex Tools.
- On the Regex Tools pane, configure these settings:
- Select the source strings.
- Enter your pattern.
- Choose the Replace with option and type the replacement text in the box.
- To have the results as formulas, not values, select the Insert as a formula check box.
- Click the Replace button.
At a moment's notice, the AblebitsRegexReplace function is inserted in a new column to the right of your original data.
=AblebitsRegexReplace(A2, "\[(.*)\]", "*")
The pattern and replacement text you specify go to the 2nd and 3rd arguments, respectively. Alternatively, you can keep them in separate cells, and supply the corresponding cell references. In our case, we enter the patterns in cells B2 and C2, type the replacement value in A2, then compare the results and choose the desired outcome:
Tips and notes:
- The AblebitsRegexReplace function finds and replaces all matches.
- The function can be inserted directly in a cell via the standard Insert Function dialog box, where it is categorized under AblebitsUDFs.
That's how to find and replace text strings in Excel using regular expressions. I thank you for reading and look forward to seeing you on our blog next week!
Available downloads
Excel Regex Replace examples (.xlsm file)
Ultimate Suite 14-day fully-functional version (.exe file)
7 comments
The issue with the compile error is that the "text_result" line has & instead of an ampersand by itself.
text_result = Left(text, pos_start - 1) & Replace(text, text_find, text_replace, pos_start, 1, vbBinaryCompare)
should be
text_result = Left(text, pos_start - 1) & Replace(text, text_find, text_replace, pos_start, 1, vbBinaryCompare)
Hi!
Explain your problem. Why did you write two identical codes?
I believe he means there is " &" instead of "&" in the text on this page.
The "&" causes a syntax error (I just stumbled upon it myself).
Replacing by "&" fixed it.
Felix
I realise the web form edited out the error part. this is also why
> Why did you write two identical codes?
I will write as plain :
I believe he means there is AMPERSAND A M P SEMICOLON instead of AMPERSAND.
THE AMPERSAND A M P SEMICOLON causes a syntax error (I just stumbled upon it myself).
Replacing by just AMPERSAND in the line fixed it.
I think the text box here replaces A M P SEMICOLON by nothing as well ~~ something to look into.
Felix
For Text result "amp;" created a compile error for me.
I think you might be missing
Dim regex, matches As Object
in RegExpReplace().
Thanks again for all the great info you provide! It's a valuable resource!
Hi Carl,
Thank you for your feedback! We are happy to hear you find our new Regex functions helpful.
Unfortunately, VBA is beyond my area of expertise. The function's code was written by Alex, our Excel guru. He says you are absolutely right - it's recommended to declare all variables, especially in large codes. In this particular case, aiming to make the code as compact as possible, he omitted these declarations because the Set statement always assigns an object reference to a variable. I hope this makes sense :)