RegEx match content unless it is inside quotes
I recently needed to match content outside quotes while leaving content inside quotes alone. For example, I have content like so:
this is content to replace, "but don't replace this string, it's \"important\"", but feel free to replace this.
The answer came from http://thisworkinglife.blogspot.com/2006/04/net-regular-expressions-finding.html, supported by http://regexadvice.com/forums/thread/36369.aspx, http://www.regular-expressions.info/lookaround.html and http://aspnet.4guysfromrolla.com/articles/022603-1.aspx
The expression is added to the beginning of any other expression. Here it is:
It means: “… and before it is an even number of double-quotes not preceded by a backslash, or no quotes at all before it …”
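As a rough illustration of the same idea, here is a minimal Python sketch. It is not the expression described above: Python's built-in re module does not allow variable-length look-behind, so this version swaps in the mirror-image look-ahead and is appended to the end of the other expression instead of the beginning. The condition becomes "after it is an even number of double-quotes not preceded by a backslash, or no quotes at all". The names OUTSIDE_QUOTES and replace_outside_quotes are made up for this example, and it assumes the quotes in the text are balanced.

```python
import re

# A run of characters that are neither a double quote nor a backslash,
# or a backslash followed by any character (so \" is skipped as a pair).
_CHUNK = r'(?:[^"\\]|\\.)*'

# Look-ahead: from this point to the end of the string there are zero or
# more PAIRS of unescaped double quotes, i.e. an even number of them.
OUTSIDE_QUOTES = rf'(?={_CHUNK}(?:"{_CHUNK}"{_CHUNK})*\Z)'


def replace_outside_quotes(pattern, repl, text):
    """Replace matches of `pattern` only where they sit outside double quotes.

    Hypothetical helper for illustration; assumes balanced double quotes.
    """
    # Wrap the caller's pattern so an alternation like a|b stays grouped.
    guarded = f'(?:{pattern})' + OUTSIDE_QUOTES
    return re.sub(guarded, repl, text, flags=re.DOTALL)


sample = r'''this is content to replace, "but don't replace this string, it's \"important\"", but feel free to replace this.'''
print(replace_outside_quotes('replace', 'CHANGE', sample))
```

With the sample string, only the first and last "replace" are outside the quoted section, so only those two are changed. The look-ahead rescans the rest of the string for every candidate match, which is fine for short strings but may be slow on very large inputs.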
If you’ve never used look-ahead or look-behind expressions, they’re incredibly cool. I understand they’re not as performant as ordinary expressions, but they definitely do the job. The nice part is that they’re not considered part of the match in any way, so there’s no need to capture the unimportant content and put it back into the modified string.
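To show what "not part of the match" buys you, here is a small Python sketch with made-up sample text. Both substitutions produce the same result, but the look-behind version never has to capture the dollar sign and write it back.

```python
import re

text = "price: $42, qty: 7"

# Look-behind: the digits must be preceded by a dollar sign, but the
# dollar sign itself is not consumed, so it survives the replacement.
with_lookbehind = re.sub(r"(?<=\$)\d+", "99", text)

# Without look-behind: the dollar sign is part of the match, so it has to
# be captured and put back into the replacement by hand.
with_capture = re.sub(r"(\$)\d+", r"\g<1>99", text)

print(with_lookbehind)
print(with_capture)
```

Only the number after the dollar sign is changed in either case; the bare 7 is left alone because nothing in front of it satisfies the condition.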
In time, I’d like to expand it to encompass single-quoted text too, but that got weird. Do I need to escape a ' inside a ' but not a "? If there’s a ' inside a " is it ok? What if it’s a ' that doesn’t have white space around it, such as “can’t” or “don’t”? Do I escape a ' with a \ or with a '? For this purpose, assuming the content in question was in double-quotes was sufficient, and it made the regex much simpler.