Your regex, `wordA(\s*)wordB(?! wordc)`, should work as-is, assuming that it is doing what you want it to. It means: match `wordA` followed by 0 or more whitespace characters followed by `wordB`, but do not match if that is followed by a single space and `wordc`. Note the single space between `?!` and `wordc`, which means that `wordA wordB wordc` (one space before `wordc`) will not match, but `wordA wordB  wordc` (two spaces before `wordc`) will.
Here are some example matches and the associated replacement output:
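As a rough illustration, here is a minimal Python sketch of that behaviour. The sample lines and the replacement text `X` are made up for this example, and using Python's `re` module is only an assumption about the engine; yours may differ in details.

```python
import re

# The pattern under discussion: optional whitespace between the two words,
# and a negative lookahead for exactly one space followed by "wordc".
pattern = re.compile(r"wordA(\s*)wordB(?! wordc)")

# Hypothetical sample lines (not taken from the question).
samples = [
    "wordAwordB",          # no space between the words
    "wordA wordB",         # one space
    "wordA    wordB",      # several spaces
    "wordA wordB  wordc",  # two spaces before wordc: the lookahead does not block
    "wordA wordB wordc",   # one space before wordc: the lookahead blocks the match
]

# Replace every match with the placeholder "X".
results = [pattern.sub("X", line) for line in samples]

for before, after in zip(samples, results):
    print(f"{before!r} -> {after!r}")
```

Only the last sample is left untouched; in the fourth one the match is replaced and the trailing `  wordc` stays in place.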
Note that all matches are replaced no matter how many spaces separate `wordA` and `wordB`. There are a couple of other points:

- `(?! wordc)` is a negative lookahead, so you won't match lines such as `wordA wordB wordc`, which I assume is intended (and is why the last line is not matched). Currently you are relying on the literal space after `?!` to match the whitespace. You may want to be more precise and use `(?!\swordc)`. If you want the lookahead to cover more than one space before `wordc`, you can use `(?!\s*wordc)` for 0 or more spaces or `(?!\s+wordc)` for 1 or more spaces, depending on what your intention is. Of course, if you do want to match lines with `wordc` after `wordB`, then you shouldn't use a negative lookahead.
- `*` will match 0 or more spaces, so it will match `wordAwordB`. You may want to consider `+` if you want at least one space.
- `(\s*)`: the brackets indicate a capturing group. Are you capturing the whitespace to a group for a reason? If not, you could just remove the brackets, i.e. just use `\s*`.
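To compare the lookahead variants side by side, here is a small Python sketch. The helper `matches` and the test strings are hypothetical, and the capture group is dropped as suggested in the last point.

```python
import re

def matches(lookahead: str, text: str, gap: str = r"\s*") -> bool:
    """Check whether wordA + gap + wordB + lookahead matches anywhere in text."""
    return re.search("wordA" + gap + "wordB" + lookahead, text) is not None

# Hypothetical inputs with 0, 1 and 2 spaces between wordB and wordc.
inputs = ["wordA wordBwordc", "wordA wordB wordc", "wordA wordB  wordc"]

lookaheads = [r"(?! wordc)", r"(?!\swordc)", r"(?!\s*wordc)", r"(?!\s+wordc)"]

# For each lookahead, record which of the three inputs still match.
table = {la: [matches(la, text) for text in inputs] for la in lookaheads}
for la, row in table.items():
    print(f"{la:<14} {row}")

# The quantifier between the words: * allows "wordAwordB", + does not.
print(matches("", "wordAwordB", gap=r"\s*"))
print(matches("", "wordAwordB", gap=r"\s+"))
```

With these inputs, `(?! wordc)` and `(?!\swordc)` reject only the one-space case, `(?!\s+wordc)` rejects one or two spaces, and `(?!\s*wordc)` also rejects `wordBwordc` with no space at all.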
Update based on comment
Hello, the problem is not the expression but the HTML output, which is not considered as whitespace. It's a Joomla website.
Preserving your original regex, you can use:
wordA((?:\s|&nbsp;)*)wordB(?!(?:\s|&nbsp;)wordc)
The only difference is that now the regex matches whitespace OR `&nbsp;`. I also replaced the literal space before `wordc` in the lookahead with `\s`, since that is more explicit. Note, as I have already pointed out, that because of the negative lookahead `?!` the regex will not match when `wordB` is followed by a single whitespace (or `&nbsp;`) and `wordc`. If you want the lookahead to cover multiple whitespaces, see my comments above. I also preserved the capture group around the whitespace; if you don't want this, remove the brackets as already described above.
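A quick way to check the updated pattern is a Python sketch like the one below. It assumes the page source contains the literal entity `&nbsp;` rather than an already decoded non-breaking space character; the sample strings and the replacement text `X` are made up.

```python
import re

# Whitespace OR the literal HTML entity, both between the words
# and inside the negative lookahead.
pattern = re.compile(r"wordA((?:\s|&nbsp;)*)wordB(?!(?:\s|&nbsp;)wordc)")

# Hypothetical fragments of HTML output.
html_samples = [
    "wordA&nbsp;wordB",                   # entity between the words
    "wordA &nbsp; wordB",                 # mix of spaces and an entity
    "wordA&nbsp;wordB&nbsp;wordc",        # one entity before wordc: blocked
    "wordA&nbsp;wordB&nbsp;&nbsp;wordc",  # two entities before wordc: not blocked
]

html_results = [pattern.sub("X", line) for line in html_samples]

for before, after in zip(html_samples, html_results):
    print(f"{before!r} -> {after!r}")
```

If the entity has already been decoded to the character U+00A0, Python's `\s` matches it for `str` patterns, but other engines may treat it differently, so it is worth testing against the real output.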