The reason that you do not get the optional `cat` after a reluctantly quantified `.+?` is that the group is both optional and non-anchored. The engine is not forced to make that match: `.+?` stops after as few characters as possible, `(cat)?` is then satisfied by matching nothing, and the overall match succeeds before the engine ever reaches the `cat`.
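To see this, here is a minimal sketch that compares the unanchored and the anchored pattern on the same input. The class name `ReluctantDemo` and the helper `firstMatch` are made up for illustration; only `java.util.regex` is assumed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantDemo {
    // Hypothetical helper: returns {whole match, group 2} for the first match, or null if none.
    static String[] firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new String[] {m.group(), m.group(2)} : null;
    }

    public static void main(String[] args) {
        String input = "dog, cat";
        // Unanchored: .+? takes one character, (cat)? matches nothing, and the match ends there.
        String[] unanchored = firstMatch("^(dog).+?(cat)?", input);
        // Anchored: $ forces .+? to keep expanding until (cat)? can match at the end.
        String[] anchored = firstMatch("^(dog).+?(cat)?$", input);
        System.out.println("[" + unanchored[0] + "] " + unanchored[1]); // [dog,] null
        System.out.println("[" + anchored[0] + "] " + anchored[1]);     // [dog, cat] cat
    }
}
```

The whole match of the unanchored pattern is just `dog,`, so the `cat` is never even part of the matched region.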
If you anchor the `cat` at the end of the string, i.e. use `^(dog).+?(cat)?$`, you do get a match:
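    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    Pattern p = Pattern.compile("^(dog).+?(cat)?$");
    for (String s : new String[] {"dog, cat", "dog, dog, cat", "dog, dog, dog"}) {
        Matcher m = p.matcher(s);
        if (m.find()) {
            // group(2) is null if (cat) did not match
            System.out.println(m.group(1) + " " + m.group(2));
        }
    }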
This prints (demo 1):

    dog cat
    dog cat
    dog null
> Do you happen to know how to deal with it in case there’s something after cat?
You can deal with it by constructing a trickier expression that matches anything except `cat`, like this:

    ^(dog)(?:[^c]|c[^a]|ca[^t])+(cat)?
Now the `cat` can appear anywhere in the string, with no anchor needed (demo 2).
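Here is a small sketch of that pattern run against inputs that have text after the `cat`. One caveat, as far as I can tell: the alternation consumes a `c` together with the character after it, so an input like `dog, ccat` slips past it. For comparison the sketch also tries a negative-lookahead variant, `(?:(?!cat).)+`, which is my own substitution and not part of the expression above. The names `CatAnywhereDemo`, `ALTERNATION`, `LOOKAHEAD` and `cat` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CatAnywhereDemo {
    // Pattern from the answer: the alternation consumes text that does not spell "cat".
    static final Pattern ALTERNATION = Pattern.compile("^(dog)(?:[^c]|c[^a]|ca[^t])+(cat)?");
    // Swapped-in alternative (an assumption, not from the answer): check (?!cat) before each character.
    static final Pattern LOOKAHEAD = Pattern.compile("^(dog)(?:(?!cat).)+(cat)?");

    // Hypothetical helper: group 2 of the first match, or null if nothing matched or no cat was captured.
    static String cat(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"dog, cat and more", "dog, dog, dog", "dog, ccat"}) {
            System.out.println(s + " -> " + cat(ALTERNATION, s) + " / " + cat(LOOKAHEAD, s));
        }
        // dog, cat and more -> cat / cat
        // dog, dog, dog -> null / null
        // dog, ccat -> null / cat
    }
}
```

Both variants find a `cat` that is followed by more text. They only differ on inputs where a `c` directly precedes the `cat`.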