> It's a problem when e.g. searching for caf◌́e doesn't find café.
That doesn't even display in my browser[1]. I tried it in GoLand[2] and it doesn't display there either, so that's the rare 0.0001% case I wouldn't really worry about: if the code has undisplayable Unicode sequences, there are bigger problems than searching.
OP is trying to explain that an accented letter in Unicode can be written in two ways: either "LATIN SMALL LETTER E" followed by "COMBINING ACUTE ACCENT" (two code points) or "LATIN SMALL LETTER E WITH ACUTE" (one code point). Both should render the same in any browser. But they don't compare equal unless you normalize both strings first (e.g. to NFC) or use a Unicode-aware comparison.
To demonstrate this, OP explicitly used "DOTTED CIRCLE" (◌) and attached the "COMBINING ACUTE ACCENT" to that, so the accent is visible on its own. Normally there would be no dotted circle.
If you're using a Mac you have an easy way to reproduce the distinction: save a file called café.txt and look at what filename it has in a directory listing. (It will look the same but, at least on HFS+, which stores names in decomposed form, it can have a different byte sequence from what you typed.)
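For anyone who wants to see it without a Mac, here is a minimal Python sketch of the two spellings of é described above. The helper name `same_text` is made up for illustration, and NFC is just one reasonable choice of normalization form; this is not meant as how any particular search tool does it.

```python
import unicodedata

# One code point: LATIN SMALL LETTER E WITH ACUTE
composed = "caf\u00e9"
# Two code points: LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT
decomposed = "cafe\u0301"


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain comparison or substring search sees different code points.
print(composed == decomposed)                 # False
print(composed in "un " + decomposed)         # False

# After normalization the two spellings compare equal.
print(same_text(composed, decomposed))        # True

# Same rendering, different lengths in UTF-8 bytes.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))  # 5 6
```

A search that normalizes both the query and the text before comparing would find either spelling.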
Why is this a problem? In 99.9999% of Java code, the UTF-8 characters aren't going to trip up an ASCII search.