You can index the shift table(s) by Unicode code point (the Horspool variant has only one table), but make them sparse, so that the tables only contain characters that actually occur in the pattern. Hash tables make for easy lookups, but you can also use a bitmask to test whether a character is in the pattern at all: one bit per code point comes to about 136 KiB.
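To make the sparse-table idea concrete, here is a minimal sketch of Horspool over code points, assuming a plain Python dict stands in for the hash table. The name `horspool_sparse` is made up for illustration; it is not from any library.

```python
def horspool_sparse(text: str, pattern: str) -> int:
    """Return the index of the first match of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Sparse bad-character table: only characters that occur in the
    # pattern (excluding its last position) get an entry. Later
    # positions overwrite earlier ones, so the rightmost occurrence wins.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Any character missing from the table shifts by the full length.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The table has at most `len(pattern) - 1` entries no matter how large the alphabet is, which is the whole point of keeping it sparse.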
Of course, you can also just reduce the Unicode pattern to bytes, so your alphabet is never larger than 256. This will run slower, but not as much as you'd think: Boyer-Moore does benefit from larger alphabets, but only to the extent that the alphabet is actually used.
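For comparison, a rough sketch of the byte-oriented version, assuming both the text and the pattern are encoded as UTF-8 first. `horspool_bytes` is again a hypothetical name, and the result is a byte offset, not a character index.

```python
def horspool_bytes(haystack: bytes, needle: bytes) -> int:
    """Return the byte offset of the first match of needle, or -1."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0
    # Dense table: the alphabet is bytes, so 256 entries always suffice.
    shift = [m] * 256
    for i in range(m - 1):
        shift[needle[i]] = m - 1 - i
    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            return pos
        pos += shift[haystack[pos + m - 1]]
    return -1


def find_utf8(text: str, pattern: str) -> int:
    """Search via UTF-8 bytes, then convert the offset to a character index."""
    data = text.encode("utf-8")
    off = horspool_bytes(data, pattern.encode("utf-8"))
    if off < 0:
        return -1
    # Count the characters that precede the matching byte offset.
    return len(data[:off].decode("utf-8"))
```

If you only need to know whether there is a match, or you work with byte offsets anyway, you can skip the conversion in `find_utf8`.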
> Of course, you can also just reduce the Unicode pattern to bytes
Ah, right. I was under the impression this was unsafe, since you could end up with spurious byte matches that are not on character boundaries. But it seems the keyword is "self-synchronizing": UTF-8 (but not UTF-16) is safe to search byte by byte.
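A small check of that difference, as I understand it. The two-character text is a contrived example chosen so that the UTF-16BE bytes of "B" appear straddling a character boundary.

```python
text = "\u4100\u4200"   # two CJK characters; "B" does not occur in it
pattern = "B"

# UTF-16BE: text is 41 00 42 00 and pattern is 00 42, so a plain byte
# search finds the pattern at odd offset 1, in the middle of a character.
utf16_hit = text.encode("utf-16-be").find(pattern.encode("utf-16-be"))

# UTF-8: every byte of a multi-byte sequence is >= 0x80, so the single
# ASCII byte 0x42 cannot match inside one of them.
utf8_hit = text.encode("utf-8").find(pattern.encode("utf-8"))

print(utf16_hit, utf8_hit)
```

With UTF-16 you would have to reject matches at odd byte offsets (or search over 16-bit code units instead of bytes); with UTF-8 no such check is needed.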