Surprised they don't use a hardware video encoder. Is it because the well and efficiently supported formats are all MPEG, and thus carry fairly high licensing costs on top of the hardware? Or because even efficient hardware encoders use more resources than webcams can afford? Or because inter-frame coding requires more storage, which (again) means higher costs, which (again) eat into the margin, so cheap webcam manufacturers consider it not worth the investment?
My older Logitech C920 has an on-board H.264 encoder. Newer revisions of the same model do not.
I haven't figured out why they chose to remove it, but your point about licensing cost might explain it, combined with the fact that they never advertised it much as a feature and that most of their competitors don't include "proper" video encoding either.
Unfortunately, this makes it much harder to use these as webcams on a Raspberry Pi (which even has H.264 hardware acceleration – the bottleneck is decoding the MJPEG stream from the camera, for which ffmpeg does not have hardware acceleration on the RPi).
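To make the difference concrete, here is a rough Python sketch that only builds the two ffmpeg command lines (it does not run them). The device path `/dev/video0`, the resolution, the frame rate and the output file name are placeholders. Whether a given camera actually exposes an H.264 stream, and whether your ffmpeg build includes the `h264_v4l2m2m` encoder, are assumptions you'd have to check yourself (e.g. with `v4l2-ctl --list-formats-ext` and `ffmpeg -encoders`).

```python
import shlex


def capture_cmd(input_format, device="/dev/video0", size="1280x720", fps=30):
    """Common ffmpeg arguments for reading a V4L2 camera in a given stream format.

    device, size and fps are placeholder defaults, not values from the thread.
    """
    return [
        "ffmpeg",
        "-f", "v4l2",
        "-input_format", input_format,
        "-video_size", size,
        "-framerate", str(fps),
        "-i", device,
    ]


def passthrough_h264(output="out.mkv", **kw):
    # Camera with an on-board encoder: copy the H.264 bitstream as-is,
    # so the Pi neither decodes nor encodes anything.
    return capture_cmd("h264", **kw) + ["-c:v", "copy", output]


def transcode_mjpeg(output="out.mkv", encoder="h264_v4l2m2m", **kw):
    # MJPEG-only camera: ffmpeg has to decode every JPEG frame first
    # (the bottleneck described above) before the encoder sees it.
    return capture_cmd("mjpeg", **kw) + ["-c:v", encoder, output]


if __name__ == "__main__":
    print(shlex.join(passthrough_h264()))
    print(shlex.join(transcode_mjpeg()))
```

The only difference between the two commands is the input format and the codec argument, but the first one skips the decode step entirely, which is why losing the on-board encoder hurts on a small board.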
As an alternative to ffmpeg, GStreamer provides hardware-accelerated MJPEG decoding on the Pi. I think there are bugs, though, which make it unsuitable for some use cases. Here's an example pipeline - https://forums.raspberrypi.com/viewtopic.php?p=1989575#p1989...
MJPEG is just a very simple "video" format that needs very simple and cheap electronics to work. Video encoding blocks are mostly part of bigger SoCs and come with licensing costs.
The same goes for the receiving end: decoding a stream of JPEGs is just much simpler in both CPU use and code complexity than dealing with something like H.264.
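As a rough illustration of how little code the receiving side needs, here is a minimal Python sketch that cuts a raw MJPEG byte stream into individual JPEG frames by scanning for the standard start-of-image (`FF D8`) and end-of-image (`FF D9`) markers. The function name `split_mjpeg` is made up for this example. It is deliberately naive: it assumes frames are simply concatenated and ignores corner cases such as embedded thumbnails that carry their own markers. A UVC camera read through V4L2 already hands you one frame per buffer, so this mostly applies to files or network streams.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_mjpeg(chunks):
    """Yield complete JPEG frames from an iterable of byte chunks.

    Hypothetical helper for illustration; there is no inter-frame state,
    so every yielded frame can be decoded on its own by any JPEG library.
    """
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        while True:
            start = buf.find(SOI)
            if start < 0:
                # No frame start yet: keep only the last byte, in case a
                # marker is split across two chunks.
                del buf[:-1]
                break
            end = buf.find(EOI, start + 2)
            if end < 0:
                # Frame started but not finished: drop junk before it, wait.
                del buf[:start]
                break
            yield bytes(buf[start:end + 2])
            del buf[:end + 2]
```

Compare that with H.264, where the decoder has to parse NAL units, track parameter sets and keep reference frames around before it can produce a single picture.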