I would say it's a matter of input vs output. Using AI to identify the best part of the image to act as input is fine (e.g. where to focus, what to base exposure/white balance on, etc). But if you treat that part of the image differently as part of a function's output, e.g. adjusting color differently on a face vs the rest of the image in post, that's where my sense of distaste kicks in. I'd rather that rely only on local color information (e.g. treating a part of the image differently because it's darker is fine).
Using facial recognition for focus is ok since the camera still faithfully captures the light entering the lens, but recognizing things in the frame to treat different parts of the image differently in post-processing is a big no-no.
If you don't let it do that, the picture will often be annoyingly dark in the foreground / blown out in the background, because a camera's dynamic range isn't anywhere near your eyes'. This is called local tone mapping / dynamic range optimization.
Sharpening also benefits from object segmentation, because you don't want the effect to bleed over into different areas; you get halos that way.
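The halo effect is easy to see even in 1-D with a toy unsharp mask (a sketch for illustration, not any particular camera's pipeline; the 1.5 strength is made up):

```python
import numpy as np

row = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])      # a hard edge
blur = np.convolve(row, np.ones(3) / 3, mode="same")  # local average
sharp = row + 1.5 * (row - blur)                      # unsharp mask

# The pixels adjacent to the edge overshoot in both directions:
# a dark band on the dark side, a bright band on the bright side.
print(sharp)
```

Those over/undershoots on each side of the edge are exactly the halos; segmentation lets you stop the kernel at an object boundary instead.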
Local tone mapping works just fine without object recognition. It's a dumb algorithm that goes over the image and normalizes pixel values relative to their surroundings to cram the higher dynamic range of the image into the lower dynamic range of the screen. Phone camera apps have had it for at least 10 years, and people did it manually with exposure bracketing ever since digital cameras became mainstream.
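Roughly something like this, as a pure-NumPy sketch (the radius/strength values are arbitrary; every pixel gets the same treatment based only on its neighborhood, no recognition involved):

```python
import numpy as np

def box_blur(img, radius):
    """Crude separable box blur, edge-padded so borders don't darken."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, out)
    return out

def local_tone_map(lum, radius=8, strength=0.6):
    """Compress each pixel toward its neighborhood mean.

    lum: luminance in [0, 1]. Large-scale contrast is flattened,
    fine detail relative to the surroundings is preserved.
    """
    local_mean = box_blur(lum, radius)
    detail = lum - local_mean                          # fine local detail
    base = 0.5 + (local_mean - 0.5) * (1 - strength)   # flatten coarse contrast
    return np.clip(base + detail, 0.0, 1.0)

# Dark foreground next to a blown-out background:
img = np.zeros((32, 32))
img[:, 16:] = 1.0
out = local_tone_map(img)   # shadows lifted, highlights pulled down
```

Real implementations use fancier edge-aware filters to avoid halos, but the principle is the same: purely local statistics, no idea what the objects are.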
Yes, but any local area algorithm benefits from segmentation because it automatically becomes "smarter". Upscaling is the most obvious one - the upscaled version of a blue object on a red background is not the average of the two colors - but it applies all over the place.
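The blue-on-red case fits in a two-pixel toy example: naive interpolation invents a color that exists in neither object.

```python
import numpy as np

# A red pixel next to a blue pixel (RGB, one-row "image").
img = np.array([[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]])

# Naive 2x bilinear upscale along the width: the new in-between
# sample is just the average of its two neighbors.
left, right = img[0, 0], img[0, 1]
between = (left + right) / 2
print(between)  # [0.5, 0.0, 0.5] -- purple, present in neither object
```

A segmentation-aware upscaler would instead decide which side of the object boundary the new pixel belongs to and keep it red or blue.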
I don't know of any implementations of this, but an interesting one would be auto white balance. It's typically done as a global slider, but if the image has multiple light sources this doesn't always look good.
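A sketch of the idea using naive gray-world white balance per region (the region split is hard-coded here; in practice the masks would have to come from segmenting the image by light source, which as said I haven't seen shipped anywhere):

```python
import numpy as np

def gray_world_gains(region):
    """Per-channel gains that make this region's average neutral gray."""
    means = region.reshape(-1, 3).mean(axis=0)
    return means.mean() / means

# Toy image: left half under a warm light, right half under a cool one.
img = np.ones((4, 8, 3)) * 0.5
img[:, :4] *= [1.4, 1.0, 0.7]   # warm (reddish) cast
img[:, 4:] *= [0.7, 1.0, 1.4]   # cool (bluish) cast

# Global AWB: one set of gains for everything. The two casts cancel
# in the average, so neither half actually gets neutralized.
global_out = img * gray_world_gains(img)

# Per-light-source AWB: each region gets its own correction.
out = img.copy()
out[:, :4] *= gray_world_gains(img[:, :4])
out[:, 4:] *= gray_world_gains(img[:, 4:])
```

With a single global slider you can only pick which light source to sacrifice; the per-region version neutralizes both.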
And actual straight-up "knowing what's in the scene" AI can help too; people shouldn't look sickly in your photos just because they're under a low-CRI yellow light. You probably want to know what color skin tones actually are.
I'm aware of all these things but there still is a 1:1 correspondence between camera sensor pixels and JPEG [luminance] pixels. Yes, color information isn't complete, but it's the right balance. Enlarge it and you aren't adding any new information. Shrink it and you're losing information.
Definitely not true unfortunately. In particular it's not true for areas colored red or blue, because there the green channel, which carries most of the luminance information, has little to no signal - that's why I mentioned them.
Lightroom and at least one current phone camera have ML Bayer demosaicing for this reason and it's visibly sharper.
(Note sharpening and enlarging are the same operation.)
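To make the red/blue point concrete, here's a toy RGGB Bayer simulation (illustrative only): in a pure-red scene with fine detail, only the R sites record anything at all, so the detail is sampled at a quarter of the sensor's resolution. That's the information smarter demosaicing tries to reconstruct.

```python
import numpy as np

# RGGB Bayer pattern: which channel each sensor site actually measures.
H, W = 8, 8
bayer_channel = np.zeros((H, W), dtype=int)  # 0=R everywhere, then:
bayer_channel[0::2, 1::2] = 1   # G on even rows, odd columns
bayer_channel[1::2, 0::2] = 1   # G on odd rows, even columns
bayer_channel[1::2, 1::2] = 2   # B on odd rows, odd columns

# A pure-red scene with fine 1-pixel luminance stripes.
scene = np.zeros((H, W, 3))
scene[:, ::2, 0] = 1.0          # bright red stripes on black

# What the sensor records: exactly one channel per site.
raw = np.take_along_axis(scene, bayer_channel[..., None], axis=2)[..., 0]

print((raw > 0).mean())  # 0.25 -- only the R sites saw anything
```

In a gray scene the G sites (half the sensor) would carry the detail; in a saturated red or blue area they read near zero, which is where the luminance resolution collapses.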
Well, if that's the case, there are two options: the photographer got exposure wrong (easy to do in high-contrast situations) or you create an HDR photo in post. More often than not though, just changing the metering, or the composition by e.g. getting rid of some very bright sky or dark foreground, allows any modern (read: post-late-90s) camera to get exposure right in P-mode. Added benefit of digital photography: you can check your shot on location, including histograms and even live histograms. Takes all of 2 minutes in the field with some practice. Of course, a smartphone camera won't do any of that.
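The exposure-bracketing route is conceptually simple, by the way; a naive merge looks something like this (a sketch, assuming linear pixel values, not any particular tool's algorithm):

```python
import numpy as np

def merge_brackets(exposures, evs):
    """Naive HDR merge: weight each bracket's pixels by how well-exposed
    they are, then average the implied scene radiances.

    exposures: images with linear values in [0, 1]
    evs: exposure compensation of each shot, in stops
    """
    num = np.zeros_like(exposures[0])
    den = np.zeros_like(exposures[0])
    for img, ev in zip(exposures, evs):
        # Trust mid-tones; distrust near-black and near-clipped pixels.
        w = 1.0 - np.abs(img - 0.5) * 2.0 + 1e-6
        radiance = img / (2.0 ** ev)    # undo the exposure difference
        num += w * radiance
        den += w
    return num / den

# Scene with a mid-tone (0.4) and a highlight (3.0) that clips at EV 0;
# the -2 EV shot retains the highlight, the merge recovers both.
scene = np.array([0.4, 3.0])
shots = [np.clip(scene * 2.0 ** ev, 0.0, 1.0) for ev in (0, -2)]
merged = merge_brackets(shots, [0, -2])
```

The merged result still needs tone mapping back down to display range, which is where the local tone mapping discussed above comes in.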
This is true for dedicated cameras, but smartphone sensors have less dynamic range and so can't naturally get good pictures in a lot of normal situations, especially mixed (indoor+outdoor) lighting.
When any form of object recognition is involved.