Hacker News
A Journey Through Color Space with FFmpeg (canvatechblog.com)
77 points by todsacerdoti on April 17, 2023 | hide | past | favorite | 21 comments


"What is color space 1? And what are primaries?"

Be thankful you have the ability to set the color space and primaries within the command. Back in the days of dinosaurs (mid-2000s to 2010ish), when HD was just an infant sucking its thumb and before Apple updated the MOV container to hold these values, it was even more annoying to get colors correct. The big issue then was gamma settings: Windows used one value and Macs used another, and there were always issues getting content to look the same on either platform.
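To make that old gamma mismatch concrete, here's a minimal sketch. It assumes idealized pure power-law displays (real monitors only approximate this), with the classic nominal values of ~1.8 for old Macs and ~2.2 for Windows:

```python
# Sketch: why the same encoded pixel looked different on old Macs
# (display gamma ~1.8) vs. Windows (~2.2). Values are illustrative.

def displayed_luminance(code: int, gamma: float) -> float:
    """Linear light emitted for an 8-bit code value on a power-law display."""
    return (code / 255.0) ** gamma

mid_gray = 128
mac = displayed_luminance(mid_gray, 1.8)   # ~0.29
win = displayed_luminance(mid_gray, 2.2)   # ~0.22

print(f"code 128 -> Mac 1.8 gamma: {mac:.3f}, Windows 2.2 gamma: {win:.3f}")
```

The same encoded mid-gray comes out noticeably darker on the 2.2-gamma display, which is exactly the cross-platform mismatch people fought with.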

There were also the bad ol' days when iTunes would reject your MOVs for not having these primaries and color space values in files created by Final Cut Pro. It turns out that, within the silos of Apple, the teams do not communicate back and forth with each other. The iTunes team read the white paper on the MOV file format and wrote all of their automation expecting these values to be present. Unfortunately, the team writing FCP had not updated their exporter to use these new features of the MOV. So iTunes rejected files for not being written by an official Apple product!? Something about left hands and right hands could be written here. To the point: we had to have a 3rd-party app developed to take the files from FCP and update them to be compliant with iTunes.

I still have scars from the shrapnel of those "early" days. If only articles like this were written when the features were introduced, rather than 15+ years after the fact once the answers are a Google search away. So hopefully future me-types can find this, rather than facing the hell of needing the feature, having to get it developed, and then just trying it and hoping for the best.


> It turns out, that within the silos of Apple, they do not communicate back and forth with each other. So the iTunes team read the white paper on the MOV file format and wrote all of their automation expecting these values to be present. Unfortunately, the team writing FCP had not updated their exporter to use these new features of the MOV.

I was somewhat involved in this at Apple at the time! Your description is actually a decent account of the reality on the ground. The following may, however, have some misremembered details, so don't take it as gospel.

At the time, QuickTime was internally migrating the older Mac OS-era subsystems to newer ones under the hood. The low-level frameworks were private to QuickTime; Apple didn't ship public headers for them. Both iTMS and FCP, however, used the private headers to call those frameworks directly.

With iTMS they had a bespoke media ingest and export pipeline. They were calling QuickTime's private frameworks but doing their own thing with the raw samples between ingest and export. IIRC they had QuickTime branches/builds of their own, because their encoder fleet didn't just jump on the latest OS and QuickTime version.

FCP/Compressor were also directly using QuickTime's private frameworks but as I recall shipped their own private copies of some of them. I believe this was due to those revving with the OS rather than QuickTime's package but FCP had to support older OSes.

Even QuickTime's behavior wasn't entirely consistent. Videos captured on iPhones had an assumed color space (Rec.709, IIRC), but at the time the tags weren't written into the files. QuickTime had to recognize iPhone-originated files, play them as if they were tagged, and then actually tag them if it exported (pass-thru or otherwise) those files. For other untagged input files it assumed tags based on the frame size: SD sizes were assumed to be Rec.601 and HD sizes Rec.709. Tagged files, of course, needed no guessing and got the tagged behavior on playback and export.
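The size-based guessing described above can be sketched roughly like this (my paraphrase of the behavior, not Apple's actual code):

```python
# Sketch of the untagged-file heuristic: guess the color matrix from the
# frame size when no color tags are present in the file.

def guess_matrix(width, height, tagged=None):
    """Return the assumed color matrix for a QuickTime input file."""
    if tagged is not None:
        return tagged        # tagged files need no guessing
    if height >= 720:        # HD frame sizes were assumed Rec.709
        return "bt709"
    return "bt601"           # SD frame sizes fell back to Rec.601

print(guess_matrix(720, 480))    # NTSC DV frame size -> bt601
print(guess_matrix(1920, 1080))  # HD frame size -> bt709
```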

It's not necessarily that the three teams didn't talk to each other; their schedules just ran on wildly different cadences. While the QuickTime group was writing all the frameworks being used, they were concerned mostly with the low-level issues. They were also supporting iLife, iTunes (the app), and QuickTime Player/plugin releases, with a good portion of the group working on iOS as well. FCP and iTMS were just internal customers writing their own stuff on top of the low-level frameworks.


Maintaining backwards compatibility with FCP makes sense, but the hubris of the iTMS team was also quite funny to me. They could have handled this on their end, knowing that the Apple ecosystem was fractured regarding QT. They could have built Transporter to be much more tolerant of files created by FCP/Compressor, so they wouldn't be rejected over these differences in atoms within the container. Instead, many people had to build or use 3rd-party apps rather than it just being handled internally.

Their air of superiority also reminds me of another situation I had with them. Their original specs called for video content originating on film to use IVTC to return the frame rate to the original progressive frames, for the obvious reasons. However, they clearly were not familiar with TV/episodic content that was originally shot on film but edited at 29.97 with broken cadences, 30p graphics, and other content that was true 29.97i. There was a specific FX series where two different providers had their files rejected because of faulty IVTC processing. My company got involved, but iTMS said to send them the unprocessed 29.97 content because they "knew" how to handle it. They didn't, and then it landed in my lap. I processed the files and submitted them, and they were approved. iTMS then asked how the files were made. Very smugly, and with a righteous smile on my face, I got to quote back an answer they had provided me previously: "this is a proprietary process developed by our in-house engineers." We got a lot of work because of that.
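For anyone unfamiliar with IVTC, the 2:3 ("3:2 pulldown") cadence it tries to undo can be sketched like this. It's a toy model: real telecine works on interlaced fields of pixels, not labels, and broken cadences, 30p graphics, and true 29.97i material are exactly what defeats a naive inverse like this one:

```python
# Toy model of 3:2 pulldown and its inverse (IVTC): 4 film frames
# (A B C D) become 10 video fields, i.e. 24p -> ~30i.

def telecine(frames):
    """Expand progressive film frames to fields in a 2:3 cadence."""
    fields = []
    for i, f in enumerate(frames):
        fields += [f] * (2 if i % 2 == 0 else 3)
    return fields

def ivtc(fields):
    """Recover film frames from a clean 2:3 cadence. Broken cadences or
    true 29.97i content would defeat this simple inverse."""
    frames, i, n = [], 0, 0
    while i < len(fields):
        frames.append(fields[i])
        i += 2 if n % 2 == 0 else 3
        n += 1
    return frames

film = ["A", "B", "C", "D"]
print(telecine(film))        # ['A','A','B','B','B','C','C','D','D','D']
print(ivtc(telecine(film)))  # ['A','B','C','D']
```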

It was examples like this that make me roll my eyes when FAANG types are held in such high regard. They're just humans trying to figure stuff out with the people available to them. They might not actually know everything, just like us mere mortals, but some of us have unique experiences that allow us to tell the emperor he's not wearing any clothes when the FAANG types issue decrees from on high that are clearly just not right.


Even internally I got the same sense of the iTMS team. To this day I don't know where they got that attitude. If I were to guess, it would be that they had a ridiculous revenue-to-engineer ratio and got a lot of attention/praise at the executive level. Every time (for some approximation of "every") their pipeline ran, it generated millions of dollars in revenue. Meanwhile QuickTime Player/plugin/QuickLook was invoked likely billions of times a year and generated revenue on a tiny fraction of those invocations. I can't say for sure, but it was very annoying at times interacting with them internally. Dealing with that externally, I imagine, would be a complete pain in the ass.


This is a shot in the dark, but you seem very knowledgeable about color, video formats, and the Apple ecosystem so I thought you might have some insight into this issue I'm facing recently.

I'm trying to process HLG3 video files from my Sony camera and have spent far too many hours trying to figure out how different settings affect QuickTime, ffmpeg, and Davinci Resolve, among others. The biggest thing I can't figure out right now is why Davinci Resolve renders the HLG video brighter than QuickTime.

If I enable 10-bit precision and Mac display color profiles for the Davinci viewer, it does seem to tap into the HDR capabilities on my MacBook and allow for full dynamic range that looks similar to QuickTime's render. However, everything is brighter, especially highlights.
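For reference, the curve underneath all this is HLG as specified in ITU-R BT.2100 (from what I can tell, Sony's HLG3 picture profile is a camera-side variant, so treat this only as the standard reference shape, not Sony's exact curve):

```python
import math

# The HLG OETF from ITU-R BT.2100, using the published constants.
A = 0.17883277
B = 0.28466892
C = 0.55991073

def hlg_oetf(l: float) -> float:
    """Map normalized scene-linear light [0,1] to an HLG signal value [0,1]."""
    if l <= 1.0 / 12.0:
        return math.sqrt(3.0 * l)      # square-root segment for shadows
    return A * math.log(12.0 * l - B) + C  # log segment for highlights

print(hlg_oetf(1.0 / 12.0))  # 0.5: the square-root/log crossover point
print(hlg_oetf(1.0))         # ~1.0: peak scene light maps to full signal
```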

Do you know of any good references to better understand how HDR works on a modern Mac?


I got out of this particular job type just before HDR became widespread as a consumer product. Chasing color accuracy is one of the top 3 things I hate about computing, surpassed only by choosing fonts and printing.

I spent time in a color correction facility where we had to calibrate Sony studio reference CRTs, LCDs (and other flat-panel tech), Christie projectors, and desktop monitors. I absolutely hate Hate HATE it. The external monitors being fed by a Decklink card had to be calibrated using a dedicated instrument for each type of screen. The desktop screens had to be calibrated by another device. At the end of the day, the client was always advised to base color decisions on the calibrated display monitors and NOT the desktop monitors. Luckily, all of this was handled by the video engineering department, and I only know as much about it as I asked. Thankfully, it was never my responsibility! So, I guess I'm saying good luck with that! ;-)

Trying to dial in all of the different display devices to match is a journey into madness. There was a time when commercial content was only ever critically evaluated on broadcast-quality monitors. As web content became a larger share, people grew less concerned about broadcast-safe limits and pushed values beyond broadcast safe so they'd look good on iPhones. All of that to say: you're not alone in your pain.

Also, are you working on a PC or a Mac? If you're on a PC, I have zero experience with this on that platform. Fun fact (using air quotes): there was a bug in QT Player where the first file you opened would display with an incorrect gamma value. When making critical color decisions with that player, we would open some random file first, then open the rest of the files we were interested in. I don't know if that bug is still around or not. If you're facing brightness differences between apps, my first assumption would be gamma differences between apps.


Oh okay, thanks for the response though! Yeah, this whole space is such a mess and I'm not even trying to do something at a professional level (which is why I'm using my Mac screen and not a reference monitor).

There are a lot of blog posts and YouTube videos about matching Davinci, QuickTime, YouTube, etc. but so far I've only found answers for SDR. Native Mac screens have a true HDR mode built in you can see clearly when you play any HDR video in Photos, QuickTime, etc. The normal UI is capped to, I believe, 500 nits, and then HDR video unlocks up to 1600 nits. I can see my settings in Resolve are unlocking the higher brightness, so it should be triggering the MacOS HDR pipeline, but it's too bright, so I agree the gamma must be different somehow.


I played with it a bit more just now and I'm very close! For anyone who stumbles on this thread in the future: I found that in Resolve, if I disable automatic color management and set the output color space to "ST2084 500 nit", then the gamma seems to match QuickTime exactly (though I'm not a professional). It's still clearly going over 500 nits in the viewer, so I guess that's some sort of baseline that matches the "default 500 nits but go over for HDR" behavior that Macs have.
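For anyone curious what "ST2084" refers to: it's the SMPTE ST 2084 (PQ) transfer function, which encodes absolute luminance up to 10,000 nits. A sketch using the published BT.2100 constants (how Resolve and macOS apply it around that 500-nit setting is my speculation, not documented fact):

```python
# The SMPTE ST 2084 / PQ encoding curve, using the published constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits: float) -> float:
    """Encode absolute luminance (cd/m^2, up to 10,000) as a PQ signal [0,1]."""
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

print(round(pq_encode(100), 3))   # ~0.508: SDR reference white sits near mid-signal
print(round(pq_encode(500), 3))   # where a 500-nit target lands
print(pq_encode(10000))           # 1.0: PQ tops out at 10,000 nits
```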

The colors are still just a bit off, and I think that might be down to how QuickTime and Resolve interpret the Sony HLG3 colors, which don't exactly match the standards. I might try some of the normalization products offered here[0] later when I get time.

[0] https://xtremestuff.net/sony-and-hybrid-log-gamma-hlg/


Tone mapping from a high dynamic range to a standard dynamic range is not standardised: there are some recommendations, but each app uses a slightly different algorithm. One app might implement tone mapping for HLG, another might display it as-is on an SDR screen (HLG is designed to be watchable even on old SDR screens), and another might implement it mostly properly but ignore the current screen's color profile. Even for SDR, some apps don't implement BT.1886 yet, so you can see different results.
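To illustrate how non-standardised this is, here's one minimal tone-mapping operator (an extended Reinhard curve, purely illustrative; it is not what QuickTime, Resolve, or any particular app actually ships):

```python
# One possible HDR->SDR tone map (extended Reinhard). Every app picks its
# own operator and parameters, which is why results differ between apps.

def tonemap(nits: float, sdr_white: float = 100.0, hdr_peak: float = 1000.0) -> float:
    """Compress [0, hdr_peak] nits into a relative SDR value [0, 1]."""
    x = nits / sdr_white                 # express in SDR reference-white units
    peak = hdr_peak / sdr_white
    return x * (1.0 + x / (peak ** 2)) / (1.0 + x)

print(round(tonemap(100), 3))   # ~0.505: SDR white lands near mid-range
print(tonemap(1000))            # 1.0: the HDR peak maps to SDR max
```

Note that SDR reference white (100 nits) lands around 0.5 rather than 1.0 here, which is one reason naively displayed HDR content tends to look darker than its SDR counterpart.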


I mentioned this in a sibling comment, but I'm actually trying to edit using the HDR mode available on native Apple screens (like the 1600-nit MacBook Pro). So I don't think I'm running into any SDR mapping. It seems to be using the Mac HDR mode, but with a slightly different gamma that I don't know how to correct for.


Tone mapping is used for HDR screens too, if the screen has fewer nits than the video. Anyway, it's probably something else; you'll have more luck asking the DaVinci Resolve developers. They could be ignoring the screen color profile, or setting something wrong; there are so many things that can go wrong.


If you are sure you got everything else right, look into color space / gamma tags. The latest versions of Resolve let you provide these tags in the export settings. Sometimes you may need to use tags different from Resolve's defaults to make it render with correct gamma/color in QuickTime, Vimeo, etc.


https://www.youtube.com/watch?v=1QlnhlO6Gu8 is a great resource. IIRC, this video focuses on SDR, but this channel has many good resources.

TL;DR: for HDR content, you need to figure out _exactly_ which OS and which video player (or web browser) your content will be consumed in, and match that. It's difficult to achieve cross-platform color-correct output even in SDR, and I'm not sure it's possible for HDR.


> I like to use the mnemonics S=short, M=medium, and L=large wavelengths to remember all of this.

That's not a mnemonic, that's what they actually stand for...


You should check out VapourSynth, which makes colorspace conversion tasks like this very organized and Pythonic: https://www.vapoursynth.com/doc/functions/video/resize.html

But it also sometimes suffers from trying to parse the original colorspace, whether you use the ffmpeg reader or not.


I think it’s funny because I have gotten into red-cyan anaglyph stereograms, both for screen and print, and found that the more “advanced” hardware and techniques I use, the worse the results I get.

https://en.wikipedia.org/wiki/Anaglyph_3D

I have a high-color-gamut Dell monitor, and what it does is take (0, 255, 0) in sRGB space and turn it into something like (16, 186, 15), because its primaries are a little more saturated than the sRGB primaries. You wouldn’t want sRGB pure green turned into “hit by a green laser pointer” green, would you?

The trouble is that red got mixed in with my green, so now there is a “ghost” image in the wrong eye.
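A sketch of the remap that produces that crosstalk, working in linear light (ignoring gamma encoding) and using Adobe RGB (1998) matrices as a stand-in for the Dell's actual native gamut, which I don't know:

```python
# sRGB green -> XYZ -> the panel's (assumed Adobe RGB) native primaries.
SRGB_TO_XYZ = [[0.4124, 0.3576, 0.1805],
               [0.2126, 0.7152, 0.0722],
               [0.0193, 0.1192, 0.9505]]
ADOBE_TO_XYZ = [[0.5767, 0.1856, 0.1882],
                [0.2974, 0.6273, 0.0753],
                [0.0270, 0.0707, 0.9911]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def inverse3(m):
    """Invert a 3x3 matrix via the adjugate (no numpy needed)."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

# Linear-light pure sRGB green, remapped into the assumed native primaries:
green_xyz = mat_vec(SRGB_TO_XYZ, [0.0, 1.0, 0.0])
green_native = mat_vec(inverse3(ADOBE_TO_XYZ), green_xyz)
print([round(v, 3) for v in green_native])  # nonzero R: the crosstalk that ghosts
```

The red component comes out well above zero, and through the red filter of the glasses that's exactly the ghost.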

If I view an image in a tk canvas on Windows, tk does no color correction so I don’t see the ghosts, but if I use a modern application like Photoshop or a web browser, it does color correction and I do see them. I know I could get better results if I made the image in a high-color-gamut space, but I haven’t yet operationalized this because I’m more interested in print than screen at the moment. I was thinking pretty seriously of making a WebGL viewer that does the anaglyph processing in the browser, but that’s a non-starter if I can only output in the sRGB color space.

As for print I have the same problems: I know I get worse results if I load a color profile for the specific paper I use and have Photoshop manage colors than if I set the printer to use ICM color calibration. I’m in the middle of setting up some controlled experiments that let me measure the ghosting caused by color “correction”, which is not a “correction” at all for stereograms.


> making a WebGL viewer that does the anaglyph processing in the browser but that’s a non-starter if I can only output in the sRGB color space.

There's apparently now WebGL support for display-p3[1] on Chrome and Safari.[2] Anaglyph shaders can be straightforward[3] - I had fun playing with shallow-3D desktop UIs.

[1] https://en.wikipedia.org/wiki/DCI-P3
[2] https://caniuse.com/?search=drawingBufferColorSpace
    https://developer.mozilla.org/en-US/docs/Web/API/WebGLRender...
    https://bugzilla.mozilla.org/show_bug.cgi?id=1771373
[3] https://www.shadertoy.com/results?query=anaglyph
    https://threejs.org/examples/webgl_effects_anaglyph.html


I generally hate analogies, and I welcome being told this analogy is wrong:

You know that thing where ads on the radio "sound louder", but when you hassle the station they swear blind that the total dB on the signal as "volume" remains constant? That's what "colourspace" is to me, in sound terms: the colours add up to "white", but how you got there alters what it looks like.

That, plus people respond differently (rods and cones) to different specific intensities coming in. Total energy? same-same. Human response? Not same.
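The analogy loosely matches how luma weights behave: the Rec.601 and Rec.709 coefficients both sum to 1, so white comes out the same either way, but they weight the primaries differently (roughly tracking human response), so any saturated color takes a different path there:

```python
# Rec.601 vs Rec.709 luma weights: same "total" for white, different
# contributions per primary, so saturated colors get different luma.
REC601 = (0.299, 0.587, 0.114)
REC709 = (0.2126, 0.7152, 0.0722)

def luma(rgb, weights):
    return sum(c * w for c, w in zip(rgb, weights))

white = (1.0, 1.0, 1.0)
green = (0.0, 1.0, 0.0)
print(luma(white, REC601), luma(white, REC709))  # both ~1.0: white is white
print(luma(green, REC601), luma(green, REC709))  # 0.587 vs 0.7152: different path
```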


A color space also defines what “white” means, not just the interpretation of the component values.

For example, sRGB defines a D65 white point. This is the absolute color of the light that the monitor should emit for white pixels. It’s usually defined using the XYZ color space (or derived x/y chromaticity coordinates).

XYZ is the scientific baseline for color, but it’s not widely used for content creation. Other color spaces are designed to make color more intuitive and easier to encode. Talking about color in the XYZ space would be sort of like talking about music using absolute frequencies. It’s often more practical and flexible to say “note A” rather than “440 Hz.”
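As a small worked example of those chromaticity coordinates: recovering the D65 XYZ tristimulus values from the usually quoted (x, y) = (0.3127, 0.3290):

```python
# Convert an xyY chromaticity to XYZ tristimulus values (with Y normalized
# to 1), applied to the D65 white point used by sRGB.

def xy_to_xyz(x: float, y: float, big_y: float = 1.0):
    """Standard xyY -> XYZ conversion."""
    big_x = (x / y) * big_y
    big_z = ((1.0 - x - y) / y) * big_y
    return big_x, big_y, big_z

X, Y, Z = xy_to_xyz(0.3127, 0.3290)
print(round(X, 4), Y, round(Z, 4))  # ~0.9505 1.0 ~1.0891: the familiar D65 values
```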


I use my own ffmpeg-based video player, and since video color spaces are a gigantic mess, I don't use the Vulkan hardware for that; instead I have ffmpeg convert the video color space directly to my Vulkan framebuffer's color space. I still use the Vulkan 2D scaler with filtering, though.


This is a great article, not only in explaining a lot of aspects of color spaces, but also in highlighting just how maddening it is to use tools that deal with them, because every tool has all sorts of silent defaults and assumptions that are always different from, or incompatible with, another tool's.

And it's only getting worse now with HDR (high dynamic range) content, as opposed to regular SDR (standard dynamic range). Fun fact: on a Mac at least, if you have the same content (hit TV show or movie) in SDR (usually 8-bit 1080p) and HDR (usually 10-bit 4K), there is no video player that will play the HDR version in a way that matches the SDR version. They range from unwatchable (VLC plays HDR content impossibly dark) to watchable-but-still-darker (Infuse) to weird (IINA is normal brightness but the colors have weird tint/saturation).

How HDR is defined sounds reasonable in theory but is disastrous in practice. While SDR just maps brightness relative to whatever your display is (0 to 255 in 8 bits), HDR maps actual real-life brightness from pitch black to the brightest sun (0 to 1023 in 10 bits). So the idea is that if you shoot outdoors on the brightest day, a hypothetical screen someday should be able to display it equally bright.

Now this is disastrous in practice from a creative standpoint because if you're watching a movie, cutting from a dark indoor nighttime bedroom scene to a sunny outdoor party scene is going to hurt your eyes. It's too jarring. Movies and TV generally don't want a real-life range of brightness. Not to mention, of course, that we don't have display technologies that can achieve it anyways, or even anything close to it.

So what are TV shows and movies doing? As far as I can tell, they're essentially filming and color-correcting in SDR as they've always done, and then using some totally arbitrary formula to convert that to HDR for HDR mastering. Bright sunlight scenes are never encoded as actual sunlight brightness (thank god), the relative difference between a nighttime bedroom scene and a bright outdoor scene in HDR is the same as it is in SDR (again, thank god). The only HDR "special features" seem to be that things like VFX lightsabers, explosions, sparks, and so forth get extra-boosted brightness in HDR.

Now back to the video players on Macs: because there is no official recommendation of how to translate HDR to real screens that have traditionally only shown SDR content (e.g. a Macbook display), they all do it differently. Which sucks. Not only are they all playing HDR content badly, there isn't even a right way to do it. We're lost in the desert.

What's the solution? Somebody needs to contact Hollywood to figure out what the de-facto translation is that they're using when they release SDR and HDR versions of the same content, and then just run the formula backwards so playing my TV show that's HDR-mastered looks identical to the SDR source they also put out. (At least for regular scenes, both nighttime bedroom and outdoor party. The VFX things like lightsabers and explosions might get washed out to bright white but that's far preferable to the other 99% of the content being too dark.)



