It would scan the private photo library on your phone. There is no “public” photo library on my phone, and the default photo library contains extremely personal photos that I consider private and have not shared with anyone else, so I see no value in torturing language to pretend that this is not my private photo library. You are correct that it would only scan my private photos conditioned on a switch being turned on that would also cause those photos to be uploaded to my private cloud backup account. However this does not make that data any less private to me, and it is very different from scanning my photos in the cloud.
Since late 2022 Apple has enabled Advanced Data Protection, which encrypts all photos before they’re uploaded to cloud storage. With ADP on, my photo library is “private” not just in the common-language sense (it contains extremely personal data I have not shared with others) but also in an opinionated technical sense that these files are accessible only to me. If Apple’s CSAM scanner was deployed today, it would be scanning those photos in cleartext on your phone before unreadable data was sent up to the cloud. You could argue that Apple was making a trade: “hey, it doesn’t matter whether we can read the private data you’re storing, the price of sending even unreadable encrypted private data to our infrastructure is that you must run local software that scans the private photos on your phone,” and that’s a trade you might accept or reject on the merits. However I think it’s extremely important to say it exactly this way and not play language games. Apple was going to mandate local scanning of private photos as the cost of using their infrastructure even to store opaque private bits.
To add more detail to that, Apple's proposed CSAM scanning worked by computing a hash value for each photo on your device and comparing it to a list of known CSAM image hashes downloaded from Apple. This happened entirely on your device, aka the "client," as in "client-side scanning" (to clarify, Apple's cloud is not the client; your personal device is). If you had photos that hashed to values on the known CSAM hash list (and this isn't MD5 or some similarly broken hash algo, so that would only happen if you either engineered a hash collision or actually had CSAM content), they'd be sent over for a human to look at. It took multiple matching photos, because 1 match could well be a false positive.
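Roughly, the flow looks like this. This is a toy sketch of the matching-and-threshold idea only, not Apple's actual NeuralHash/PSI protocol: SHA-256 stands in for the real perceptual hash, and the photo list, "known" hashes, and threshold are all made up for illustration.

```python
import hashlib

def photo_hash(photo_bytes: bytes) -> str:
    # Stand-in for the real perceptual hash; SHA-256 just keeps the sketch
    # self-contained (a real perceptual hash tolerates small image edits,
    # SHA-256 does not).
    return hashlib.sha256(photo_bytes).hexdigest()

def scan_library(photos, known_hashes, threshold=2):
    # Runs entirely on the device (the "client"): hash each photo locally
    # and compare against the downloaded list of known hashes.
    matches = [p for p in photos if photo_hash(p) in known_hashes]
    # Nothing gets flagged for human review until the number of matches
    # crosses the threshold, since a single match could be a false positive.
    return matches if len(matches) >= threshold else []

# Made-up "known" list and library for illustration.
known = {photo_hash(b"known-image-1"), photo_hash(b"known-image-2")}
library = [b"vacation.jpg", b"known-image-1", b"known-image-2"]
print(scan_library(library, known))  # two matches -> crosses the threshold
```

The important structural point is in `scan_library`: the list comparison happens on the client, and the threshold gate sits between "hashes matched" and "a human ever sees anything."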
It did a great job of freaking people out when they heard about their photos getting scanned, and it could be defeated by making a 1 pixel change to any photos a pedo would hide on their phone (since any changes to the image would totally change the hash).
>could be defeated by making a 1 pixel change to any photos a pedo would hide on their phone (since any changes to the image would totally change the hash).
This isn't the way those hashes work. A 1 pixel change would still produce a hash similar enough to be matched. Maybe there are adversarial 1 pixel changes that could break the hashing, but I doubt it.
Even cropping, watermarks and other manipulations like that would still match. This is "perceptual hashing", very different from cryptographic hashing. It's basically checking if an image looks "similar enough".
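A toy average-hash sketch of the idea (nothing like Apple's actual NeuralHash, and the pixel values are invented, but it shows why a 1 pixel change doesn't break the match):

```python
def average_hash(pixels):
    # Toy "average hash": one bit per pixel, set when the pixel is brighter
    # than the image's mean brightness. Real perceptual hashes (pHash,
    # NeuralHash, ...) are fancier, but share this "fuzzy fingerprint" idea.
    avg = sum(pixels) / len(pixels)
    return [1 if p > avg else 0 for p in pixels]

def hamming(a, b):
    # Number of differing bits; a small distance means "looks similar enough".
    return sum(x != y for x, y in zip(a, b))

original = [10, 200, 30, 220, 15, 240, 25, 210, 12]  # made-up 3x3 grayscale
tweaked = list(original)
tweaked[0] += 5  # the "1 pixel change"

distance = hamming(average_hash(original), average_hash(tweaked))
print(distance)  # 0 -- the fingerprint doesn't change at all
```

Nudging one pixel barely moves the mean, so every bit of the fingerprint stays on the same side of it, whereas a cryptographic hash would flip about half its bits on the same edit.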
I believe this is why they needed multiple matches: with a single match there would have been too many false positives.
This may be too oversimplified, but imagine that in a series of CSAM images, there might be, for example, a wall or furniture or something, that could appear similar enough to a wall in one of your own photos. That's a match, off to the gulag with you!
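Back-of-the-envelope on why a threshold helps, using a simple binomial model with made-up numbers (these are not Apple's actual per-photo false-positive rate or threshold):

```python
def p_at_least(k, n, p):
    # P(at least k false matches among n photos), assuming each photo
    # independently false-matches with probability p (a simplification).
    # Binomial terms are built iteratively to avoid huge coefficients.
    term = (1 - p) ** n   # P(exactly 0 matches)
    below = term          # running P(fewer than k matches)
    for i in range(1, k):
        term *= (n - i + 1) / i * p / (1 - p)
        below += term
    return 1 - below

n, p = 10_000, 1e-6  # made up: 10k photos, 1-in-a-million per-photo rate
print(p_at_least(1, n, p))   # ~0.01: a lone false match is quite plausible
print(p_at_least(30, n, p))  # effectively zero with a threshold of 30
```

Even if walls and furniture occasionally collide with a known hash, requiring dozens of independent collisions in one library drives the account-level false-positive probability toward zero.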
At the time of the announcement in 2021 there were no encrypted photos in iCloud. There was "private photo library only on your device" and "photos shared with Apple (not private)". The scanner would not have scanned private device-only photos.
> "If Apple’s CSAM scanner was deployed today, it would be scanning those photos in cleartext on your phone before unreadable data was sent up to the cloud."
Or enabling Advanced Data Protection could have disabled the scanner; we don't know. Even if it went the way you said, you could still not use iCloud and keep private photos on your device, whereas your phrasing implies there would be no option to do that.
Apple has been working on end-to-end encrypted iCloud since at least 2018 [1]. In fact they’ve been gradually implementing it since 2015. They finally deployed ADP in 2022. It is ludicrous to believe that in 2021 they designed a client-side photo scanning system whose only conceivable purpose is to be part of an end-to-end encrypted backup system, and yet also believe that system was not intended to be turned on as part of their ongoing (and ultimately successful) encryption rollout.
I think we will have to agree to disagree about the idea that turning on cloud backup suddenly makes my private photo library “not a private photo library.”