1. data that has to be retained for legal reasons can't be deleted by normal deletion processes
2. data that should be deleted didn't delete properly
3. to fix (2) they manually ran delete requests for times up to about the current date (back in 2018), relying on (1) to protect data
4. turns out somebody forgot to configure (1) for emails sent to domains belonging to Chase (at that point the merger was 18 years ago)
5. It took 1.5 years for anyone to notice.
To me it seems like number 5 is the biggest problem here. Mistakes happen, but had they noticed in time they likely would have had those messages in backups (if they don't, that's a much bigger problem). But they probably don't retain the backups for long, for the same reason they delete old emails in the first place (legal discovery).
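For anyone who wants the failure mode in steps 1 to 4 spelled out, here is a minimal sketch. Everything in it is hypothetical (the `Email` type, the `purge` function, the `.example` domains, the dates); it is not how JPMorgan's systems work, just an illustration of a bulk delete that relies on a hold list that is missing one domain.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Email:
    msg_id: str
    recipient_domain: str
    sent_on: date


def purge(emails, cutoff, held_domains):
    """Split emails into (kept, deleted).

    An email is deleted only if it was sent before `cutoff` and its
    recipient domain is not covered by a legal hold.
    """
    kept, deleted = [], []
    for email in emails:
        on_hold = email.recipient_domain in held_domains
        if email.sent_on < cutoff and not on_hold:
            deleted.append(email)
        else:
            kept.append(email)
    return kept, deleted


emails = [
    Email("a", "jpmorgan.example", date(2015, 3, 1)),
    Email("b", "chase.example", date(2015, 3, 1)),
    Email("c", "chase.example", date(2019, 1, 1)),
]

# Hypothetical misconfiguration: the hold list omits chase.example,
# so message "b" is purged even though it should have been protected.
kept, deleted = purge(emails, date(2018, 6, 1), {"jpmorgan.example"})
deleted_ids = [e.msg_id for e in deleted]
kept_ids = [e.msg_id for e in kept]
```

Running the same function as a dry run first and reviewing `deleted_ids` per domain is the kind of check that would surface a gap like this well before 1.5 years had passed.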
It seems like laws about how data gets handled are pretty much not real laws, because there is almost no enforcement. So most companies handle their data in a fairly careless way.
It might be illogical for companies to invest more into their IT infrastructure if there isn't a good reason to do so. I mean, even massive customer data leaks go basically unpunished, so how do you justify the mitigation expenses at the board meeting when shareholders are already mad about the lack of growth last quarter?
It turns out investigation and enforcement are disproportionately directed at large corporations. Companies like Alphabet and Meta have multiple teams to make sure they are handling data correctly, and yet things can still fall through the cracks.
This regulatory attention on large companies is advantageous to startups though; until a startup gets big nobody really cares about its data handling compliance.
GDPR fines are for when you get caught. Nobody cares about small companies enough to catch them.
On the other hand, big companies don't just deal with clearly defined responsibilities like GDPR; that's table stakes. They deal with random government investigations on data handling. If a company has multiple products, combining data from those products could very well suddenly become an antitrust concern. What makes a particular data handling practice an antitrust concern? Companies don't really know in advance because it's purposefully vague and meant to be worked out by the courts.
I’m not convinced companies actually keep these records so much as they retain access to inbox and outbox data and call that good enough. I think in many environments if you delete an email from your outbox it would become largely irrecoverable.
I’m curious if any IT folks can comment on what they have seen? Do you actually have log records of all messages? Or do you have something like a snapshot of all accounts at a given time?
In a "normal" operation (when things go smoothly and are well designed) yes, the objects are retained, even if deleted in primary mailboxes, explicitly for the accommodation of discovery requests and litigation holds. The rules can get more or less complex - negotiations on deals with TCV over a certain threshold have keywords and parties whose relevant emails are retained beyond normal policy, certain job classes have data retention extended beyond normal policy, keywords can trigger longer retention, information classification (either manual or automated) can extend retention periods. Software solutions have for many years streamlined the work necessary to pull this off compared to what it would have taken in an Exchange server farm 20 years ago.
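As a rough sketch of how rules like these can stack, the snippet below takes the longest retention period among all rules that match a message, with a litigation hold overriding everything. The rule tables, the year values, and the names (`Message`, `retention_years`) are made up for illustration; real products express this through their own policy configuration.

```python
from dataclasses import dataclass

DEFAULT_RETENTION_YEARS = 5  # assumed baseline policy

# Hypothetical rule tables: match value -> retention in years.
EXTENDED_KEYWORDS = {"merger": 10}
EXTENDED_JOB_CLASSES = {"trader": 7}
EXTENDED_CLASSIFICATIONS = {"confidential": 10}


@dataclass
class Message:
    subject: str
    sender_job_class: str
    classification: str = "general"
    on_litigation_hold: bool = False


def retention_years(msg):
    """Return the number of years to retain, or None for 'indefinitely'."""
    if msg.on_litigation_hold:
        return None  # a hold overrides every time-based rule
    candidates = [DEFAULT_RETENTION_YEARS]
    subject = msg.subject.lower()
    candidates += [y for kw, y in EXTENDED_KEYWORDS.items() if kw in subject]
    if msg.sender_job_class in EXTENDED_JOB_CLASSES:
        candidates.append(EXTENDED_JOB_CLASSES[msg.sender_job_class])
    if msg.classification in EXTENDED_CLASSIFICATIONS:
        candidates.append(EXTENDED_CLASSIFICATIONS[msg.classification])
    # The longest matching period wins, so rules can only extend retention.
    return max(candidates)


plain = retention_years(Message("Lunch?", "engineer"))
deal = retention_years(Message("Merger terms", "engineer"))
trader = retention_years(Message("Lunch?", "trader"))
held = retention_years(Message("Lunch?", "engineer", on_litigation_hold=True))
```

Taking the maximum means a new rule can never shorten retention for a message another rule already protects, which keeps the rules safe to add independently.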
Anyway, all these business rules are (should be) documented in formal policies on data retention, litigation hold, privacy, etc. If a company were going through a working annual audit process for SOX, or increasingly even routine annual financial audits (non-SOX), policies would be examined with spot checks for evidence that they are working - e.g. the auditor selects examples of qualifying events and asks for evidence that the given policy is in fact enabled/enforced technically. This may test retention as well as intentional deletion (e.g. we should deliberately have no data beyond five years unless an exceptional circumstance warrants it).
Office 365 offers message retention - our standard configuration sets it to a seven-year period. So you can delete from the outbox, and it goes to deleted items. If you delete from there it disappears from view, but any admin exporting your mailbox will include it for seven years.
There were “at least 12 civil securities-related regulatory investigations, eight of which were conducted by the Commission staff” [1]. No doubt, more will be filed.
For JPMorgan, sure. That missing evidence gets interpreted adversely against JPMorgan, which means those cases are basically won. The question is how many more will arise in the coming months, perhaps years, that find reasonable claim to damages arising out of evidence the defendant has conceded it can’t bring to its defense.
Will the interpretations of the missing evidence be worse or better for them, though? Seeing as it's JPM, it's very possible that whatever evidence was lost (lol) would be a lot more damaging than whatever the prosecutors will come up with.