One of the most fascinating breach analyses I've ever read.
Reading between the lines, I sense the client didn't 100% trust Mr. Bogdanov in the beginning, and certainly knew there was exfiltration of some kind. Perhaps they had done a quick check of the same stats they guided the author toward. "Check for extra bits" seems like a great place to start if you don't know exactly what you're looking for.
Their front-end architecture seemed quite locked down and security-conscious: just a kernel + Go binary running as init, plain ol' NFS for config files, firewalls everywhere, bastion hosts for internal networks, etc. So already the client must have suspected the attack was of significant sophistication. Who was better equipped to do this than their brilliant annual security consultant?
Which is completely understandable to me, as this hack is of such unbelievable sophistication that it resembles a Neal Stephenson plot. Since the author did not actually commit the crime, and in fact is a brilliant security researcher, everything worked out.
> So already the client must have suspected the attack was of significant sophistication. Who was better equipped to do this than their brilliant annual security consultant?
If you suspected your security consultant, what would be the point of slipping them tiny hints about what you've found? If they're the source of the intrusion, they already know. If they're not the source of the intrusion, why fear them when you've already been compromised? Also, if you suspected the consultant, why hire them to do the security review?
I suspect the real reason is probably simpler: they have strong personal or financial incentives to "not have known" about the intrusion before the researcher discovered it.
I agree there's nothing to rule out your theory. Likely we will never know. But then why authorize sharing the story?
Specifically I don't think the owner thought it was likely, just a concern he couldn't shake. Probably he relaxed as soon as the consultant didn't make excuses, and tackled the job—extracting the binary from an unlinked inode is definitely not showing reluctance. Pure speculation, of course.
Hi, author of the blog post. This is correct - keeping PII protected has always been their concern, but recent breaches in theirs and other industries (including some they heard of that were not publicized) made them even more concerned.
I don't know if this is realistic in any way, but I've seen lots of Murder, She Wrote episodes where the criminal only gets caught because they become involved in the investigation some way and accidentally reveal knowledge that only the attacker could possibly know. This strategy necessitates hiding secret information so it can be revealed later by the attacker.
> This is hardly reducing the attack surface compared to a good distro with the usual userspace.
Run `tcpdump -n 'tcp and port 80'` on your frontend host and you'll still see PHP exploit attempts from 15 years ago. Not every ghost who knocks is an APT. A singleton Go binary running on a Linux kernel with no local storage is objectively a smaller attack surface than a service running in a container with /bin/sh, running on a vhost with a full OS, running on a physical host with thousands of sleeping VMs—the state of many, many websites and APIs today.
No, you have to understand what is really part of the attack surface and what the attacker wants.
For example, on a properly built system with a single application running with its own user the attacker might have no practical benefit at all in doing a privilege escalation to root.
> running on a physical host with thousands of sleeping VMs
This is a strawman. A shared hypervisor opens another attack surface and was not part of the discussion.
Look, my friend, we will have to disagree on this. What exploits will attack this setup from the front? A Linux networking or other syscall RCE, a Go compiler intrinsic RCE, a vulnerability in the app code, or a vulnerability in a third party library. All of which exist in the common OS-hosted scenario, in addition to everything else, plus you have both your container and your OS to worry about (e.g. openssl).
EDIT: Anyway, I'd like to thank Mr. Bogdanov and his client for sharing this story—it's just fascinating.
Sounds like a pretty nice way to get around having to constantly patch minor CVEs in base OS/distributions to maintain compliance - cut out the OS entirely.
No, it's not. You can deploy a very minimal Linux while also keeping the services that are actually good for security, like logging, IDS/IPS, certification compliance tooling, monitoring.
Unless you are running unnecessary daemons exposed on the Internet, 99% of the attack surface is from your application and the kernel itself.
Superb work. The "who" of attribution is more likely related to the actual PII they were after than any signature you'll get in the code. Seems like a lot of effort and risk of their malware being discovered for PII, rather than using it as an injection point into those users' machines. I rarely hear security people talk about why a system was targeted, and once you have that, you can know what to look for, inject canaries to test, etc.
It should be pretty easy for someone to differentiate between the Chinese people and the Chinese government.
Meanwhile, can you prove that this "innate xenophobia" is present in every human to an extent that it's actually relevant, and that this particular instance of suggesting that the malware is Chinese in origin meaningfully exacerbates it?
Moreover, China is a geopolitical rival to the United States, India, and other countries that constitute a majority of HN readers. Information like this is interesting from that viewpoint.
Threat modelling to develop useful risk mitigation requires that system owners essentially do a means/motive/opportunity test on the valuable data they have. The motive piece includes nation states as actors, and that matters in terms of how much recourse you are going to have against an attacker.
However, I'd propose a new convention that any unattributed attacks and example threat scenarios of nation states should use Canada as the default threat actor, because nobody would believe it or be offended.
Lol, no, s/he likely wouldn't, but s/he'll argue it's different because Trump didn't make any negative statements about them, so it's impossible to be xenophobic against them.
To prove my point, s/he had no problem with the top-level comment from 6 hrs ago: "mossad gonna mossad".
Ah, the value is in saying "the thing you're saying reads like this to an uninformed person". If the interpretation is correct, it reinforces the communication style chosen. If the interpretation is incorrect and the writer is aiming at this audience it is evidence against.
For instance, sometimes I say something to a friend and they misunderstand what I intend. The feedback on the misunderstanding permits me to recalibrate my communication and it helps them receive the right information.
I am not claiming that it was "the Chinese". I'm claiming that saying "Chinese APT" reads to me like this.
I guess it depends on when we talk about it, but it certainly matters whether it is the janitor / secret hacker in the building or someone somewhere against whom you have no legal recourse.
Conspiracy theory: the fact the POC insisted on the writer checking out the traffic suggests they knew about (or were suspicious of) the fact that PII was being leaked.
Probably, but is that a conspiracy theory so much as an insurance policy? Being able to competently complete that sort of nightmare investigation is probably why the investigator was re-hired annually.
A packet capture of the config files would show something was up to anyone suspicious, but knowing what to do about it is a completely different story.
The 'conspiracy' part of my conspiracy theory is not that they hired a security consultant, but that they explicitly guided him to the exact hardware[1] with the correct metric to detect it[2] asking him to test for a surprisingly accurate hypothetical[3], even going so far as to temporarily deny the suggestion of the person they're paying to do this work[4]. This is weirdly specific assuming they had no knowledge of the compromise.
Of course, I have no non-circumstantial evidence and this could all be a coincidence, which is why my comment is prefixed with "conspiracy theory".
1: "However, he asked me to first look at their cluster of reverse gateways / load balancers"
2: Active analysis would likely have been less effective at finding the issue, given the self-destruct feature
3: "Specifically he wanted to know if I could develop a methodology for testing if an attacker has gained access to the gateways and is trying to access PII"
4: "I couldn't SSH into the host (no SSH), so I figured we will have to add some kind of instrumentation to the GO app. Klaus still insisted I start by looking at the traffic before (red) and after the GW (green)"
Perhaps "the guy responsible for building the kernel" noticed his laptop was compromised. Then they'd know of a theoretical possibility of a compromise.
Not wanting to instrument the Go app could be an operational concern.
It sounded to me like they had a suspicion and specifically wanted the contractor to use his expertise in a limited way that would catch if the suspicion was right.
Perhaps they had noticed the programs restarting and, when trying to debug, triggered it.
#4 is a reasonable request. If the client wants to verify the lower-level ops instead of the higher-level application and deployment, the instrumentation would be counterproductive. That could happen if he was thinking something along the lines of "there's a guy here who compiles his own kernel on a personal laptop, I wonder what impact this has".
The other ones could be explained by him being afraid of leaking PII, and most PII being on that system.
Yes. That would be the person within the org handling the organization's relationship with the contractor, setting up their access, answering questions, guiding, propagating results, etc.
I'm trying to find the lesson in here about how to prevent this kind of incident in the first place. The nearest I can find is: don't build any production binaries on your personal machine.
Reproducible builds can go a long way, along with a diverse set of build servers which are automatically compared. Whether you use your personal machine or a CI system there's still the risk of it being compromised (though your personal machine is probably at a little more risk of that since personal machines tend to have a lot more software running on them than CI systems or production machines).
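To make "automatically compared" concrete, here's a minimal sketch, assuming each independent build server drops its artifact under artifacts/<builder>/app (the paths and names are placeholders, not anyone's actual setup):

```python
# compare_builds.py - a hedged sketch, not a drop-in tool.
# Assumes each independent build server has produced the same artifact
# (e.g. a Go binary built with -trimpath and a pinned toolchain) and
# copied it to artifacts/<builder-name>/app. All names are illustrative.
import hashlib
import sys
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def main() -> int:
    digests = {
        builder.name: sha256(builder / "app")
        for builder in Path("artifacts").iterdir()
        if (builder / "app").is_file()
    }
    for name, digest in sorted(digests.items()):
        print(f"{name}: {digest}")
    if len(set(digests.values())) != 1:
        print("MISMATCH: builds are not reproducible or a builder is compromised")
        return 1
    print("all builders agree")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

With genuinely reproducible builds, any mismatch means either the build isn't reproducible yet or one of the builders has been tampered with.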
I'm paranoid, and I'd have considered the efforts described here to be pretty secure. I'll say the only counter to this grade of threat is constant monitoring, by a varied crew of attentive, inventive, and interested people. Even then, there's probably going to be a lot of luck needed.
The kind of eyes that can spot the hinky pattern while watching that monitor are the vital ingredient, and that's not something I can quantify. Or even articulate well.
I think you may have misinterpreted that part of the post - my understanding is that the Linux laptop that was being used was compromised, and there was a 3 month gap when that developer switched to a Windows machine before that became compromised too. Specifically it would be fascinating to learn whether the Windows host was compromised or if it was only the Linux VM.
> The developer machine was compromised in a deeper level (rootkit?)
Unlikely; that would not have taken 3 months.
> The developer installs a particular application in each Linux box
Possible, but also unlikely; as long as the VM wasn't used for other things, this also wouldn't have taken 3 months.
> The developer installs a particular application in each Linux box
There probably is, but it probably has nothing to do with this exploit. For the same reasons as mentioned above.
My guess is that it was a targeted attack against that developer, and there is a good chance the first and second attacks used different attack vectors, hence the 3-month gap.
My guess would be persistence in other parts of their network used to get the credentials of that developer in some way. Perhaps some internal webapp; perhaps credential reuse with some other system; perhaps malware installed in some development tool or script that the developer would pull from some other company system and run on their machine. Perhaps even phishing, which is much more likely to succeed if you have compromised some actual coworkers' machine and can send the malware through whatever messaging system you use internally.
(Assuming that the system on itself is designed with security in mind.)
The reasons are manifold but include:
- attacks against developer systems are often not considered, or considered less, in security planning
- many of the techniques you can use to harden a server conflict with development workflows
- there are a lot of tools you likely run on dev systems which add a large (supply-chain) attack surface (you can avoid this by always running everything in a container, including your language server and the core of your IDE's auto-completion features).
Some examples:
- docker group membership giving pseudo-root access
- the dev user has sudo rights, so a keylogger can gain root access
- build scripts of more or less any build tool (e.g. npm, maven plugins, etc.)
- locking down code execution on writable hard drives is not feasible (or is bypassed by python, node, java, bash).
- various selinux options messing up dev or debug tools
- various kernel hardening flags preventing certain debugging tools/approaches
- preventing LD_PRELOAD breaks applications and/or test suites
I think a big difference between build machines and dev machines, at least in principle, is that you can lock down the network access of the build machine, whereas developers are going to want to access arbitrary sites on the internet.
A build machine may need to download software dependencies, but ideally those would come from an internal mirror/cache of packages, which should be not just more secure but also quicker and more resilient to network failures.
Interestingly, this is grist for the mill of things we are currently thinking about. We're in the process of scaling up security and compliance procedures, so we have a lot of things on the table, like segregation of duties, privileged access workstations, and build and approval processes.
The path with the least overall headaches is to fully de-privilege all systems humans have access to during regular, non-emergency situations. One of those principles would be that software compiled on a workstation is automatically disqualified from deployment, and no human should even be able to push something into a repository the infra can deploy from.
Maybe I should even push container-based builds further and put up a possible project to just destroy and rebuild CI workers every 24 hours. But that will make a lot of build engineers sad.
Do note that "least headaches" does not mean "easy".
This is why I always insist on branches being protected at the VCS server level so that no code can sneak in without other's approval - the idea is that even if your machine is compromised, the worst it can do is commit malicious code to a branch and open a PR where it'll get caught during code review, as opposed to sneakily (force?) pushing itself to master.
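For GitHub-hosted repos, that server-side protection can also be scripted against the branch-protection REST endpoint; a rough sketch (owner, repo, and token are placeholders, and the field names should be double-checked against GitHub's current docs rather than trusted from memory):

```python
# protect_branch.py - illustrative only; field names follow GitHub's
# branch-protection REST endpoint as I recall it, so verify against
# the current API docs before relying on this.
import os
import requests

OWNER, REPO, BRANCH = "example-org", "example-repo", "master"  # placeholders

resp = requests.put(
    f"https://api.github.com/repos/{OWNER}/{REPO}/branches/{BRANCH}/protection",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "required_pull_request_reviews": {"required_approving_review_count": 1},
        "enforce_admins": True,            # no bypassing review, even for admins
        "required_status_checks": None,
        "restrictions": None,
        "allow_force_pushes": False,       # a compromised laptop can't rewrite history
    },
    timeout=30,
)
resp.raise_for_status()
print("branch protection applied")
```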
If you use cloud services that offer automated builds you can push the trust onto the provider by building things in a standard (docker/ami) image with scripts in the same repository as the code, cloned directly to the build environment.
If you roll your own build environment then automate the build process for it and recreate it from scratch fairly often. Reinstall the OS from a trusted image, only install the build tools, generate new ssh keys that only belong to the build environment each time, and if the build is automated enough just delete the ssh keys after it's running. Rebuild it again if you need access for some reason. Don't run anything but the builds on the build machines to reduce the attack surface, and make it as self contained as possible, e.g. pull from git, build, sign, upload to a repository. The repository should only have write access from the build server.
Verify signatures before installing/running binaries.
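A sketch of that pull-build-sign-upload flow, assuming a Go service like the one in the article; the repo URL, commit, key ID, package path, and artifact host below are all invented placeholders:

```python
# build_and_sign.py - hedged sketch of an ephemeral build-box workflow.
# Everything here (URLs, commit, key ID, package path, artifact host) is
# a placeholder; the point is the shape: pin a commit, build clean, sign,
# push to the artifact repo, throw the box away.
import subprocess
import tempfile
from pathlib import Path

REPO = "git@example.internal:frontend/gateway.git"   # placeholder
COMMIT = "0123456789abcdef0123456789abcdef01234567"  # pin an exact commit
SIGNING_KEY = "build@example.internal"               # key living only on the build box

def run(*cmd, cwd=None):
    print("+", " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

with tempfile.TemporaryDirectory() as workdir:
    src = Path(workdir) / "src"
    out = Path(workdir) / "gateway"
    run("git", "clone", REPO, str(src))
    run("git", "checkout", COMMIT, cwd=src)
    # Reproducible-ish Go build: strip paths; the toolchain is pinned in the image.
    run("go", "build", "-trimpath", "-o", str(out), "./cmd/gateway", cwd=src)
    run("gpg", "--batch", "--yes", "--local-user", SIGNING_KEY,
        "--detach-sign", str(out))
    # Upload binary + signature; only this box has write access to the repo.
    run("rsync", str(out), str(out) + ".sig",
        "artifacts@repo.example.internal:/releases/")
```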
> If you use cloud services that offer automated builds you can push the trust onto the provider by building things in a standard (docker/ami) image with scripts in the same repository as the code, cloned directly to the build environment.
And I guess, for those super-critical builds, don't rely on anything but the distro repos or upstream downloads for tooling?
Because if you deploy your own build tools from your own infra, you risk tainting the chain of trust with binaries from your own tainted infra again. I'm aware of the trusting-trust issue, but compromising the signed gcc copy in Debian's repositories would be much harder than compromising some copy of a proprietary compiler in my own (possibly compromised) binary repository.
> And I guess, for those super-critical builds, don't rely on anything but the distro repos or upstream downloads for tooling?
You can build more tooling by building it in the trusted build environment using trusted tools. Not everything has to be a distro package, but the provenance of each binary needs to be verifiable. That can include building your own custom tools from a particular commit hash that you trust.
I believe the commenter meant that only the build server should be able to write to the build artifact repository, so ”write access from the build server” would be correct.
That cannot be the right lesson, because there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around. Yes, in practice it often is less secure to a degree, so it's not a useless rule, but it's not a solution either.
If it's solved some way, it's by reproducible builds and automatic binary verification. People are doing a lot of work on the first, but I think we'll need both.
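On the verification half, even a crude gate helps; a sketch of checking a detached signature before deploying (gpg invocation only, fingerprint pinning left as a comment):

```python
# verify_before_deploy.py - sketch of the "automatic binary verification"
# half: refuse to deploy anything whose detached signature doesn't verify.
# A real version should also pin the expected signing-key fingerprint
# (e.g. by parsing gpg's --status-fd output); this only checks that some
# key in the local keyring signed it.
import subprocess
import sys

def verified(binary: str, signature: str) -> bool:
    # gpg exits non-zero if the signature is bad or the key is unknown.
    result = subprocess.run(["gpg", "--verify", signature, binary])
    return result.returncode == 0

if __name__ == "__main__":
    binary, signature = sys.argv[1], sys.argv[2]
    if not verified(binary, signature):
        sys.exit(f"refusing to deploy {binary}: signature check failed")
    print(f"{binary}: signature OK")
```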
> there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around
Sure there is! I browse the internet a lot on my dev machines, and this exposes me to bugs in browsers and document viewers. And if I do get compromised, my desktop is so complex and runs so many services that the compromise is unlikely to be detected. So all an attacker needs is one zero day, once.
Compare this to a CI with infra-as-code, like Github Actions. If the build process gets compromised, it only matters until the next re-build. Even if you get hit by a supply-chain attack once (for example), once it is discovered all the attacker's footholds disappear! And even if they get the developers' keys, it is not easy to persist -- they have to make commits, and those can be noticed and undone.
(Of course if your "building cluster" is a bunch of traditional machines which are never reformatted and which many developers have root access to, then they are not that much more secure. But you don't have to do it that way.)
You rebuild your build cluster with what image? Where do the binaries there come from? And what machine rebuilds the machines?
Securing the machines themselves is a process of adding up always decreasing marginal gains until you say "enough", but the asymptote is never towards a fully secure cluster. That ceiling on how secure you can get is clearly suboptimal.
Besides, the ops people's personal machines have a bunch of high access permissions that can permanently destroy any security you can invent. That isn't any less true if your ops people work for Microsoft instead of you.
I mentioned "github actions" for a reason. You give up lots of control when you use them. In exchange, you get "crowd immunity" -- the hope that if there is a vulnerability, it will affect so many people that (1) you are not going to be the easiest target and (2) someone, somewhere will notice it.
Your build actions happen all in the docker images/ephemeral VMs. You use images directly distributed by the corresponding project, for example you may start directly from Canonical's Ubuntu image. The "runners" are provided by Github, and managed by Microsoft's security team as well. The only thing that you actually control is a 50-line YAML file in your git repo, and people will look at it any time they want to add a new feature.
Yes, if someone hacks Microsoft's ops people, they can totally mess up my day. But would they? Every use of a zero-day carries some risk, so if attackers do get access to those systems, they're much more likely to go for some sort of high-value, easy-money target like a cryptocurrency exchange. Plus, I am pretty sure that Microsoft actually has solid security practices, like automatic deployments, 2FA everywhere, logging, auditing, etc... They are not going to have a file on a CI/CD machine that differs from the one in Git, like OP's system did!
The APTs do not have magical powers; they buy from the same exploit market as everyone else.
Let's say my organization (which is not very well known) has an exploitable bug. What are the chances that someone will discover it? Pretty close to none; the hole can sit there for many years waiting for an APT to come and exploit it.
Now imagine the Github runner or the default Ubuntu image has an exploitable bug. What are the chances it will last long? Not very high. In a few months, someone will discover it and either report or exploit it. Then it will be fixed and no longer helpful for APT threat actors.
Remember, the situation described in the post only occurred because they used binary images that only a few people could look at. Generating a kernel binary on someone's laptop is easy to subvert in an undetectable way, but how do you subvert a Dockerfile stored in a Git repo without it being obvious?
Use a PaaS like Heroku or Google App Engine, with builds deployed from CI. All the infrastructure-level attack surface is defended by professionals who at least have a fighting chance.
I feel reasonably competent at defending my code from attackers. The stuff that runs underneath it, no way.
If you double your defences, you double the cost, but advanced attackers still get what they want. "Ratchet up defences" doesn't mean simply doing things a bit more correctly; it requires you to hire many expensive people to do lots of things that you didn't do before. This article is a good example - the company as described seems to have a very good (and expensive) level of security already, the vast majority of companies are much less secure, and it still wasn't sufficient.
And if you increase your defences so much that you're actually somewhat protected from an advanced attacker, you're very, very far on the security vs usability tradeoff, to get there is an organization-wide effort that (unlike simple security basics/best practices) makes doing things more difficult and slows down your business. You do it only if you really have to, which is not the case for most organizations - as we can see from major breaches e.g. SolarWinds, the actual consequences of getting your systems owned are not that large, companies get some bad PR and some costs, but it seems that prevention would cost more and still would be likely to fail against a sufficiently determined attacker.
Build everything on a secured CI/CD system, keep things patched, monitor traffic egress especially with PII, manual review of code changes, especially for sensitive things
This is truly the stuff of nightmares, and I'm definitely going to review our CI/CD infrastructure with this in mind. I'm eagerly awaiting learning what the initial attack vector was.
If people didn't allow macros in Excel, stayed in read-only mode in Word and only opened sandboxed PDFs (convert to images in sandbox, OCR result, stitch back together), we would see a sharp decline in successful breaches. But that would be boring.
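For the PDF part, a rough sketch of the convert-to-images idea, assuming poppler's pdftoppm and the img2pdf package are available (the OCR step is left out, and in practice the rendering would run inside a throwaway sandbox or container):

```python
# flatten_pdf.py - hedged sketch: re-render a PDF as plain raster pages so
# no scripting, embedded files, or exotic PDF features survive.
# Assumes pdftoppm (poppler-utils) and the img2pdf package are installed;
# run the rendering step inside a disposable sandbox in real life.
import subprocess
import sys
import tempfile
from pathlib import Path

import img2pdf

def flatten(untrusted_pdf: str, safe_pdf: str) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        # Render every page to a PNG at 150 dpi; the renderer is the only
        # thing that ever parses the untrusted file.
        subprocess.run(
            ["pdftoppm", "-png", "-r", "150", untrusted_pdf, f"{tmp}/page"],
            check=True,
        )
        pages = sorted(str(p) for p in Path(tmp).glob("page-*.png"))
        # Stitch the rasterized pages back into a dumb, image-only PDF.
        with open(safe_pdf, "wb") as out:
            out.write(img2pdf.convert(pages))

if __name__ == "__main__":
    flatten(sys.argv[1], sys.argv[2])
```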
So the attacker has to have exploits in every pdf reader app on linux? Since it is not Adobe only and there are quite a few. Or maybe a common backend engine (mupdf and poppler)...
Yeah, I suspect that rather a lot of the options use the same libraries; https://en.wikipedia.org/wiki/Poppler_(software) claims that poppler is used by Evince, LibreOffice 4.x, and Okular (among others).
This has advantages and disadvantages; yes, if there is a security hole in it, it likely affects everything that uses it. But it also means it gets use-case tested more thoroughly, at a minimum. Ideally, all "stakeholders" would have a vested interest in doing reviews of their own, or perhaps pooling money to have the code scrutinized.
An attacker doesn’t need every attack to work every time. One breach is usually enough to get into your system, so long as they can get access to the right machine.
I heard a story from years ago that security researchers tried leaving USB thumb drives in various bank branches to see what would happen. They put autorun scripts on the drives so they would phone home when plugged in. Some 60% of them were plugged in (mostly into bank computers).
The attacker obviously does not need to have exploits in every pdf reader app on linux; it needs an exploit in a single pdf reader app out of all those which someone in your organization is using. If 99% of your employees are secure but 1% are not, you're vulnerable. Perhaps there's a receptionist in your Elbonian[1] branch on an outdated, unpatched computer, and that's as good an entry point into your network as any other, with possibilities for lateral movement to their boss or an IT support person's account and onwards from there. In this particular case, a developer's Linux machine was the point of persistence where the malware got inserted into their server builds; however, most likely that machine wasn't the first point of entrance into their systems.
Remember how Adobe removed Flash support from Acrobat a couple of years back? Attacks like this are why. Well, and Flash had other issues, too.
I'm not sure when you started using PDFs (I remember mid-90s when my Dad told me about this cool new document format that would standardize formats across platforms, screen and paper!), but hardly anything is static any more.
The nexus of unsafe programming languages and exploit markets, where for the right price you can purchase undisclosed bugs basically ready to use. Modern offensive security is essentially like shopping at Ikea.
This is the kind of content I come to HN for! I don't get to do a lot of low level stuff these days, and my forensics skills are almost non-existent, so it's really nice to see the process laid out. Heck, just learning of binwalk and scapy (which I'd heard of, but never looked into) was nice.
Consider the possibility that its fiction. Would you be upset? I wouldn't, perhaps a bit disappointed not to learn more. This certainly fits into "worthy of itself".
Please change the posting title to match the article title and disambiguate between APT (Advanced Persistent Threats, the article subject) and Apt (the package manager).
Thanks, I don’t work in security but I use APT a lot.
I thought it was an unfunny joke? Like ... APT provides some of those packages?
Ok. That makes more sense.
The author did a good job at making that readable. Is it often like that?
You're right... what an annoying namespace collision. On the other hand, stylizing software as Initial Caps is much more acceptable than stylizing non-software acronyms that way, so it would still be less misleading to change the capitalization.
Poster here.
Do you think I need to edit the title?
This title was funny to me, but probably just because I am a security guy and I know what an APT is.
Who is such a hot target and can take such an independent attitude, even to allowing this to be published? If this had been a bank, they'd have had to report to regulators, and likely we'd have heard none of these details for years, if ever. Same for almost anything else big enough to be a target that I can think of offhand.
Idk. While banks have to report on this, they are (as far as I know) still free to publicize details.
We normally don't hear about these things not because they can't speak about them, but because they don't want to (bad press).
My guess is that it's a company which takes security relatively seriously, but isn't necessarily very big.
> hot target [..] else big enough to be a target
I don't think you need to be that big to be a valid target for an attack of this kind, nor do I think this attack is on a level where "only the most experienced/best hackers" could have pulled it off.
I mean, we don't know how the dev laptop was infected, but given that it took them 3 months to reinfect it, I would say it most likely wasn't a state actor or similar.
I think you're right that it's medical. The author calls out that PII was the target. Sure, there's PII in Defense/Fintech/Government, but it's probably not the target in those sectors, and PII doesn't have the same spotlight on it as in the medical world (e.g. HIPAA & GDPR).
Are you saying that, for example, the addresses of military generals and spies are less of a target for hackers than the addresses of medical patients? While there are laws to protect medical information, I think all governments care more about protecting national security information.
Ah, good point! No, I was not saying that at all, and thank you for pointing that out.
When I was thinking of "defense", I was thinking of the defense contractors who are designing/building things like the next-gen weapons, radar, vehicles, and the like. In that context, when it comes to what they can exfiltrate, I think attackers probably prioritize the details & designs over PII. Just a guess though.
Not just vaccines, but basically all your data, including billing and disease history. Perfect for both scamming and extortion.
Keep in mind that you actually want your medical provider to have that data, so they can treat you with respect to your medical history, without killing you in the process.
True. However, reading between the lines, the exfiltration "project" was targeted (i.e. one-off), skilled and long. I would put the cost anywhere between 1 megabuck and 10 megabucks. Given risks and dubious monetization, I would assume the "sponsor" demands at least a 10x ROI.
How about psychiatric data from the area around Washington DC? Hospitals/practices that are frequented by New York CEO-types? I can picture that being quite valuable to the right parties.
One thing I didn't get is this magical PII thing. How does the author look at a random network packet -- nay, just packet headers -- and assign a PII:true/false label? I think many corporations would sacrifice the right hand of a sysadmin if that was the way to get this tech.
The article just says:
> I wrote a small python program to scan the port 80 traffic capture and create a mapping from each four-tuple TLS connection to a boolean - True for connection with PII and False for all others.
Is it just matching against a list of source IPs? And perhaps the source port, to determine whether it comes from e.g. a network drive (NFS in this case)? Not sure what he uses the full four-tuple for, if this is the answer in the first place. It's very hand-wavy for what is an integral part of finding the intrusion and kind of a holy grail in other situations as well.
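For what it's worth, the mechanics of the four-tuple mapping itself are straightforward; here's a hedged guess at its shape using scapy. The PII heuristic is exactly the part the article doesn't explain, so the looks_like_pii stub below is pure assumption:

```python
# tuple_map.py - sketch of mapping each TCP connection four-tuple to a
# PII boolean, in the spirit of the article's "small python program".
# How the author actually decided PII-vs-not is not described; the
# looks_like_pii() stub here is a placeholder assumption.
from scapy.all import rdpcap, IP, TCP

def looks_like_pii(pkt) -> bool:
    # Placeholder: e.g. traffic to/from backends known to serve PII,
    # or connections the client had already flagged upstream.
    return False

def connection_map(pcap_path: str) -> dict:
    conns = {}
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt:
            key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
            conns[key] = conns.get(key, False) or looks_like_pii(pkt)
    return conns

if __name__ == "__main__":
    for four_tuple, has_pii in connection_map("red_side.pcap").items():
        print(four_tuple, has_pii)
```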
Amazon and Microsoft also have their own offerings, but can be quite expensive for network packets (and pretty slow).
Most projects / teams will use some basic regular expressions to capture basics like SSN, credit card numbers or phone numbers. They’re typically just strings of a specific length. More difficult if you’re doing addresses, names, etc.
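A toy version of that regex approach, with a Luhn check added to weed out random digit strings that happen to look like card numbers (the patterns are deliberately simplistic, not production DLP):

```python
# naive_pii_scan.py - sketch of the basic regex approach (SSNs, card
# numbers, phone numbers). Real DLP products are fancier; this only
# shows the shape, and the patterns are intentionally crude.
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD = re.compile(r"\b(?:\d[ -]?){13,19}\b")
PHONE = re.compile(r"\b\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b")

def luhn_ok(digits: str) -> bool:
    # The Luhn checksum weeds out most random 13-19 digit strings.
    nums = [int(c) for c in digits][::-1]
    total = sum(nums[0::2]) + sum(sum(divmod(2 * d, 10)) for d in nums[1::2])
    return total % 10 == 0

def scan(text: str) -> dict:
    cards = [m.group().replace(" ", "").replace("-", "")
             for m in CARD.finditer(text)]
    return {
        "ssn": SSN.findall(text),
        "card": [c for c in cards if luhn_ok(c)],
        "phone": PHONE.findall(text),
    }

if __name__ == "__main__":
    print(scan("call 555-867-5309, card 4111 1111 1111 1111"))
```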
That's great for you but... how's that relevant to the article? The author never speaks of using this sort of thing.
I saw these regex matchers in school but don't understand them. They go off all day long because one in a dozen numbers match a valid credit card number, even in the lab environment the default setup was clearly unusable. But perhaps more my point: who'd ever upload the stolen data plaintext anyhow? Unencrypted connections have not been the default for stolen data since... the 80s? If your developers are allowed to do rsync/scp/ftps/sftp/https-post/https-git-smart-protocol then so can I, and if they can't do any of the above then they can't do their work. Adding a mitm proxy is, aside from a SPOF waiting to happen, also very easily circumvented. You'd have to reject anything that looks high in entropy (so much for git clone and sending PDFs) and adding a few null bytes to avoid that trigger is also peanuts.
These appliances are snakeoil as far as I've seen. But then I very rarely see our customers use this sort of stuff, and when I do it's usually trivial to circumvent (as I invariably have to to do my work).
Now the repository you linked doesn't use regexes, it uses "a cutting edge pre-trained deep learning model, used to efficiently identify sensitive data". Cool. But I don't see any stats from real world traffic, and I also don't see anyone adding custom python code onto their mitm box to match this against gigabits of traffic. Is this a product that is relevant here, or more of a tech demo that works on example files and could theoretically be adapted? Either way, since it's irrelevant to what the author did, I'm not even sure if this is just spam.
> One thing I didn't get is this magical PII thing. How does the author look at a random network packet -- nay, just packet headers -- and assign a PII:true/false label? I think many corporations would sacrifice the right hand of a sysadmin if that was the way to get this tech.
Check out Amazon Macie or Microsoft Presidio, or try actually using the library I linked?
It’s usually used in a constrained way, in no way perfect. But it helps investigators track suspected cases of data exfiltration. You can pull something that looks suspect (say a credit card) and compare against an internal dataset and see if it’s legit.
In the repo I linked you can see the citation for an earlier model on synthetic and real world datasets:
My guess was that traffic containing PII was flagged in some way such that it was visible in the pre-GW traffic the researcher had access to. That was the point of linking up the pre-gateway and post-gateway packets. I'm not sure how common such setups are.
What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem. How would you do this?
I'm not even sure what the "time" variable is on the graphs. Response time? (It also seems weird that there's any PII on port 80, but that's an unrelated issue.)
> What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem.
Yeah, that's another thing that has me confused, but I figured one thing at a time...
Thanks for the response, that pre-set PII flag does sound plausible, though it's odd that they'd never mention it and mention a 'four-tuple' instead (sounds like they're trying to use terms not everyone knows? Idk, maybe it's more well-known than it seems to me).
Yes, that was the part where I got lost. It seems he skipped some details about that so it's not clear from the article how that was done. I can't imagine capturing the encrypted data got him that.
This observation is way too casual imo:
"We noticed a 3 month gap about 5 month ago, and it corresponded with the guy moving the kernel build from a Linux laptop to a new Windows laptop with a VirtualBox VM in it for compiling the kernel. It looks as if it took the attackers three months to gain access back into the box and into the VM build."
If the attackers are able to brute-force their way into OS engineers' / sysadmins' work PCs, then that should probably be the headline. The rest is just about not being found.
Maybe if you are a business oriented person. But reading through the analysis, I felt like the researcher seriously enjoyed the hunt and the "not being found" part.
> On March 21, 2021, CNA determined that it sustained a sophisticated cybersecurity attack. The attack caused a network disruption and impacted certain CNA systems, including corporate email. Upon learning of the incident, we immediately engaged a team of third-party forensic experts to investigate and determine the full scope of this incident, which is ongoing.