A couple years ago at Inky we pivoted away from general improvements to email to focus on phishing prevention using ML and computer vision, and this has been tremendously successful. (This pivot was motivated primarily by Google Inbox, which of course just got shuttered -- but that's another post.)
One challenge with phishing is that virtually all the "best practice" pieces written by the press still follow this Atlantic article's "blame the user" approach to phishing prevention. I.e., train your end users to not click on stuff in bad emails.
Unfortunately, what we've seen over the last year or so is exactly what you'd expect: now that many companies are running simulated phishing training campaigns -- sending fake phishing emails to end users to try to train them to not click on "bad links" -- attackers are now sending brand forgery emails that are essentially perfect looking. The key insight here is that the attacker actually has a labor-saving technique that is also completely devastating to the approach of training users: "Save As HTML".
It's obvious in retrospect, but all the attacker has to do is take the exact HTML from a real transactional email (say, a DocuSign request), edit it to change one link, and resend it. (In security parlance this is a kind of "replay attack".) By definition, the body of this email will look identical to the original transactional mail, so you're left with training users to see the invisible. The hapless end user logs in "to DocuSign" and thereby gives the attacker his/her credentials.
In contrast, a machine learning system that is trained to recognize brand-indicative clues from emails can trivially verify DKIM, etc. on the mail to determine whether the mail really is from DocuSign or not. (It's actually not really trivial, because DocuSign might send mail through MailChimp or some random domain they never told you about, but that's a detail...) Software 1, Humans 0.
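To make that concrete, here is a minimal sketch of the header-verification half in Python, assuming the dkimpy package; the allowlist of signing domains is purely illustrative, not DocuSign's real sender inventory (which, as noted, is the hard part):

```python
# Minimal sketch: verify the DKIM signature on a raw message, then check
# that the signing domain (the d= tag) belongs to the claimed brand.
# KNOWN_DOCUSIGN_DOMAINS is a made-up allowlist for illustration.
import email
import dkim  # pip install dkimpy

KNOWN_DOCUSIGN_DOMAINS = {"docusign.com", "docusign.net"}

def looks_like_real_docusign(raw_message: bytes) -> bool:
    if not dkim.verify(raw_message):  # cryptographic signature check
        return False
    msg = email.message_from_bytes(raw_message)
    sig = msg.get("DKIM-Signature", "")
    # Parse the tag=value pairs in the signature header to find d=.
    tags = dict(
        part.strip().split("=", 1)
        for part in sig.split(";")
        if "=" in part
    )
    return tags.get("d", "").lower() in KNOWN_DOCUSIGN_DOMAINS
```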
This leads me to personally believe that while phishing awareness training is important and a good practice, the future must be one where the machines do the vast majority of phishing email identification, blocking these emails before they reach end users. And it's a hard problem.
Of course, attackers can't precisely control the headers -- e.g., they can't easily send DKIM-signed mail from a domain named docusign.com -- so they can't literally replay a real DocuSign mail. But here again they use lots of clever tricks. One of my favorite (i.e., most evil) real-world phishing emails was a clone of an American Express "confirm card activity" email sent from the domain aexp-ib.com. Most recipients would plausibly believe that that domain was some kind of internal Amex mail server or something, so it didn't look at all weird. Even more devastatingly, this email came DKIM-signed -- with SPF and DMARC "alignment" -- by a very high-reputation sender (Google), so it sailed right through mail protection systems built around traditional "good mail / bad mail" signals.
Why was it signed by Google? Because the attacker set up a G Suite account and sent the emails from there. This is another challenging phenomenon we're seeing: it's trivial for attackers to "inherit" the good reputation of a shared service like G Suite in this manner. Similarly, instead of hosting phishing sites on sketchy-looking URLs that can be detected with simple Bayesian models (good vs bad URL detection), attackers now host their stuff on Google Sites and on compromised web sites with high Alexa rankings. You can use simple heuristics like "don't trust mail from a domain set up in the last 3 days" but that's problematic too, because attackers can simply bank domains. (And, for that matter, real senders create new domains and send legitimate mail from them.)
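For what it's worth, the domain-age heuristic itself is only a few lines; here's a hedged sketch using the python-whois package, with the caveat above that banked domains defeat it:

```python
# Illustrative version of the "don't trust mail from a brand-new domain"
# heuristic. Attackers beat this by registering (banking) domains early,
# and legitimate senders trip it by mailing from genuinely new domains.
from datetime import datetime, timedelta
import whois  # pip install python-whois

def domain_is_suspiciously_new(domain: str, days: int = 3) -> bool:
    created = whois.whois(domain).creation_date
    if isinstance(created, list):  # some registrars return several dates
        created = min(created)
    if created is None:            # no WHOIS data: can't conclude anything
        return False
    return datetime.now() - created < timedelta(days=days)
```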
According to the FBI, email-based phishing attacks have cost companies over $12B since 2013. If there's any silver lining to this scourge, it's that it makes for a really interesting technical challenge for the white hats that pretty much everyone understands the need for. (A completely different set of techniques is required to block impersonations of people -- "spear phishing" -- but I'll leave that for another day.)
> One challenge with phishing is that virtually all the "best practice" pieces written by the press still follow this Atlantic article's "blame the user" approach to phishing prevention. I.e., train your end users to not click on stuff in bad emails.
This is the same problem with most advice for avoiding fraud and financial crimes. Identity theft shouldn't be blamed on the victims' 'carelessness', because at the end of the day, it's the bank/credit card company/whoever that screwed up and let someone access money they shouldn't have.
The solution to most financial crimes and fraud isn't to 'train the customer to avoid the bad guys', since the bad guys are getting more sophisticated all the time and people can't be vigilant 24/7. The solution is to fix the underlying systems and procedures at the companies involved so criminals can't exploit them the way they can right now.
Exactly. What people fear is not "identity theft" but bank libel. I couldn't care less if a fraudster borrows money from a bank using my social security number and other public information. That is the bank's problem when they try to collect the money. The problem is when the bank libels me by writing to a credit bureau and telling them that I have defaulted on a loan. If this libel were a serious crime and prosecuted vigorously, "identity theft" would go away.
If the banks were heavily penalized for giving wrong info to the credit bureaus, the number of people having this problem of "identity theft" would plummet. If, while cleaning up your credit score you could also get a $10,000 judgment from the bank that libeled you, maybe you don't feel so shitty about the hassle the bank has put you through.
Are they heavily penalized for giving wrong info to credit bureaus? Or perhaps it's not a big problem, as banks have the drill down with regard to identity theft's aftermath?
Are banks saying that you yourself defaulted, or just that the given identity defaulted? Whether you as a person defaulted, or your identity was stolen and someone else defaulted with it, there is still increased risk from the baseline that your identity will default.
I would be interested in knowing if this is how the banks get around the libel problem. New laws should be enacted preventing that, if that is the case.
For one thing, what I said above. For another, it is impossible to libel someone by one private party telling another private party information. Being published is a necessary condition of libel.
If a credit bureau creating documents for others to use for various reasons is not publishing, then Congress needs to pass a law declaring that it is publishing, or make it a crime for banks to report false statements about people to the credit bureaus. Do you think a creditor that falsely reports that a person defaulted on a loan is not in the wrong and should not be penalized? It's great for the people that loan money, but it's not how any reasonable person would think the world should work.
A credit reporting agency seems very much like a publisher. They receive statements from some parties and distribute them (usually more widely) to other parties.
Who pays the costs of phishing? Is it mostly individuals or do corporations feel the pinch?
Your analysis gets at the root causes, the bug fix that would stop the trouble tickets. Are the incentives there, or the political will? It's big. You're getting into major-league public policy when banking regulations are involved. The problem must not yet be painful enough to drive the politics, but maybe that day will arrive sooner or later.
While this could stop the most lucrative form of phishing, it would still leave the other forms. And as @badrabbit mentioned, there are other types of phishing (e.g., malware delivery).
When you find a phishing email, do you have a program that spams their "docusign" login with vast quantities of plausible-looking but fake credentials?
IMHO active countermeasures are always the best approach to stuff like this. Attack, attack, attack.
Also, I could imagine giving them special fake usernames, such that when they try to log in with one of those, it turns on extra telemetry, slowbanning, CPU-intensive operations, etc.
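Purely as a sketch, the credential-flooding idea above is a few lines; the form URL and field names here are hypothetical placeholders, since in practice you'd pull them from the phishing page's actual login form:

```python
# Toy sketch of flooding a phishing form with plausible fake credentials.
# The endpoint and field names are made up for illustration.
import requests
from faker import Faker  # pip install faker

fake = Faker()
PHISHING_FORM = "https://phish.example/login"  # placeholder, not a real target

for _ in range(1000):
    requests.post(
        PHISHING_FORM,
        data={"email": fake.email(), "password": fake.password()},
        timeout=5,
    )
```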
I always thought the standard anti-phishing advice for end users is to never click on any link in any e-mail. Then it doesn't matter how sophisticated the attack is.
Thanks for sharing! Someone in my social circle recently fell victim to a phishing attack. I had similar thoughts that you really need to prevent the email from getting through in the first place. Success for phishing is all about the weakest link in an organization and training only goes so far.
I think you've elucidated a good medium-term vision. It seems sound to me, but I wouldn't qualify as an expert on this topic.
That leaves us with this transition period before your vision's realized. You're realistically imagining a time when we solve this problem, relieving us human beings of the burden of consistently resisting the urge to take the phish hook.
In the scenario you mentioned above, did the phisher spoof the real domain and somehow get it authenticated? Or are you saying they used a similarly spelled domain they owned and implemented email authentication on it, enabling it to pass as technically legitimate and leaving the final defense up to the judgment of each recipient?
Sorry, it's been a long week and I'm slow on the uptake now...
It is a design problem that people are trying to fix in production. I strongly agree with you about not blaming users but I think the focus should be on what happens after they click on the link or open up the attachment.
And no amount of phishing training will solve it, I don't think. People will still click on links, they will click on buttons, and they will do what they think is expected of them.
On the bright side, Google Safe Browsing is pretty good at catching new untargeted phishing campaigns, so most people get a good level of protection from that. I am also a big proponent of using a DNS firewall* layer to help minimize exposure to phishing domains.
*I blogged about it here, comparing a few free DNS resolvers, if anyone is interested:
This looks like a good idea for everyone, though from your blog post, it’s clear that there’s a massive gap between the DNS firewalls that work and those that don’t, assuming the numbers you cite are accurate and I believe they are.
Training helps a lot, but it typically treats users like toddlers and punishes or rewards them for getting it right. Training is also either too targeted or too untargeted compared to IRL phish.
Treating users like toddlers is a techie attitude that drives me crazy. Most people have their own profession and responsibilities to think about. We'd appear to have toddler-level sophistication to, say, an accountant, electrician, or doctor, and we'd rightly expect to be treated like adults discussing things about which we know almost nothing.
What do you mean by too targeted or too untargeted? Either focused on a specific threat (tree) or too general to be useful (all forest no trees)?
Too targeted would be, for example, something too relevant to their job: they'd know what the typical emails and logins look like, so they won't fall for it easily. If the training is for spearphishing, it should contain extensive detail about the user. I mean really, you can't train someone who combats phishing as their day job against spearphishing.
The only real threat training combats is untargeted dragnet attacks, which typically use generic content, or attacks that target organizations (not individuals).
In other words, you want them to be trained for the technology threat, not the content threat. You want them to know the difference between mail.company.com and mail.company.com.seemslegit.site. Currently, training seems to focus on "email looks suspicious, why did you click on the link" rather than "what about the link made you think it was legitimate? And this is why you were wrong."
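That registered-domain comparison is mechanical enough to show in a few lines; a sketch assuming the tldextract package, with illustrative domains:

```python
# Compare the *registered* domain of a link's host against what the user
# probably thinks they're visiting. mail.company.com.seemslegit.site is
# registered under seemslegit.site, not company.com.
from urllib.parse import urlparse
import tldextract  # pip install tldextract

def registered_domain(url: str) -> str:
    host = urlparse(url).hostname or ""
    return tldextract.extract(host).registered_domain

print(registered_domain("https://mail.company.com/login"))
# -> company.com
print(registered_domain("https://mail.company.com.seemslegit.site/login"))
# -> seemslegit.site
```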
Also, training is done as a campaign at most places. A few users fall for it and suddenly everyone knows about it before opening their inbox. Mostly theatrics. It shouldn't be "send phishme emails to 1000 users today"; it should be more like "pick 50 users out of 1000 at random and send them a new campaign every day for the next 20 business days, quarterly".
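That rolling schedule is trivial to express; a toy sketch with made-up user IDs:

```python
# "Pick 50 users out of 1000 at random, new campaign every business day"
# instead of one big blast everyone warns each other about.
import random

users = [f"user{i}@example.com" for i in range(1000)]

for day in range(1, 21):  # 20 business days in the quarterly cycle
    targets = random.sample(users, 50)  # fresh random sample each day
    print(f"day {day}: sending campaign 'phish-{day}' to {len(targets)} users")
```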
For 1), don't use passwords other than to protect private keys; in-browser DLP protection can help prevent leakage of other PII. Alternatively, invest heavily in MFA.
For 2), there are already strong mitigations like attachment whitelisting, but my idea is to only allow email attachments to be saved to, written to, and read from an execution-restricted file path, even temp files (noexec on *nix and SRP on Windows). Of course, app whitelisting is the ideal solution, since it mitigates drive-by download/web phish infections as well.
For 3), don't use email for communicating trust or authorization of change. There are plenty of alternatives (mostly e2e messaging and call apps), but it is more important to have an established protocol (e.g., call back using listed number X and confirm with another call to Y before changing banking info or transferring money).
That's typically done by MIME type and extension. For example, Outlook/Exchange won't let you send/receive Windows executables as attachments at most companies.
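The gate itself is simple; a toy version with an illustrative (far from complete) blocklist:

```python
# Extension/MIME-type attachment gate. Real deployments (e.g. Outlook's
# default blocked-attachment list) cover many more types than this.
import mimetypes
from pathlib import Path

BLOCKED_EXTENSIONS = {".exe", ".scr", ".bat", ".js", ".vbs", ".msi"}
BLOCKED_MIME_TYPES = {"application/x-msdownload", "application/x-dosexec"}

def attachment_allowed(filename: str) -> bool:
    if Path(filename).suffix.lower() in BLOCKED_EXTENSIONS:
        return False
    guessed, _ = mimetypes.guess_type(filename)
    return guessed not in BLOCKED_MIME_TYPES
```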
I've never worked for a large company, but that seems like a sound policy across the board for businesses and organizations large and small, and perhaps even the major mailbox providers (e.g., Gmail)?
Dusting off my usual response: mutual TLS (aka client-side certs) could practically eliminate phishing. It would be simply impossible to give your credentials to a phishing site, as they never leave your device. There are two things missing that browser/device vendors need to do:
1. Improve the UI of client certs.
2. Figure out a way to manage credentials across multiple devices.
In practice, FIDO U2F (Universal 2nd Factor) provides the same benefits and side-steps the major pains associated with mutually authenticated TLS (convoluted user experience, complex trust relationship management).
Google reportedly managed to all but eliminate phishing targeted at employees [1].
They also kind of solve your point 2: since the credentials live on the token, it's easy to move them from one device to the next. For devices with USB/NFC, that is.
Poor UI around client-side certs isn't a law of nature; it's just not a priority for browser vendors. It should be super simple. Once you're signed up, the exact same tech that lets a browser show you the green padlock with the name of the site could seamlessly log you in with zero interaction needed, IF client-side certs were easier to use (which is within the compass of the browser companies to improve).
Portability. You need to have access to your authentication mechanism from multiple devices: phones, laptops, desktops, even other people's computers. That's why the password still exists. I raised this issue with the WebAuthentication standard on their GitHub and the first reaction was to dismiss its importance. I log into services from at least three different devices every day.
So, the correct way to approach this in WebAuthn, which you'll see in popular implementations of its predecessor U2F and in for-real WebAuthn deployments, is that users are allowed to have any plausible number of tokens. In GitHub this feels especially natural because they're managed the same way as your SSH keys: you can add or remove them, give them labels that help you remember what they are, and then use any of them to prove your identity.
So I have a cheap FIDO token on my keychain that I take everywhere, and then I have one permanently plugged into the big desktop PC in my home and one in a desk drawer. You can buy ones that work nicely with a phone (unless you have an iPhone, can't help Apple) and Microsoft intends to effectively build one into Windows installs.
If you see a WebAuthn deployment that does 1:1 users to FIDO tokens, those people don't know what they're doing and need re-educating just like when people go "Oh, MD5(password) seems pretty secure".
2FA would also help, but people don't want to mandate it. And even if you require 2FA for your own employees, the third party "pay this invoice" request may not be gated on 2FA -- because that's up to the third party to enforce 2FA for (and they probably don't.)
Neither of these solutions helps with spear phishing emails, either. If you get an email that says "please wire money" from a sender you think you trust, the attacker's goal isn't credential harvesting, so protecting logins won't help.
Similarly we're seeing people fall for scams where a "trusted person" asks them to buy a bunch of iTunes gift cards and provide the codes on them in a reply email. Yes, people actually comply with this request when they think the requester is their CEO, etc.
Does 2FA help? It might keep the account itself from being taken over entirely (assuming that another 2FA step also guards the ability to change the account's credentials), but a phishing site could transparently relay both factors to the real site to grant it access for a single session, which is enough time to do plenty of damage. And if the phishing email is of the form "you need to change your password now", the site could easily trick a user into handing over the additional second factor that guards the primary credentials.
Yes, you're absolutely right: 2FA makes things harder for the attacker but doesn't really solve the problem. But in practice, at least circa 2018, most phishing sites are very primitive. Attackers do OK victimizing people without 2FA, so they don't generally do what you describe ("transparently relay both factors").
We don't do Client Certs because the problem is now you're asserting an identity. So what's the identity?
Maybe I'm happy for Hacker News and GitHub to know me as Nick Lamb, but I'd prefer that Grindr thinks of me as Steve Farmer, so that's now an additional certificate and then some sort of choice mechanic so I pick the right one. And if I screw up, or an advertising network is able to tie these identities together I can't undo that.
WebAuthn creates a scenario where you can prove to a site that _you_ still have the same FIDO token as when _you_ signed up. But they don't get anything else. You have to mount an _active attack_ to even find out whether the token Alice shows you when she logs in is really the same token that Bob shows when he logs in, or to find out whether the credentials you've stolen from Facebook for alice@example.com are for the same token that Bob is using on GitHub. Passive attacks can't find any of that out, and remember, each active attack attempt causes a physical human interaction; you can't just write some JavaScript to try it a billion times until it succeeds.
I've been working on PKAuth, which addresses this problem (not at the TLS level though). I wrote a short blog post [1] with a demo video about the underlying protocol. I'd appreciate any feedback!
If you are looking for a straightforward way to protect users from phishing, consider rewriting the URLs in their messages and checking their clicks in real time against databases of phishing feeds.
You can check against Google Safe Browsing, PhishTank, and other free phishing feeds if you don't have the money for real-time databases or proactive site scrapers.
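For anyone curious what the time-of-click check looks like, here's a sketch against the Google Safe Browsing Lookup API (v4 threatMatches:find); you need your own API key, and the client id/version values are placeholders:

```python
# Check a clicked URL against Google Safe Browsing at redirect time.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = f"https://safebrowsing.googleapis.com/v4/threatMatches:find?key={API_KEY}"

def url_is_flagged(url: str) -> bool:
    body = {
        "client": {"clientId": "example-client", "clientVersion": "1.0"},
        "threatInfo": {
            "threatTypes": ["SOCIAL_ENGINEERING", "MALWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": url}],
        },
    }
    resp = requests.post(ENDPOINT, json=body, timeout=10)
    resp.raise_for_status()
    return bool(resp.json().get("matches"))  # non-empty => listed as a threat
```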
This is one of the ways that we protect the end user from phishing at https://www.phishprotection.com - part of the magic is to do a bunch of sanitization before accepting the message, including strict SPF, DKIM, and DMARC validation, virus protection at the edge, and watching the registration of SSL certificates for commonly exploited domains: https://blog.0day.rocks/catching-phishing-using-certstream-9...
If anyone is interested, we rewrite URLs to match your domain name -- something like linkcheck.yourdomain.com (protected by Let's Encrypt) -- and if you ever decide to leave, you can export the rewritten URLs and redirect your domain to your own servers.
If you'd like to give it a try, feel free to let me know.
Like phishing awareness training, this is a good practice. We actually offer URL rewriting to our customers, but there are some UX downsides to it so not everyone wants it.
One big issue with GSB, Phishtank, OpenPhish, etc. as "the solution" is that, again, it's trivial for attackers to thwart these threat feeds. Using the same approach spammers have been implementing for 20 years now, the attacker just needs to randomize the URL in each sent email. Then when you report the phishing link in your copy, it helps no one else.
One could imagine a system that reverses the patterns used by the URL generation scripts -- we actually do this for DGAs ("domain generation algorithms") -- but even trying to be clever like this just puts you back in an arms race with the attackers.
So I don't think URL "whack-a-mole" is the right answer either. I believe you need the software to straight-up identify fraudulent emails from first principles. (Not saying it's easy.)
Yes, spammers learned to switch IP addresses very quickly. The email filters then developed more sophisticated ways of identifying your spam, even from a different IP address and even from a different domain name. Spammers could run but not hide.
I think I see a LOT less spam that actually gets to my inbox than in years past. Hats off to those who worked hard on making an incredible amount of progress on hard problems, and continue to plow ahead.
I use localbitcoins.com and recently was fooled into giving away credentials even with 2fa enabled. Their ingenious method almost got me.
New message from LocalBitcoins support ticket #570241.
---
Your most recent attempt to access your private support ticket was rejected due to: INVALID AND/OR EXPIRED CREDENTIALS.
Please try again, you have (2) attempts remaining.
---
This account has been flagged as high-risk after receiving a support ticket submitted by another user regarding the following ad:
ONLINE_SELL #2322 - Bank Transfer
Please review and reply to this ticket as soon as possible to help us further investigate this matter. Failure to access this ticket may result in the suspension and/or revocation of your account until contact is made.
The solution is simple technically but complicated socially.
Simply, never send a customer a link for them to click. Instead, tell them to go to your site and to log in; then ensure anything important is easily found.
This would work if everyone did it, but one website doing it in isolation seems unlikely to have much impact. Even if you have a very clearly spelled out "We will never send links" policy, your customers interact with dozens of websites and are unlikely to specifically remember your policy when they read the email.
The point is, you could hopefully train users that a link in an email is an unusual and suspect behavior. And/or start disabling rendering of links in email.
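Disabling link rendering can be as blunt as rewriting anchors to plain text before the mail is displayed; a rough sketch assuming BeautifulSoup:

```python
# Replace every <a> in an HTML email body with its visible text, so
# nothing is clickable and the user must navigate to the site manually.
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def defang_links(html_body: str) -> str:
    soup = BeautifulSoup(html_body, "html.parser")
    for a in soup.find_all("a"):
        a.replace_with(a.get_text() or a.get("href", ""))
    return str(soup)
```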
Yes, I think you’re right. Despite the established potential threat, many people still don’t think twice about clicking on links. I think you’re on to something. It starts micro and hopefully becomes the norm at some point.
'Con' comes from the phrase 'confidence scheme'. Phishing is simply the most up-to-date version of the confidence scheme, using modern communication tech to achieve the same end: subverting the mark's context and communication expectations so they'll unwittingly give up secrets or money.
Yes, even a normally alert, somewhat savvy human's not on the ball all the time. People get tired, careless, mindless, drunk, you name it. We are flawed.
How helpful do you think full implementation of email authentication -- SPF, DKIM, and DMARC (p=reject) -- would be?
I know it does not help against close misspellings, but it at least prevents spoofing of the actual domain name. Is email authentication an important part of mitigating phishing?
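As a sanity check on one's own domain, the published DMARC policy is just a TXT record; a quick sketch with dnspython:

```python
# Look up the DMARC policy (the p= tag) published at _dmarc.<domain>.
# This only shows what the domain asks receivers to do; enforcement
# happens at the receiving mail server.
import dns.resolver  # pip install dnspython

def dmarc_policy(domain: str):
    try:
        answers = dns.resolver.resolve(f"_dmarc.{domain}", "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return None
    for rdata in answers:
        txt = b"".join(rdata.strings).decode()
        if txt.lower().startswith("v=dmarc1"):
            for tag in txt.split(";"):
                key, _, value = tag.strip().partition("=")
                if key == "p":
                    return value  # e.g. "reject", "quarantine", "none"
    return None
```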