How to Compare Email Threat Detection Capabilities

May 18, 2020

This is the fourth in our five-part series evaluating anti-phishing tools. To start at the beginning, read “Automated phishing response tools: 4 things to consider.”

In our last couple of blogs, we’ve focused on post-delivery skills that every email security solution should have, specifically threat remediation and user engagement. Email security vendors that fail to provide such tools are falling prey to the false narrative that 100% of email threats will be caught and blocked prior to reaching end users.

But that doesn’t mean that threat detection doesn’t matter. It’s still a critical component of evaluating an email security solution. Yet, because detection is often the only tool in an email security vendor’s arsenal, it’s also the area that attracts the most marketing bluster. That’s why looking under the covers is so important.

There’s no question that both the email security industry and the cloud email platforms themselves (i.e. Microsoft and Google) have invested heavily in improving email threat detection, particularly around phishing. And there’s been progress – it’s much harder to get traditional malware or generic, volumetric phishing attacks through that first layer of defense. However, it isn’t enough.

Evaluating new email threat detection techniques

In response to an onslaught of phishing attacks that have resulted in billions of dollars of losses, vendors now claim capabilities outside of their tried-and-true threat intelligence arena. You’ll hear nebulous terms such as “machine learning”, “artificial intelligence”, and “data science” with very little detail behind them. As a result, a critical part of evaluating the efficacy of any email threat detection system is digging into what vendors really mean by such terms.

Data science, machine learning, and artificial intelligence, oh my!

Vendors often use these terms to generically (and sometimes interchangeably!) refer to their next generation email threat detection capabilities. You’ll often find them on product data sheets with little explanation as to where, how, or why they apply to a given vendor’s product.

This is an area that requires digging in: none of these terms should, by themselves, be considered indicative of any particularly advanced technique. What matters is how such techniques are applied toward a goal – the use of machine learning or data science, for example, is not a goal in itself. In fact, such techniques are not always the optimal way to accomplish a task and, as with most things, they can be used poorly, even dangerously. For example, if the hypothesis used to train a system hasn’t been proven, biases can be hard-coded into it, increasing false positive or false negative rates.

All that being said, these techniques, when used thoughtfully, CAN improve detection by surfacing suspicious characteristics for analysis. Here are a couple of examples.

Example: Convolutional neural networks for credential theft site detection

A good example is identifying emerging credential theft attacks. Such sites pop up and are taken down so fast that they sometimes never make it to blacklists. But you can use a convolutional neural network to recognize the characteristics of such sites, and then combine those findings with URL analysis and threat intelligence to determine that a given link actually leads to a credential theft site.

Note the critical distinction here – in this case, “machine learning” doesn’t tell you that the site is good or bad. It simply tells you that a destination site looks like a particular login page (e.g. an Office 365 login page). The analysis means nothing if it isn’t also combined with sophisticated analysis from other dimensions as well.
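To make that distinction concrete, here is a minimal sketch in Python of how a visual-similarity finding might be combined with URL analysis and threat intelligence. Everything in it is an assumption made for illustration: the `LinkEvidence` fields, the `KNOWN_LOGIN_DOMAINS` table, the `assess_link` function, and the 0.9 threshold are hypothetical, and the neural network itself is not shown (its output is simply assumed to arrive as `lookalike_score`).

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical table: brand -> hosts that legitimately serve its login page.
KNOWN_LOGIN_DOMAINS = {
    "office365": {"login.microsoftonline.com"},
}


@dataclass
class LinkEvidence:
    url: str
    brand: str              # brand the page resembles, per the (assumed) CNN
    lookalike_score: float  # 0.0-1.0 similarity to that brand's login page
    on_blocklist: bool      # threat-intelligence hit for this URL


def assess_link(ev: LinkEvidence, threshold: float = 0.9) -> str:
    """Combine visual similarity with URL analysis and threat intelligence."""
    if ev.on_blocklist:
        return "malicious"
    host = (urlsplit(ev.url).hostname or "").lower()
    looks_like_login = ev.lookalike_score >= threshold
    hosted_by_brand = host in KNOWN_LOGIN_DOMAINS.get(ev.brand, set())
    # Resembling a login page is not bad in itself; the signal is a page
    # that looks like the brand's login but is hosted somewhere else.
    if looks_like_login and not hosted_by_brand:
        return "suspected credential theft"
    return "no finding"
```

In this sketch, a page that scores 0.98 against the Office 365 login page yields no finding when it is served from `login.microsoftonline.com`, while the same score on an unrelated host is flagged. The classifier output only becomes meaningful once the other dimensions are considered.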

Example: Relationship analytics as an output of data science

Relationship analytics are often held up as an example of data science in action, but rarely does the application provide details beyond that. Many tools use relatively rudimentary analysis between sender and recipient – i.e. “Has there ever been email communication before?”

For relationship analytics to be truly useful, however, you don’t just care whether there’s been any communication between sender and recipient. You care whether it’s both bidirectional and recent, whether it’s strong or weak, whether the sender has any relationship with others in the organization, and so on. These additional factors add nuance to the analysis, which helps to identify anomalies and (equally important) to reduce false positives.

By digging deeper and creating a robust graph-based risk analysis based on prior interactions, you can get a much clearer understanding of whether a particular sender poses much of a threat. And again, as with the previous example, relationship analytics don’t tell you by themselves whether a given email is good or bad. Indeed, new acquaintances correspond with your employees all the time. But by understanding the strength of that relationship and combining it with other risk factors, you can make a much more informed decision about the relative risk of any given email.
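As a rough illustration of the factors described above, the following Python sketch scores a sender-recipient relationship on whether it is bidirectional, how recent it is, how much mail has been exchanged, and whether the sender also corresponds with others in the organization. The `Message` type, the `relationship_score` function, the 90-day window, and all of the weights are hypothetical choices made for this example, not a description of any vendor’s actual model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    sent_at: datetime


def relationship_score(history: Iterable[Message], sender: str, recipient: str,
                       now: datetime, recent_days: int = 90) -> float:
    """Score a relationship from 0.0 (stranger) to 1.0 (strong). Weights are illustrative."""
    org_domain = recipient.split("@")[-1]
    cutoff = now - timedelta(days=recent_days)
    inbound = outbound = recent = 0
    colleagues = set()
    for m in history:
        if m.sender == sender and m.recipient == recipient:
            inbound += 1
            recent += int(m.sent_at >= cutoff)
        elif m.sender == recipient and m.recipient == sender:
            outbound += 1
            recent += int(m.sent_at >= cutoff)
        elif m.sender == sender and m.recipient.endswith("@" + org_domain):
            colleagues.add(m.recipient)  # sender also knows others in the org

    score = 0.0
    if inbound and outbound:
        score += 0.4                    # bidirectional correspondence
    elif inbound or outbound:
        score += 0.1                    # one-way contact only
    if recent:
        score += 0.3                    # at least one recent exchange
    score += 0.2 * min(inbound + outbound, 10) / 10  # volume, capped at 10
    if colleagues:
        score += 0.1                    # relationship with others in the org
    return round(score, 2)
```

A low score here is not a verdict. It is one input that would be weighed alongside other risk factors, since a first-time sender is often entirely legitimate.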

Other threat detection techniques to look for

Of course, there are other factors to consider when evaluating an email security vendor’s detection capabilities. Here are just a few:

Spoofing likelihood – ability to analyze for employee display name spoofs (and update these automatically as new employees join the company), domain spoofs, domain look-alikes, and more
Technical fingerprints – a sophisticated analysis of domain reputation, sending IP, header information, and other details that are either unintelligible to or hidden from recipients. It’s not enough to just review authentication standards on a pass/fail basis (thousands of businesses and government agencies don’t have DMARC set up properly, and most don’t have a reject policy in place). Rather, the system should understand what “normal” authentication looks like for a given organization and identify whether a given email “drifts” from that norm
Content inspection and URL sandboxing – deep inspection of message content based on keywords, regular expressions (RegEx), attachments, and URLs. To protect users from malware and malicious sites, there should be a way for the recipient to preview the destination of a suspicious URL before opening the link
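Two of the ideas in this list, domain look-alikes and authentication “drift”, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `is_lookalike` and `auth_drifts` are hypothetical helpers, the thresholds are arbitrary, and the standard library’s `difflib.SequenceMatcher` stands in as a simple string-similarity measure (a production system would likely also handle homoglyphs and other tricks that it does not catch).

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import List, Set, Tuple

# One observation of (SPF result, DKIM result, DMARC policy) for a sender domain.
AuthResult = Tuple[str, str, str]


def is_lookalike(domain: str, trusted: Set[str], threshold: float = 0.85) -> bool:
    """True when a domain closely resembles, but does not equal, a trusted domain."""
    domain = domain.lower()
    if domain in trusted:
        return False  # exact match is the real domain, not a look-alike
    return any(
        SequenceMatcher(None, domain, t).ratio() >= threshold for t in trusted
    )


def auth_drifts(history: List[AuthResult], observed: AuthResult,
                min_share: float = 0.05) -> bool:
    """True when an authentication result is rare for this sender's own history."""
    if not history:
        return False  # no baseline yet, so no drift can be claimed
    share = Counter(history)[observed] / len(history)
    return share < min_share
```

With `{"example.com"}` as the trusted set, `examp1e.com` is flagged as a look-alike while `example.com` itself is not. Likewise, a sender whose mail has always passed SPF and DKIM with no DMARC policy is not penalized for lacking DMARC, but a message from that sender that suddenly fails SPF is treated as a drift from its norm.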

Detecting email threats is the first line of defense

As email threats get more sophisticated, our detection methods have needed to follow suit. Moreover, each organization has a different risk profile and tolerance. A well-designed email security solution looks not just at how it can stop threats, but also at how it can adapt to both the changing threat landscape and your own organization’s unique situation.

By analyzing email not just for confirmed malicious intent, but also for anomalous details that might indicate threat, the system can easily tune itself or be tuned to meet your needs. When combined with the user engagement tools mentioned in our previous post, you can reduce false positive rates without increasing your risk. By surfacing such indicators to your employees, you provide them with the tools they need to make better decisions about engaging with suspicious emails.

In the final post in our anti-phishing evaluation series, we’ll discuss evaluating potential vendors on their readiness for enterprise-scale implementations.