“The” Problem With “The” Perimeter

“It’s secure, we transmit it over SSL to a cluster behind a firewall in a restricted vlan.”
Protontorpedo
“But my PCI QSA said I had to do it this way to be compliant.”

This study by Gemalto reports survey results about perceptions of security perimeters: 64% of IT decision makers plan to increase spending on perimeter security within the next six months, and one-third of those polled believe that unauthorized users have access to their information assets. It also found that only 8% of data affected by breaches was protected by encryption.

The perimeter is dead, long live the perimeter! The Jericho Forum started discussing “de-perimeterization” in 2003. If you hung out with pentesters, you already knew that the concept of ‘perimeter’ was built on shaky foundations. The growth of mobile, web APIs, and the Internet of Things has only served to drive the point home. Yet there is an entire industry of VC-funded product companies and armies of consultants still operating from the mental model of there being “a perimeter.”[0]

In discussions about “the perimeter,” it’s not the concept of “perimeter” that is most problematic; it’s the word “the.”

There is not just “a” perimeter; there are many possible logical perimeters, depending on the viewpoint of the actor you are considering. There is an unquantifiable number of theoretical, overlaid perimeters, shaped by each actor’s motivation, time, and resources; by what they can interact with or observe; and by what each of those components can interact with in turn. That includes humans and their processes, automated data processing systems, authentication and authorization systems, all the software, libraries, and hardware dependencies down to the firmware, the interactions between different systems that might interpret the same data to mean different things, and all execution paths, known and unknown.

The best CSOs know they are working on a problem that has no solution endpoint, and that thinking in terms of an endpoint isn’t even the right mindset or model. They know they are living in a world of resource scarcity with a problem of potentially unlimited size, so they start with asset classification, threat modeling[1], and inventorying. Without that, it’s impossible to have even a rough idea of the shape and size of the problem. They know that their actual perimeter isn’t what’s drawn inside an arbitrary theoretical border in a diagram. It’s based on the attackable surface area seen by a potential attacker, the value of the resource to the attacker, and the many possible paths that could be taken to reach it in a way that is useful to the attacker, not some imaginary mental model of logical border control.

You’ve deployed anti-malware and anti-APT products, network and web app firewalls, endpoint protection, and database encryption. Fully PCI compliant! All useful when applied with knowledge of what you’re protecting, how, from whom, and why. But if you don’t consider what you’re protecting and from whom as you design and build systems, not so useful. Imagine the following scenario: all of these perimeter protection technologies allow SSL traffic through port 443 to your webserver’s REST API listeners. The listening application has permission to access the encrypted database to read or modify data. When the attacker finds a logic vulnerability that lets them access data their user id should not be able to see, it looks like normal application traffic to your IDS/IPS and web app firewall. As requested, the application uses its credentials to retrieve decrypted data and present it to the user.
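To make that scenario concrete, here is a minimal sketch of the kind of logic flaw involved, assuming a toy in-memory store in place of the encrypted database. Every name in it (Record, DATABASE, get_record_vulnerable, get_record_checked) is hypothetical and invented for illustration; it is not taken from any real application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    owner_id: int
    payload: str


# Stand-in for the encrypted database. The application's own credentials
# can read every row in cleartext, whoever the end user is.
DATABASE = {
    101: Record(owner_id=1, payload="alice's statement"),
    102: Record(owner_id=2, payload="bob's statement"),
}


def get_record_vulnerable(session_user_id: int, record_id: int) -> str:
    # The caller is authenticated, but nothing checks that the requested
    # record belongs to them. On the wire this is an ordinary API call.
    return DATABASE[record_id].payload


def get_record_checked(session_user_id: int, record_id: int) -> str:
    record = DATABASE.get(record_id)
    # Same error for "missing" and "not yours", so record ids cannot be
    # enumerated by comparing responses.
    if record is None or record.owner_id != session_user_id:
        raise PermissionError("record not available")
    return record.payload
```

In the first function, user 1 asking for record 102 gets the other user’s data back, and nothing at the network layer can tell that request apart from a legitimate one. The check has to live in the application, next to the data access.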

Footnotes

0. I’m already skeptical about the usefulness of studies that aggregate data in this way. N percent of respondents think that y% is the correct amount to spend on security technology categories A, B, C. Who cares? The increasing year-over-year numbers of attacks are the result of the distribution of knowledge during the time surveyed, and in any event these numbers aggregate a huge variety of industries, business histories, risk tolerances, and other tastes and preferences.
1. Threat modeling doesn’t mean technical decomposition to identify possible attacks. That’s attack modeling, though the two are often confused, even in many books and articles. The “threat” is “customer data exposed to unauthorized individuals.” The business “risk” is “Data exposure would lead to headline risk (bad press) and loss of data worth approx $N.” The technical risk is “Application was built using inline SQL queries and is vulnerable to SQL injection” and “Database is encrypted but the application’s credentials let it retrieve cleartext data” and probably a bunch of other things.

How To Front-Run Financial Markets With Your Web Browser

Early access to financial data can mean big profits. Imagine if you could learn the quarterly financial results of a company or macroeconomic statistical data before everyone else. Foreknowledge of the results means being able to take a position before the market moves as a result of everyone else becoming similarly informed. Until now, it took having a white-shoe country-club membership or buddies who hang out at the Eccles building in Washington, DC. What if all it took was a web browser?

At least one financial-sector news organization is now obtaining and publicizing financial reports prior to their official release, using a website content enumeration technique originally pioneered by security testers and attackers. As an artifact of the (bad) design decisions made in the development of many content management systems (CMSs) and of the processes used by the organizations deploying them, a confluence of factors leads to unintended information exposure through enumeration within a limited time window. Later, the information exposure just becomes intended information release.

Mashable recently published a story about a financial reporting and analytics firm, Selerity, using these simple but too-often effective techniques to retrieve and publish Twitter’s earnings report prior to its official release.

Some relevant definitions:

  • Frontrunning is the practice of dealing on advance information provided by insiders before their clients or the public have been given the information. While it’s generally illegal, public information that is being further distributed publicly doesn’t fit the criteria set by the Securities Exchange Act, the Investment Company Act, and related rules. Whether it runs afoul of the CFAA as currently interpreted is a trickier question.
  • Forced browsing is an attack where the aim is to enumerate and access resources that are not referenced by the application, but are still accessible.

Once investors learned the reality of Twitter’s earnings report, its stock dove 18%. The Mashable article reported,

Although the company managed to briefly halt trading, and says that it is investigating the source of its leak, it seems likely that what happened was bad information management on Twitter’s part. On its earnings call with investors, Twitter made sure to put the onus of the blame for the situation on the Nasdaq-owned Shareholder.com, the company it pays to maintain its investor relations page.

Selerity maintains that it didn’t hack into anything; it simply found the information on Twitter’s own public investor relations website.

Predictability is a problem that plagues many systems, in many forms. I discussed some in the FuzzDB docs:

Software standardization means that predictable resource locations are the norm. Platforms like IIS, Cold Fusion, and Apache Tomcat store files that are known to leak information about system configuration in predictable places. Because of the popularity of a small number of package managers, log, configuration, and password files for popular software platforms are likely to be stored in a small number of places. Lists of platform-categorized web scripts that have been mentioned in a vulnerability database, lists of login page names from popular applications, all known compressed file type extensions, and countless other data elements can be leveraged to turn “brute force” into a highly targeted discovery tool.

It’s just as commonly a problem for the file or other resource names and HTTP request data formats chosen for website resources. In a nutshell, Shareholder.com has a CMS that loads content in advance but doesn’t turn on the link to it until the appropriate time. Since, as the Mashable article explained, the information request format was predictable, all it took to find the earnings report sooner was making the request after the data had been loaded but before the link was published.
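A rough illustration of why a predictable format is so weak: once the naming convention is known, the whole space of possible resource names can be written down in a few lines. The pattern below is entirely hypothetical (the actual request format is not reproduced here), and the sketch only builds strings; it makes no requests.

```python
from itertools import product

# Hypothetical naming convention, invented for illustration only.
PATTERN = "https://investor.example.com/files/{year}-Q{quarter}-earnings.{ext}"


def candidate_urls(years, quarters=(1, 2, 3, 4), extensions=("pdf", "htm")):
    """Return every URL the naming convention could produce."""
    return [
        PATTERN.format(year=y, quarter=q, ext=e)
        for y, q, e in product(years, quarters, extensions)
    ]


urls = candidate_urls(years=[2015])
print(len(urls))  # 8 candidates cover the entire year
print(urls[0])    # https://investor.example.com/files/2015-Q1-earnings.pdf
```

With a search space that small, hiding the link does nothing; the only question is whether the content behind the name exists yet.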

I have to admit, I’ve thought about doing this before[0] with a job scheduler, some shell scripts wrapping wget, and monitoring the results for HTTP 200 success responses. Which leads me to conclude that if Selerity is doing this and publishing their results, there must be 100 or 1000 others doing it and keeping their results to themselves.

How do you defend against this?
Anti-automation controls are relatively easy to bypass, and single-use hashes that are returned by a script can be gathered with automation and replayed into a subsequent request.
The only solution is to apply the control as close to the data as possible: don’t load the data until the embargo time is up. If it’s not there, it can’t be retrieved.
But first, threat model your applications. Think about the data you’re seeking to protect, understand who it might have value to and why, and, based on the risk you’re willing to accept and how the tradeoffs align, design protections like the one described above into the application or business process. Doing this well takes knowledge of how systems are attacked. Without that knowledge, it’s too easy to design ineffective controls.
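As one way to picture “don’t load the data until the embargo time is up,” here is a minimal sketch that keeps embargoed documents in a staging area the web-facing code never reads. The class, its method names, and the release date are all assumptions made up for this example; a real CMS would put the staging area on separate storage and run the release step from a scheduler.

```python
from datetime import datetime, timedelta, timezone


class EmbargoedPublisher:
    """Keeps embargoed documents out of the public store until release."""

    def __init__(self):
        self._staged = {}  # path -> (release_at, body); never read by fetch()
        self._public = {}  # path -> body; the only store fetch() reads

    def stage(self, path, body, release_at):
        self._staged[path] = (release_at, body)

    def publish_due(self, now):
        """Move documents whose embargo has passed into the public store."""
        due = [p for p, (release_at, _) in self._staged.items() if release_at <= now]
        for path in due:
            _, body = self._staged.pop(path)
            self._public[path] = body
        return due

    def fetch(self, path):
        """What the web server would call; None stands in for a 404."""
        return self._public.get(path)


# Arbitrary release time, chosen only for the example.
release = datetime(2030, 1, 1, 21, 0, tzinfo=timezone.utc)
cms = EmbargoedPublisher()
cms.stage("/files/earnings.pdf", b"report", release)

cms.publish_due(release - timedelta(minutes=1))
print(cms.fetch("/files/earnings.pdf"))  # None: guessing the name finds nothing

cms.publish_due(release)
print(cms.fetch("/files/earnings.pdf"))  # b'report'
```

The design choice is that the request path has no code that can reach staged content, so a correct guess at the name before release returns the same response as any other missing file.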

0. Thought about it, but ultimately decided against it due to the interpretation of and open questions about the CFAA in the case of Andrew “Weev” Auernheimer, particularly the courts’ conflation of brute forcing open-access data with unauthorized access. The upside is good, but the risk of being another CFAA test case is not worth it.