Clearly, web browsers, and their associated document formats and communication protocols, evolved in an unusual manner.
This evolution may explain the high number of security problems we see, but by itself it hardly proves that these problems are unique or noteworthy.
Let's take a quick look at the very special characteristics behind the most prevalent types of online security threats and explore why these threats had no particularly good equivalents in the years before the Web.
The User as a Security Flaw
Perhaps the most striking (and entirely nontechnical) property of web browsers is that the people who use them are overwhelmingly unskilled. Sure, nonproficient users have been an amusing, fringe problem since the dawn of computing. But the popularity of the Web, combined with its remarkably low barrier to entry, means we are facing a new foe: Most users simply don't know enough to stay safe.
For a long time, engineers working on general-purpose software have made seemingly arbitrary assumptions about the minimal level of computer proficiency required of their users. Most of these assumptions have been without serious consequences; the incorrect use of a text editor, for instance, would typically have little or no impact on system security. Incompetent users simply would not be able to get their work done, a wonderfully self-correcting issue.
Web browsers do not work this way, however. Unlike certain complicated software, they can be successfully used by people with virtually no computer training, people who may not even know how to use a text editor. But at the same time, browsers can be operated safely only by people with a pretty good understanding of computer technology and its associated jargon, including topics such as Public-Key Infrastructure. Needless to say, this prerequisite is not met by most users of some of today's most successful web applications.
Browsers still look and feel as if they were designed by geeks and for geeks, complete with occasional cryptic and inconsistent error messages, complex configuration settings, and a puzzling variety of security warnings and prompts.
A notable study by Berkeley and Harvard researchers in 2006 demonstrated that casual users are almost universally oblivious to signals that surely make perfect sense to a developer, such as the presence or absence of lock icons in the status bar.
In another study, Stanford and Microsoft researchers reached similar conclusions when they examined the impact of the modern "green URL bar" security indicator. The mechanism, designed to offer a more intuitive alternative to lock icons, actually made it easier to trick users by teaching the audience to trust a particular shade of green, no matter where this colour appeared.
Some experts argue that the ineptitude of the casual user is not the fault of software vendors and hence not an engineering problem at all. Others note that when creating software so easily accessible and so widely distributed, it is irresponsible to force users to make security-critical decisions that depend on technical prowess not required to operate the program in the first place.
To blame browser vendors alone is just as unfair, however: The computing industry as a whole has no robust answers in this area, and very little research is available on how to design comparably complex user interfaces (UIs) in a bulletproof way. After all, we barely get it right for ATMs.
The Cloud, or the Joys of Communal Living
Another peculiar characteristic of the Web is the dramatically understated separation between unrelated applications and the data they process.
In the traditional model followed by virtually all personal computers over the last 15 years or so, there are very clear boundaries between high-level data objects (documents), user-level code (applications), and the operating system kernel that arbitrates all cross-application communications and hardware input/output (I/O) and enforces configurable security rules should an application go rogue. These boundaries are well studied and useful for building practical security schemes. A file opened in your text editor is unlikely to be able to steal your email, unless a really unfortunate conjunction of implementation flaws subverts all these layers of separation at once.
The Web offers no comparably clear boundaries. In a sense, its model is reminiscent of CP/M, DOS, and other principally nonmultitasking operating systems with no robust memory protection, CPU preemption, or multiuser features. The obvious difference is that few users depended on these early operating systems to simultaneously run multiple untrusted, attacker-supplied applications, so there was no particular reason for alarm.
In the end, the seemingly unlikely scenario of a text file stealing your email is, in fact, a frustratingly common pattern on the Web. Virtually all web applications must heavily compensate for unsolicited, malicious cross-domain access and take cumbersome steps to maintain at least some separation of code and the displayed data. And sooner or later, virtually all web applications fail. Content-related security issues, such as cross-site scripting or cross-site request forgery, are extremely common and have very few counterparts in dedicated, compartmentalised client architectures.
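To make the blurred line between code and displayed data concrete, here is a minimal Python sketch of how a cross-site scripting flaw can arise and how output escaping restores some separation. The function names and the payload are hypothetical illustrations, not taken from any particular application, and real applications need context-aware escaping that this sketch does not attempt.

```python
import html

def render_comment_unsafe(comment: str) -> str:
    # Hypothetical helper: user-supplied text is concatenated straight into
    # markup, so the browser cannot tell it apart from the page's own code.
    return "<p>" + comment + "</p>"

def render_comment_escaped(comment: str) -> str:
    # Hypothetical helper: html.escape turns markup characters into inert
    # entities, so the text is displayed rather than interpreted.
    return "<p>" + html.escape(comment) + "</p>"

# Attacker-supplied "data" that is really code (illustrative payload).
payload = "<script>steal()</script>"

print(render_comment_unsafe(payload))
print(render_comment_escaped(payload))
```

In the first rendering the script tag survives intact and would run with the privileges of the hosting page; in the second it arrives as harmless text.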
Nonconvergence of Visions
Fortunately, the browser security landscape is not entirely hopeless, and despite limited separation between web applications, several selective security mechanisms offer rudimentary protection against the most obvious attacks.
But this brings us to another characteristic that makes the Web such an interesting subject: There is no shared, holistic security model to grasp and live by. We are not looking for a grand vision for world peace, mind you, but simply a common set of flexible paradigms that would apply to most, if not all, of the relevant security logic. In the Unix world, for example, the rwx user/group permission model is one such strong unifying theme. But in the browser realm?
In the browser realm, a mechanism called same-origin policy could be considered a candidate for a core security paradigm, but only until one realises that it governs a woefully small subset of cross-domain interactions. That detail aside, even within its scope, it has no fewer than seven distinct varieties, each of which places security boundaries between applications in a slightly different place. Several dozen additional mechanisms, with no relation to the same-origin model, control other key aspects of browser behaviour (essentially implementing what each author considered to be the best approach to security controls that day).
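As a rough illustration of just one of those varieties, the following Python sketch compares two URLs by scheme, host, and port, the comparison commonly described for script access to documents. It is a simplification under stated assumptions: the helper names and the default-port table are hypothetical, and other variants of the policy draw the boundary differently.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two schemes this sketch handles.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> tuple:
    # Hypothetical helper: reduce a URL to a (scheme, host, port) triple.
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)

def same_origin(a: str, b: str) -> bool:
    # Two URLs share an origin only if all three components match.
    return origin(a) == origin(b)

print(same_origin("https://example.com/a", "https://example.com:443/b"))
print(same_origin("http://example.com/", "https://example.com/"))
print(same_origin("https://example.com/", "https://mail.example.com/"))
```

Under this comparison only the first pair matches: a change of scheme or a different hostname is enough to place two pages in separate origins, while paths are ignored entirely.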
As it turns out, hundreds of small, clever hacks do not necessarily add up to a competent security opus. The unusual lack of integrity makes it very difficult even to decide where a single application ends and a different one begins. Given this reality, how does one assess attack surfaces, grant or take away permissions, or accomplish just about any other security-minded task?
Too often, "by keeping your fingers crossed" is the best response we can give.
Curiously, many well-intentioned attempts to improve security by defining new security controls only make the problem worse. Many of these schemes create new security boundaries that, for the sake of elegance, do not perfectly align with the hairy juxtaposition of the existing ones.
When the new controls are finer grained, they are likely to be rendered ineffective by the legacy mechanisms, offering a false sense of security; when they are more coarse grained, they may eliminate some of the subtle assurances that the Web depends on right now.
This excerpt is taken from The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski. The book is available in print (RRP US$49.95) and e-book (RRP $39.95) formats from Amazon. It is published by No Starch Press.