Permanent Record



My official job title was systems analyst, with responsibility for maintaining the local NSA systems, though much of my initial work was that of a systems administrator, helping to connect the NSA’s systems architecture with the CIA’s. Because I was the only one in the region who knew the CIA’s architecture, I’d also travel out to US embassies, like the one I’d left in Geneva, establishing and maintaining the links that enabled the agencies to share intelligence in ways that hadn’t previously been possible. This was the first time in my life that I truly realized the power of being the only one in a room with a sense not just of how one system functioned internally, but of how it functioned together with multiple systems—or didn’t. Later, as the chiefs of the PTC came to recognize that I had a knack for hacking together solutions to their problems, I was given enough of a leash to propose projects of my own.

Two things about the NSA stunned me right off the bat: how technologically sophisticated it was compared with the CIA, and how much less vigilant it was about security in its every iteration, from the compartmentalization of information to data encryption. In Geneva, we’d had to haul the hard drives out of the computer every night and lock them up in a safe—and what’s more, those drives were encrypted. The NSA, by contrast, hardly bothered to encrypt anything.

In fact, it was rather disconcerting to find out that the NSA was so far ahead of the game in terms of cyberintelligence yet so far behind it in terms of cybersecurity, including the most basic: disaster recovery, or backup. Each of the NSA’s spoke sites collected its own intel, stored the intel on its own local servers, and, because of bandwidth restrictions—limitations on the amount of data that could be transmitted at speed—often didn’t send copies back to the main servers at NSA headquarters. This meant that if any data were destroyed at a particular site, the intelligence that the agency had worked hard to collect could be lost.

My chiefs at the PTC understood the risks the agency was taking by not keeping copies of many of its files, so they tasked me with engineering a solution and pitching it to the decision makers at headquarters. The result was a backup and storage system that would act as a shadow NSA: a complete, automated, and constantly updating copy of all of the agency’s most important material, which would allow the agency to reboot and be up and running again, with all its archives intact, even if Fort Meade were reduced to smoldering rubble.

The major problem with creating a global disaster-recovery system—or really with creating any type of backup system that involves a truly staggering number of computers—is dealing with duplicated data. In plain terms, you have to handle situations in which, say, one thousand computers all have copies of the same single file: you have to make sure you’re not backing up that same file one thousand times, because that would require one thousand times the amount of bandwidth and storage space. It was this wasteful duplication, in particular, that was preventing the agency’s spoke sites from transmitting daily backups of their records to Fort Meade: the connection would be clogged with a thousand copies of the same file containing the same intercepted phone call, 999 of which the agency did not need.

The way to avoid this was “deduplication”: a method to evaluate the uniqueness of data. The system that I designed would constantly scan the files at every facility at which the NSA stored records, testing each “block” of data down to the slightest fragment of a file to find out whether or not it was unique. Only if the agency lacked a copy of it back home would the data be automatically queued for transmission—reducing the volume that flowed over the agency’s transpacific fiber-optic connection from a waterfall to a trickle.
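In rough code, the uniqueness test described here can be sketched as block-level hashing. The following is a minimal illustration in Python, not the agency's actual design: the fixed block size, the SHA-256 digest, and the in-memory set of seen digests are all assumptions made for the example.

    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024  # assumed fixed block size (4 MiB), for illustration only

    def unique_blocks(data: bytes, seen: set):
        """Yield only the blocks of `data` whose digests have not been seen before."""
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in seen:          # only data the archive lacks would be queued
                seen.add(digest)
                yield digest, block

    # A thousand spoke sites holding the same intercept contribute its data only once.
    seen = set()
    queue = []
    same_intercept = b"intercepted call audio" * 1000   # stand-in for a duplicated file
    for site in range(1000):
        queue.extend(unique_blocks(same_intercept, seen))
    print(len(queue))   # 1: one unique block queued instead of a thousand copies

Real deduplicating backup systems add refinements such as content-defined chunking, persistent indexes, and compression, but the principle is the one described above: only data the archive has never seen gets queued for transmission.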

The combination of deduplication and constant improvements in storage technology allowed the agency to store intelligence data for progressively longer periods of time. Just over the course of my career, the agency’s goal went from being able to store intelligence for days, to weeks, to months, to five years or more after its collection. By the time of this book’s publication, the agency might already be able to store it for decades. The NSA’s conventional wisdom was that there was no point in collecting anything unless they could store it until it was useful, and there was no way to predict when exactly that would be. This rationalization was fuel for the agency’s ultimate dream, which is permanency—to store all of the files it has ever collected or produced in perpetuity, and so create a perfect memory. The permanent record.

The NSA has a whole protocol you’re supposed to follow when you give a program a code name. It’s basically an I Ching–like stochastic procedure that randomly picks words from two columns. An internal website throws imaginary dice to pick one name from column A, and throws again to pick one name from column B. This is how you end up with names that don’t mean anything, like FOXACID and EGOTISTICALGIRAFFE. The point of a code name is that it’s not supposed to refer to what the program does. (As has been reported, FOXACID was the code name for NSA servers that host malware versions of familiar websites; EGOTISTICALGIRAFFE was an NSA program intended to exploit a vulnerability in certain Web browsers running Tor, since they couldn’t break Tor itself.) But agents at the NSA were so confident of their power and the agency’s absolute invulnerability that they rarely complied with the regulations. In short, they’d cheat and redo their dice throws until they got the name combination they wanted, whatever they thought was cool: TRAFFICTHIEF, the VPN Attack Orchestrator.
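Stripped of the mystique, the two-column dice throw that the protocol describes is just random selection. A toy sketch in Python, with invented word lists standing in for the agency's columns (the real ones are not public):

    import random

    # Invented stand-ins for column A and column B; the actual word lists are not public.
    COLUMN_A = ["FOX", "EGOTISTICAL", "TRAFFIC", "VICTORY"]
    COLUMN_B = ["ACID", "GIRAFFE", "THIEF", "DANCE"]

    def throw_dice() -> str:
        """Pick one word from each column; the result is meant to mean nothing."""
        return random.choice(COLUMN_A) + random.choice(COLUMN_B)

    print(throw_dice())   # e.g. TRAFFICGIRAFFE

Cheating, in these terms, is simply calling throw_dice() again until the output sounds cool.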
