tl;dr There was a vulnerability in CouchDB caused by a discrepancy between the database’s native JSON parser and the Javascript JSON parser used during document validation. Because CouchDB databases are meant to be exposed directly to the internet, this enabled privilege escalation, and ultimately remote code execution, on a large number of installations. If it had been exploited, this bug could have allowed for the modification of arbitrary packages in the npm registry. [edit: I’m wrong, and the main npm registry is unaffected. See correction below. My bad!] CVE-2017-12635

Background

Last time, I wrote about a deserialization bug leading to code execution on rubygems.org, a repository of dependencies for ruby programs. The ability to inject malware into upstream project dependencies is a scary attack vector, and one from which I doubt most organizations are adequately protected.

With this in mind, I started searching for bugs in registry.npmjs.org, the server responsible for distributing npm packages. According to their homepage, the npm registry serves more than 3 billion (!) package downloads per week.

CouchDB

The npm registry uses CouchDB, which I hadn’t heard of before this project. The basic idea is that it’s a “NoSQL” database that makes data replication very easy. It’s sort of like a big key-value store for JSON blobs (“documents”), with features for data validation, querying, and user authentication, making it closer to a full-fledged database. CouchDB is written in Erlang, but allows users to specify document validation scripts in Javascript. These scripts are automatically evaluated when a document is created or updated. They start in a new process, and are passed JSON-serialized documents from the Erlang side.

CouchDB manages user accounts through a special database called _users. When you create or modify a user in a CouchDB database (usually by doing a PUT to /_users/org.couchdb.user:your_username), the server checks your proposed change with a Javascript validate_doc_update function to ensure that you’re not, for example, attempting to make yourself an administrator.

Vulnerability

The problem is that there is a discrepancy between the Javascript JSON parser (used in validation scripts) and the one used internally by CouchDB, called jiffy. Check out how each one deals with duplicate keys on an object like {"foo":"bar", "foo":"baz"}:

Erlang:

> jiffy:decode("{\"foo\":\"bar\", \"foo\":\"baz\"}"). 
{[{<<"foo">>,<<"bar">>},{<<"foo">>,<<"baz">>}]}

Javascript:

> JSON.parse("{\"foo\":\"bar\", \"foo\": \"baz\"}")
{foo: "baz"}

For a given key, the Erlang parser will store both values, but the Javascript parser will only store the last one. Unfortunately, the getter function for CouchDB’s internal representation of the data will only return the first value:

% Within couch_util:get_value 
lists:keysearch(Key, 1, List).
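
The two behaviours above can be reproduced side by side in a single language. The following is a minimal Python sketch, not CouchDB code: Python's default `json.loads` happens to keep the last value for a repeated key (like `JSON.parse`), and the `first_wins` helper is a hypothetical stand-in for a keysearch-style getter that returns the first match.

```python
import json

RAW = '{"foo": "bar", "foo": "baz"}'


def first_wins(pairs):
    """Build a dict in which the first occurrence of each key is kept.

    Hypothetical helper that imitates a getter returning the first match.
    """
    result = {}
    for key, value in pairs:
        result.setdefault(key, value)  # ignore later duplicates
    return result


# object_pairs_hook receives every key/value pair in document order,
# duplicates included, so nothing is lost at this stage.
all_pairs = json.loads(RAW, object_pairs_hook=list)

# Default decoding: the last value for a repeated key wins.
last_wins_view = json.loads(RAW)

# Same input, but read the way a first-match lookup would see it.
first_wins_view = json.loads(RAW, object_pairs_hook=first_wins)

print(all_pairs)        # [('foo', 'bar'), ('foo', 'baz')]
print(last_wins_view)   # {'foo': 'baz'}
print(first_wins_view)  # {'foo': 'bar'}
```

Same bytes, two different answers to "what is the value of foo?", depending only on which reader you ask.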

And so, we can bypass all of the relevant input validation and create an admin user thusly:

curl -X PUT 'http://localhost:5984/_users/org.couchdb.user:oops' \
--data-binary '{
  "type": "user",
  "name": "oops",
  "roles": ["_admin"],
  "roles": [],
  "password": "password"
}'

In Erlang land, we’ll see ourselves as having the _admin role, while in Javascript land we appear to have no special permissions. Fortunately for the attacker, almost all of the important logic concerning authentication and authorization, aside from the input validation script, occurs in the Erlang part of CouchDB.
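
To make the split view concrete, here is a simplified Python model of the bypass. It is a sketch under stated assumptions, not a reproduction of CouchDB's logic: `validator_allows` and `is_admin` are hypothetical stand-ins for the validation script and the server-side role check, and `first_wins` imitates a getter that returns the first match for a key.

```python
import json

PAYLOAD = """{
  "type": "user",
  "name": "oops",
  "roles": ["_admin"],
  "roles": [],
  "password": "password"
}"""


def first_wins(pairs):
    """Keep the first occurrence of each key (hypothetical helper)."""
    doc = {}
    for key, value in pairs:
        doc.setdefault(key, value)
    return doc


def validator_allows(doc):
    """Hypothetical validation rule: new users may not assign themselves roles."""
    return len(doc.get("roles", [])) == 0


def is_admin(doc):
    """Hypothetical authorization check performed by the other component."""
    return "_admin" in doc.get("roles", [])


# The validator reads the document with last-value-wins semantics.
validator_view = json.loads(PAYLOAD)

# The component that later enforces permissions sees the first value.
server_view = json.loads(PAYLOAD, object_pairs_hook=first_wins)

print(validator_allows(validator_view))  # True
print(is_admin(server_view))             # True
```

Both checks pass on the same document, which is the whole bug: the validator approves a user with no roles, and the enforcement side then treats that user as an administrator.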

Now that we have an administrator account, we have complete control of the database. Getting a shell from here is usually easy since CouchDB lets you define custom query_server languages through the admin interface, a feature which is basically just a wrapper around execv. One funny feature of this exploit is that it’s slightly tricky to detect through the web GUI; if you try to examine the user we just created through the admin console, the roles field will show up empty since it’s parsed in Javascript before being displayed!

Impact on npm

I’ve been trying to figure out exactly how npm was affected by this bug. Since I didn’t actually exploit the vulnerability against any of npm’s production servers, I have to make educated guesses about which parts of the infrastructure were vulnerable to which parts of the attack, based on publicly available information.

I am almost certain that registry.npmjs.org was vulnerable to the privilege escalation/admin account creation part of this attack, which would have allowed an attacker to modify packages. This is because user creation on npm is more or less identical to the vanilla CouchDB user creation flow. Then, after authenticating as our newly created admin user, the user context passed to subsequent validation scripts will have the _admin role visible, allowing us to pass the isAdmin check in one of the registry’s validation docs. That said, as far as I can tell from what’s on GitHub, their production server doesn’t provide a route to the administrator’s configuration API, meaning I’m not sure if the bug could have enabled RCE on that server. [edit: It turns out that registry.npmjs.org simply exposes an identical API to the CouchDB user creation flow in order to maintain backwards compatibility with old clients. It has been using a custom authentication system since early 2015, and is therefore not vulnerable to my attack. The skim database mentioned below was affected by the bug, however. I apologize for being completely wrong in the initial version of this blog post!]

Npm also exposes a “skim database” which does look like it would have been vulnerable to the RCE part of the attack, but it’s unclear to me how that database is used in the infrastructure today. There’s a blog post from 2014 which indicates that all writes go to the skimdb, but I don’t know if this is still true.

Conclusion

It’s probably a bad idea to use more than one parser to process the same data. If you have to, perhaps because your project uses multiple languages as CouchDB does, do your best to ensure that there aren’t any functional differences between the parsers, like there were here. It’s unfortunate that the JSON standard does not specify how duplicate keys should be handled: RFC 7159 only says that names within an object SHOULD be unique, and that behavior is unpredictable when they are not.
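
One defensive option is to refuse ambiguous input at the boundary, so that no two readers can ever disagree about it. The following Python sketch illustrates the idea; it is an assumption about what a strict front-end check could look like, not a description of how CouchDB was actually patched, and `reject_duplicates` and `strict_loads` are hypothetical names.

```python
import json


def reject_duplicates(pairs):
    """Raise if any key appears more than once in a single JSON object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        seen[key] = value
    return seen


def strict_loads(text):
    """Parse JSON, rejecting objects (at any nesting depth) with repeated keys."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


print(strict_loads('{"name": "oops", "roles": []}'))
# {'name': 'oops', 'roles': []}

try:
    strict_loads('{"roles": ["_admin"], "roles": []}')
except ValueError as exc:
    print("rejected:", exc)
# rejected: duplicate key in JSON object: 'roles'
```

Rejecting the document outright is stricter than silently picking one value, but it removes the question of which value wins.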

Thanks to the CouchDB team for having a published security@ email address and working quickly to get this fixed.

Shameless plug

If you’re interested in ditching #birdsite and want to use a social network that actually respects your freedoms, you should consider joining Mastodon! It’s a federated social network, meaning that it works in a distributed way sort of like email. Join us over in the fediverse and help us build a friendly security community!