A chap who found two serious security bugs in Git servers and clients has urged people to patch their software.
The flaws are present in Git including the 2.x, 1.9 and 1.7 branches, meaning the vulnerabilities have been lurking in the open-source version control tool for years.
It is possible these two programming blunders can be potentially exploited to corrupt memory or execute malicious code on remote servers and clients. To do so, an attacker would have to craft a Git repository with a tree of files that have extremely long filenames, and then push the repo to a vulnerable server or let a vulnerable client clone it from the internet. One of the bugs was squashed in version 2.7.1, which was quietly released last month. The other bug – arguably the most serious of the pair – is fixed in version 2.8, although that hasn't been officially released yet.
While the security patches are available and have been discussed on public mailing lists, no alarm has been raised. Linux distributions, users and sysadmins have thus been slow to update their software, seemingly unaware of the problem. Laël Cellier, who found the bugs, believes the vulnerabilities can be exploited, and spent Tuesday trying to draw people's attention to the fixes. He started poking around Git's handling of repositories at the end of October last year, and homed in on file paths.
He found two bugs in a function called path_name(), which is used to append a filename to the end of a path in a repo tree. The path and filename are attacker-controlled. The function works by allocating memory for the full NULL-terminated string, copying the filename to the end of the string and then copying in the path, element by element, working backwards from the filename.
Here's what the code, in revision.c, looked liked in Git pre-2.7.0. See if you can spot the mistakes:
There is a lot of danger here: the string length variables nlen and len are signed integers rather than unsigned: they can be positive and negative, and are in this case susceptible to integer overflow. If strlen() is passed a very large filename and returns a very large number, nlen will overflow and be negative rather than positive. That'll make len negative too, at least to begin with, and xmalloc() could end up being called to allocate too few bytes for the final combined string.
Next, the code strcpy(), which is extremely risky as it invites buffer overflows. It will blindly copy from the large attacker-controlled filename over the small amount of heap memory allocated until it hits a NULL byte, causing a heap overwrite, and thus allows the attacker to manipulate the program's data structures. Ding, CVE-2016-2315.
The fix, applied in Git 2.7.0, replaces the strcpy() call with the safer memcpy():
memcpy(m, name, nlen + 1);
That limits the extent of the copy operation to the size of the buffer allocated for the filename. It won't copy in more data than the space allocated, thus preventing heap overwrite.
But that's not the end of the story. The first loop in path_name() still uses a signed integer to add up the length of the path. By creating a long path out of lots of subdirectories with very long filenames, you'll eventually wrap len around to a small number, allocate a small space, and then write a larger amount of data from the filename over the space, again overwriting the heap.
If you try to build the full path X/Y/Z, with X = 231-5 bytes long, Y = 231-5 bytes, and, the filename, Z = 20, len will work out to 10, just 10 bytes will be allocated, and memcpy() will copy Z's 20 bytes into the space, and overwrite the heap. Ding, CVE-2016-2324.
It is absolutely possible to store non-ASCII data – such as a malicious code payload – in paths. And you don't have to worry about the huge filenames being too large and unwieldy to transfer from server to client or client to server. Git squashes down data stored in repos using zlib compression, including filenames which are expanded when needed.
It is theoretically possible to create a Git tree of a few hundred kilobytes or megabytes that can grow substantially to attack the code as described above. A hacker would have to defeat various security defenses, particularly ASLR, to gain reliable code execution; this is tricky and audacious, but not impossible.
The expanded filenames would also have to fit in memory, so if the Git server or client is running on a machine with limited RAM, the attack may crash the software. If it turns out to be too difficult to exploit, then given the above unsafe code, Git dodged a bullet. So, what's the fix for CVE-2016-2324? Answer: tear out path_name() and replace it with something much more secure. That was applied to Git 2.8, which is awaiting release.
Cellier reported the flaws to GitHub last year, and was awarded 5,000 points – an amount reserved for the most serious flaws – in its bug bounty scheme at the end of November. As a vulnerability hunter rather than an exploit writer, he wasn't able to provide a working proof-of-concept exploit to prove remote-code execution is possible. Thus, GitHub wrote up his discovery as...
...a heap-based memory corruption bug in Git that exploited an unsigned to signed integer conversion. An attacker could have exploited this flaw by pushing a malicious repository to GitHub to perform a denial of service or possibly read/write to unexpected memory locations. We addressed the bug by updating Git to use unsigned integers consistently. We also added validation logic to Git that looks for potentially malicious repository contents (ex. excessively long path lengths).
Cellier created a repo on GitHub called "longpath" to test his work: the main page for the repository triggers a HTTP 500 error, although you can read his notes in the repo wiki. GitHub addressed the reported security vulnerabilities and rolled out the changes to GitHub Enterprise users and its hosted service. GitHub staffer Jeff King fixed and shored up the mainline Git source to incorporate the changes and ironed out similar security problems. Version 2.7.1 of Git was modestly rolled out in February; version 2.8 is waiting patiently in the queue.
Then everything went quiet. "I’m not aware of any Linux distribution that issued a fix for their stable branch," Cellier complained on Tuesday. He described the vulnerabilities he found as "widespread," and said the issue needed "advertisement." Indeed: at time of writing, for example, all supported releases of Debian GNU/Linux are vulnerable to CVE-2016-2324. All but Debian Testing and Debian Unstable are vulnerable to CVE-2016-2315. Overnight, in response to Cellier's efforts to highlight the flaws, GitHub rival GitLab updated its software to enforce the use of Git 2.7.3, which also includes another security fix in its handling of pack files.
"Today we are releasing version 8.5.7 for GitLab Community Edition (CE) and Enterprise Edition (EE)," GitLab announced. "This version raises the minimum required Git version to 2.7.3 to address the recent remote code execution vulnerability." If you are using Git pre-2.8 you have two options: ride it out and assume no one is going to exploit this right away – your distro will issue a patch via the official channels soon enough; or install the latest version from the source.
It was initially believed that version 2.7.1 fixed both CVE-2016-2315 (the unsafe strcpy) and CVE-2016-2324 (the signed len addition loop). However, the fix for CVE-2016-2324, which replaces path_name() completely and kills the whole vulnerability dead, has been applied to version 2.8, which is not formally available yet. You can build it and install it from source, however.
Axarhöfði 14,
110 Reykjavik, Iceland