Key Takeaways
Why Go is Particularly Vulnerable to Repojacking
The Go module ecosystem is unique because it’s decentralized. Other packaging systems like PyPI or npm require developers to create accounts to upload their packages. This gives the package platform the ability to moderate users and content. That isn’t the case with Go. Go developers publish modules by pushing their code to source control platforms like GitHub. Anyone can then instruct the Go module mirror and pkg.go.dev to cache the module’s details.
This decentralization makes Go module repositories particularly vulnerable to repojacking. A repository becomes vulnerable when the module author changes their username or deletes their account. At that point, an attacker can register the newly unused username, duplicate the module repository, and publish a new module to proxy.golang.org and pkg.go.dev. A detailed step-by-step breakdown of how that is done, and what the resulting pkg.go.dev page looks like, can be found in Appendix A.
GitHub does have some protections against repojacking. Their popular repository namespace retirement feature prevents repojacking of any repository “that had more than 100 clones in the week leading up to the owner’s account being renamed or deleted.” That might sound reasonable, but it offers less protection for Go. Go modules are typically served from the module mirror’s cache, so consumers rarely need to interact with or clone the source repository. For some context, VulnCheck has an open source library hosted on GitHub that we use daily and that has 170+ stars, but that repository has only seen 20 clones in the last week. Based on that, the 100-clone threshold isn’t necessarily as protective as GitHub might think it is.
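To illustrate why clone counts stay low, here is a minimal sketch of the module proxy endpoints a Go client requests instead of cloning from GitHub. The helper names (`escapePath`, `proxyURLs`) are ours, and the case-escaping is a simplified stand-in for what the Go toolchain does; the module and version are the example used in Appendix A.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the proxy protocol's case encoding: every uppercase
// ASCII letter becomes '!' followed by its lowercase form.
// Simplified sketch: it assumes the input is already a valid path or version.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyURLs returns the endpoints for one module version:
// the version list, plus the .info, .mod, and .zip files.
func proxyURLs(proxy, modulePath, version string) []string {
	base := strings.TrimSuffix(proxy, "/") + "/" + escapePath(modulePath) + "/@v/"
	v := escapePath(version)
	return []string{
		base + "list",
		base + v + ".info",
		base + v + ".mod",
		base + v + ".zip",
	}
}

func main() {
	// Module and version taken from the Appendix A walkthrough.
	for _, u := range proxyURLs("https://proxy.golang.org", "github.com/vcresearcher/helloworld", "v1.0.0") {
		fmt.Println(u)
	}
}
```

None of these requests touch github.com once the version is cached, which is why a widely used module can still fall under the 100-clone threshold.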
Hunting for Hijackable Go Modules
In June 2023, Aquasec published research positing millions of hijackable GitHub repositories. Knowing that Go is particularly vulnerable to this attack vector, we set out to enumerate exactly how many Go module-versions might be affected. By module-version, we mean a module plus all its versions. This is important to note because, as detailed in Appendix A, after a successful hijack the attacker’s new module will be listed as an “updated” module for all of the old module-versions.
VulnCheck tracks more than 20 million Go module-versions. It’s not exactly a small dataset, but the algorithm for tracking down the repojackable modules is relatively straightforward:
- For each module, infer the repository URL from the module name. For example, github.com/vulncheck-oss/go-exploit is a Go module whose source is hosted at https://github.com/vulncheck-oss/go-exploit.
- Attempt to connect to each repository.
- An HTTP 301 response indicates a username change, a repository name change, or both. For our purposes, we validated that the repository name was the same (e.g. go-exploit), but the username had changed. We also validated that the original user account (e.g. vulncheck-oss) didn’t exist. That further weeded out things like repository transfers. The repositories that made it through all those steps were considered potentially vulnerable to repojacking.
- An HTTP 404 response indicates the repository no longer exists. We would then check whether the username still existed. If not, the repository is also potentially vulnerable to repojacking.
- An HTTP 200 response indicates the repository is not vulnerable.
- We then validated that the potentially hijackable repository actually had an entry in pkg.go.dev (you can cache pretty much anything with the module mirror, so it’s important to weed out non-Go module stuff).
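The status-code triage above can be sketched roughly as follows. This is an illustration under our own assumptions, not VulnCheck’s actual tooling: the `Verdict` labels and the functions `repoFromModule`, `classify`, `newClient`, and `probe` are hypothetical names, and the follow-up checks (does the original owner still exist, is there a pkg.go.dev entry) are left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// Verdict is a hypothetical label for the outcome of probing one repository.
type Verdict string

const (
	NotVulnerable   Verdict = "not vulnerable"
	NeedsOwnerCheck Verdict = "candidate: check whether the original owner still exists"
	Skip            Verdict = "skip"
)

// repoFromModule infers the GitHub owner and repository from a module path.
// Only github.com paths are handled; subdirectories and /vN suffixes are dropped.
func repoFromModule(modulePath string) (owner, repo string, ok bool) {
	parts := strings.Split(modulePath, "/")
	if len(parts) < 3 || parts[0] != "github.com" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

// classify maps a status code (and the Location header of a 301) to a verdict.
func classify(status int, location, owner, repo string) Verdict {
	switch status {
	case http.StatusOK:
		return NotVulnerable
	case http.StatusNotFound:
		return NeedsOwnerCheck
	case http.StatusMovedPermanently:
		u, err := url.Parse(location)
		if err != nil {
			return Skip
		}
		seg := strings.Split(strings.Trim(u.Path, "/"), "/")
		// Keep only redirects where the repository name is unchanged
		// but the owner differs, i.e. a likely username change.
		if len(seg) >= 2 && strings.EqualFold(seg[1], repo) && !strings.EqualFold(seg[0], owner) {
			return NeedsOwnerCheck
		}
	}
	return Skip
}

// newClient returns a client that reports redirects instead of following them.
func newClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// probe requests the repository page and classifies the response.
func probe(c *http.Client, owner, repo string) (Verdict, error) {
	resp, err := c.Get("https://github.com/" + owner + "/" + repo)
	if err != nil {
		return Skip, err
	}
	defer resp.Body.Close()
	return classify(resp.StatusCode, resp.Header.Get("Location"), owner, repo), nil
}

func main() {
	owner, repo, _ := repoFromModule("github.com/vulncheck-oss/go-exploit")
	// Offline demonstration with a made-up redirect target; call
	// probe(newClient(), owner, repo) to query GitHub for real.
	fmt.Println(classify(http.StatusMovedPermanently, "https://github.com/some-new-name/go-exploit", owner, repo))
}
```

Keeping `classify` free of network calls makes the decision logic easy to test on its own; a real scan over millions of module-versions would also need rate limiting and retries.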
What you have left over is a lot of repositories that could be repojacked (assuming the 100 clones in the last seven days issue isn’t a problem). Our first finding is that more than 9,000 Go module GitHub repositories are vulnerable to repojacking due to a username change. The potential repojacking affects more than 500,000 Go module-versions.
To determine how popular the repositories are, we grabbed each repository’s number of GitHub stars. We’ve graphed the results in buckets from 0 to 1000.
Hijackable Go Module Repositories Grouped by Stars
The majority of the repositories have zero stars. Those are of little to no value to an attacker. The remaining 3,000 repositories have between 1 and 1000 stars. The likelihood of a repository being valuable to an attacker increases with stars, but stars alone can’t say how useful an attack might be. The actual usage within the Go ecosystem matters, because exploitation relies on a developer updating to the new module.
The other category we tracked is repositories that are hijackable because the GitHub account had been deleted. We identified more than 6,000 of this type of repository, ultimately affecting nearly 300,000 module-versions. Because the repositories have been deleted, we can’t find how many stars they have. An attacker would have to rely on finding usage patterns - ultimately searching for imports of the code.
That’s a heavy lift, but worthwhile for the right attacker. Still, even finding use in the wild isn’t enough. Ultimately, the victim will need to update to the attacker’s new module, because the attacker cannot overwrite old modules. So, while the threat is real, repojacking within the Go ecosystem is not an immediate win for the attacker.
Conclusion
Unfortunately, mitigating all of these repojackings is something that either Go or GitHub will have to take on. A third party can’t reasonably register 15,000 GitHub accounts. Until then, it’s important for Go developers to be aware of the modules they use, and the state of the repository that the modules originated from.
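As a starting point for that kind of awareness, the sketch below lists the GitHub accounts a project depends on by scanning its go.mod. It is a simplified, hypothetical helper (`githubOwners` is our name): it only understands plain `require` lines and `require ( ... )` blocks, ignores `replace` directives, and a production tool would more likely use a real go.mod parser.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// githubOwners scans go.mod text and groups required modules by GitHub owner.
func githubOwners(goMod string) map[string][]string {
	owners := map[string][]string{}
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i])
		}
		switch {
		case line == "require (":
			inBlock = true
			continue
		case inBlock && line == ")":
			inBlock = false
			continue
		case strings.HasPrefix(line, "require "):
			line = strings.TrimSpace(strings.TrimPrefix(line, "require "))
		case !inBlock:
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		parts := strings.Split(fields[0], "/")
		if len(parts) >= 3 && parts[0] == "github.com" {
			owners[parts[1]] = append(owners[parts[1]], fields[0]+" "+fields[1])
		}
	}
	return owners
}

func main() {
	// go.mod contents from the Appendix A example.
	goMod := "module github.com/vcresearcher/helloworld-impl\n" +
		"go 1.21.0\n" +
		"require github.com/vcresearcher/helloworld v1.0.0\n"
	owners := githubOwners(goMod)
	names := make([]string, 0, len(owners))
	for name := range owners {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name, owners[name])
	}
}
```

The resulting owner list can then be checked by hand, or fed into a status-code probe, to spot accounts that have been renamed or deleted.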
Appendix A: Hijacking a Go Module on GitHub
Once published, a Go module is available via proxy.golang.org. This ensures that modules can’t be deleted. Disappearing modules would break downstream software, which is something no one wants. However, proxy.golang.org is not a centralized repository like PyPI, RubyGems, or npm. It’s simply a proxy and cache. When the source of the Go module is deleted or moved, proxy.golang.org is unaware (or doesn’t care). The result is that anyone can hijack those modules. Let’s walk through the steps of how that works.
Step 1: Dev-A creates a GitHub repository and makes a Go module.
In the example below, Dev-A created https://github.com/vcresearcher/helloworld. They added code to create the Helloworld module, and tagged it version 1.0.0.
The module's hello.go file contains one function for other developers to use:
package helloworld
import "fmt"
func Hello() {
	fmt.Println("hi.")
}
Step 2: Dev-B imports the HelloWorld module
Another developer, Dev-B, sees the super cool Helloworld module and decides to use it in their new project.
package main
import "github.com/vcresearcher/helloworld"
func main() {
	helloworld.Hello()
}
After running go mod init and go mod tidy, Dev-B's go.mod looks like so:
module github.com/vcresearcher/helloworld-impl
go 1.21.0
require github.com/vcresearcher/helloworld v1.0.0
Dev-B's project dependency listing (go list -m -u all) now looks like:
github.com/vcresearcher/helloworld-impl
github.com/vcresearcher/helloworld v1.0.0
And, finally, when Dev-B executes their program they get the following output:
albinolobster@mournland:~/helloworld-impl$ ./helloworld-impl
hi.
Step 3: Dev-A changes their GitHub username
Dev-A changes their username from vcresearcher to vclabresearch. GitHub automatically moves the helloworld repository to https://github.com/vclabresearch/helloworld.
Step 4: Attacker hijacks the original git repository
When Dev-A changed their username from vcresearcher to vclabresearch, GitHub moved their repositories to the new username, and set up HTTP redirects so that any request to vcresearcher/helloworld would redirect to vclabresearch/helloworld.
Attacker registers as vcresearcher, creates a repository called helloworld, and uploads the original repository's content. This perfectly matches the original repository created by Dev-A. Attacker then updates hello.go with "malicious" code.
package helloworld
import "fmt"
func Hello() {
	fmt.Println("hi world 😈")
}
Attacker then publishes the module as version v1.0.3.
Finally, Attacker tells the Go module proxy to grab the updated version. The new version is merged with the old versions in pkg.go.dev.
Step 5: Dev-B checks for helloworld module updates and pulls in Attacker’s "malicious" package
In the output below, Dev-B checks the available updates for their Go program. They see a new version of the helloworld module, they fetch it, and subsequently execute Attacker’s code.
albinolobster@mournland:~/helloworld-impl$ go list -m -u all
github.com/vcresearcher/helloworld-impl
github.com/vcresearcher/helloworld v1.0.1 [v1.0.3]
albinolobster@mournland:~/helloworld-impl$ nano go.mod
albinolobster@mournland:~/helloworld-impl$ go mod tidy
go: downloading github.com/vcresearcher/helloworld v1.0.3
albinolobster@mournland:~/helloworld-impl$ go build
albinolobster@mournland:~/helloworld-impl$ ./helloworld-impl
hi world 😈
About VulnCheck
The VulnCheck Exploit & Vulnerability team tracks more than a dozen package managers including npm, PyPI, and Maven. For details, sign up to start a trial of our Exploit & Vulnerability Intelligence product today.