
Whose code am I running in GitHub Actions?

dgl

Unfortunately this makes a mistake by using a short commit ID: "(e.g. a5b3abf)"

That's not a full commit ID, so it can still result in a mutable reference if either someone can find a clash[1] or if they can push a tag with that name and it takes priority in the context it is used (this is somewhat complex, e.g. GitHub prohibits pushes of branches and tags which are exactly 40 hex characters long, but other services may not).

[1]: https://people.kernel.org/kees/colliding-with-the-sha-prefix...

bewuethr

Shortened commit SHAs are actually not supported by Actions; if you try, you get

"Unable to resolve action `actions/checkout@11bd719`, the provided ref `11bd719` is the shortened version of a commit SHA, which is not supported. Please use the full commit SHA `11bd71901bbe5b1630ceea73d27597364c9af683` instead."

chatmasta

What if the repository has a tag called 11bd719? Does Git/GitHub forbid creation of this tag if a commit exists with that prefix?

What if a Git commit is created that matches an existing tag? Does Git have a procedure to make a new one? e.g. imagine I pregenerate a few million 8 character tags and wait for a collision

btw: Even if you specify the full commit SHA, this can still be attacked; there have been practical collision attacks against SHA-1, and at least for older versions of Git, the commit hash algorithm was SHA-1. Maybe that’s changed, but an attacker could always construct a malicious repository with intentionally weak hashes with the intent of later swapping one of them. (But at that point they may as well just push the malicious code in the first place.)
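
For a rough sense of scale on the 8-character question, here is a back-of-the-envelope sketch. It assumes commit IDs behave like uniformly random hex strings and that the pre-generated tags are all distinct; the function names are made up for illustration.

```python
def prefix_space(hex_chars: int) -> int:
    """Number of distinct hex prefixes of the given length."""
    return 16 ** hex_chars


def hit_probability(pregenerated_tags: int, hex_chars: int = 8) -> float:
    """Chance that one new commit's prefix equals one of the pre-generated tags."""
    return pregenerated_tags / prefix_space(hex_chars)


def expected_commits_until_hit(pregenerated_tags: int, hex_chars: int = 8) -> float:
    """Mean number of new commits before the first match (geometric distribution)."""
    return 1 / hit_probability(pregenerated_tags, hex_chars)
```

Under those assumptions, 4 million pre-generated tags would mean a match after roughly a thousand new commits on average, so the prefix length matters a lot.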

anamexis

What is the attack exactly? Only full commit SHAs are valid to reference a commit by SHA. GitHub disallows tags and branch names that could collide with a full commit SHA. There is never any collision between commit SHAs and tags.
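
If you want to enforce the "full SHA only" rule in your own tooling before a workflow ever reaches Actions, a check could look roughly like this. It is a minimal sketch, not GitHub's actual validation: `is_full_commit_sha` is a hypothetical name, and it assumes SHA-1 object IDs written as exactly 40 hex characters.

```python
import re

# Assumption: a full commit ID is exactly 40 hexadecimal characters (SHA-1).
_FULL_SHA = re.compile(r"[0-9a-fA-F]{40}")


def is_full_commit_sha(ref: str) -> bool:
    """Return True only when ref is a full-length commit ID, not a prefix or a tag."""
    return _FULL_SHA.fullmatch(ref) is not None
```

A shortened ID such as `11bd719` or a tag such as `v4` fails the check, while the 40-character form passes.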

immibis

It's still SHA-1 by the way, but they included counter-cryptanalysis to reject objects that appear to be one side of a collision using known techniques.

password4321

So just so I'm clear based on what you've mentioned, even the policy prohibiting 40 hex character tags isn't doing anything to stop a tag the same as the short commit ID?

Also, per this comment on a previous discussion on this incident at https://news.ycombinator.com/item?id=43367987#43369710:

> the real renovate bot immediately took the exfiltration commit from the fake renovate bot and started auto-merging it (updating full SHA1 references)

mmmaantu

SHA pinning won't necessarily help if the dependency you are pinning doesn't pin its own dependencies! You still get stuff pulled via vulnerable tags etc. How long till we get this https://github.com/github/roadmap/issues/592 ...

sepositus

Yes, this is a crucial distinction to make. The fact of the matter is that you have to treat GitHub Actions like a compromised system. Sure, there aren't a ton of steps you can take to protect builds if it's your primary builder, but you can, for example, not hook up an AWS account with full admin privileges to it (which I've seen more times than I would have liked to).

bracketfocus

https://github.com/features/preview/immutable-actions

They are actually releasing this very soon. I’ve seen some of my workflows use an immutable OCI image for some of GH’s actions like actions/checkout.

sureIy

Isn't that wrong? I think you have to pre-bundle your actions, it won't do an npm install.

mikepurvis

I set up this recently at a new company and did yarn + ncc to build a compiled js out of typescript. It was a bit hairy as a novice, but ended up working fine.

That protects from npm supply chain stuff, but obviously third-party includes like docker/build-push-action are still a risk.

thenaturalist

Thanks for highlighting this open issue.

The fact they've been stalling this for a good 2.5 years is... insane??

daveisfera

I don't believe that's true. If you pin to a hash, then it will always run that version and can't change

ptx

There seems to be a slight misunderstanding in the article. It says that the "v2" tag "looks like an immutable reference" and points out that it's actually mutable, as if this was surprising and unintended. It also says that the reason people use tags despite this (making a tradeoff against security) is that "tags are easier to read and compare".

But the GitHub documentation [0] makes it clear that tags for major versions are intended to be mutable and be updated to point to new minor versions as they are released, not because it's "easier to read" but because you "can expect an action's patch version to include necessary critical fixes and security patches, while still remaining compatible with their existing workflows" (as long as the author follows their recommended semantic versioning scheme).

So choosing a major-version tag is GitHub's recommended practice precisely because it is mutable and does change.

[0] https://docs.github.com/en/actions/sharing-automations/creat...

cesarb

> major versions are intended to be mutable and be updated to point to new minor versions as they are released [...] because you "can expect an action's patch version to include necessary critical fixes and security patches [...]

It's two sides of the same coin: on one hand, an update can include fixes for bugs and vulnerabilities; on the other hand, an update can also include new bugs and vulnerabilities (or even malicious code). Updating too quickly can be risky. Updating too slowly can also be risky.

linuxftw

While this may be technically correct, the general git-community at large mostly treats tags as immutable (contrary to docker, for example).

Release branches are typically the mutable reference. So I would create a 'v2' release branch, but not a 'v2' tag which gets updated.

Also, by convention, git references starting with 'v' are typically immutable tags and not branches.

But, even given the above, the git-community at large knows that tags can be mutable, and so if we care about that, we reference the sha (malicious collisions excepted).

ptx

That's a good point and might explain the source of this confusion. Actions on GitHub can come from either Docker or a Git repo, using exactly the same syntax [0], so the tag can be either a Docker tag or a Git tag.

[0] https://docs.github.com/en/actions/writing-workflows/workflo...

bsza

That wording is even more misleading, because it implies that using the full version string, by contrast, is not mutable, even though it presumably is.

dietrichepp

I just started using GitHub Actions for a personal project, and as you do, I trawled HN for opinions on how to use it.

At first I built a workflow out of steps published on GitHub. Use ilammy/msvc-dev-cmd, lukka/get-cmake, lukka/run-vcpkg, all to build a project with CMake for Windows targets. Of course I referred to actions by SHA like you should:

   uses: ilammy/msvc-dev-cmd@0b201ec74fa43914dc39ae48a89fd1d8cb592756

But one comment stuck with me. Something like, “You should just run your own code on GitHub Actions, rather than piecing it together from publicly available actions.” That made a lot of sense. I ended up writing a driver program for my personal project’s CI builds. One job builds the driver program, and then the next job runs the driver program to do the entire build.

I wouldn’t do this if I were getting paid for it… it’s more time-consuming. But it means that I am only minimally tied to GitHub actions. I can run the build driver from my own computer easily enough.
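
To give an idea of the shape of such a driver, here is a minimal sketch in Python. It is not the actual driver described above (which is not shown in this thread); `run_steps` and the step names are hypothetical, and all it does is run commands in order and stop at the first failure.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# A step is a human-readable name plus the command line to execute.
Step = Tuple[str, Sequence[str]]


def run_steps(steps: List[Step]) -> Optional[str]:
    """Run each command in order; return the name of the first failing step, or None."""
    for name, command in steps:
        print(f"==> {name}", flush=True)
        completed = subprocess.run(command)
        if completed.returncode != 0:
            # Stop immediately so later steps never run on a broken build.
            return name
    return None
```

The CI job then only needs to call something like `run_steps([("configure", ["cmake", "-S", ".", "-B", "build"]), ("build", ["cmake", "--build", "build"])])`, and the same call works from a local shell.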

huijzer

> I ended up writing a driver program for my personal project’s CI builds. One job builds the driver program, and then the next job runs the driver program to do the entire build.

Yes things like that have been discussed before on HN. Also for example use a justfile (or something similar) and then call that from inside the Action to reduce vendor lock-in.

timewizard

I use Actions merely as a way to trigger a custom Webhook. Then I do everything on the server that receives the hook with my own code. I hate YAML that much.

dietrichepp

What I want is something like “please run this command on a server somewhere when event X happens”. Seems like the options are along the lines of:

1. SaaS CI/CD products, like GitHub Actions,

2. Run your own Jenkins cluster,

3. Figure out how to orchestrate cloud resources to do this for you.

Maybe there are easy options that I’m missing. I don’t really want to create docker containers just to build some program I’m working on.

timewizard

That's effectively what we are doing. The webhook receives any "custom properties" you have defined on your repo, the ssh url, and critically, the name of the Action that was run. The receiving server can use all of this to select the appropriate pipeline. Our build server is not containerized.

CrimsonRain

Listen to a port from a server. Do a post with an API key. Then run your bash script there.

Run GitHub actions self hosted (takes 2 mins to setup)

Just ssh in and run it.

So many options.

arccy

why not just use regular GitHub webhooks...

timewizard

We do, but you can only trigger those on predefined events, and we want our release manager to be independent of any push or pull mechanisms on the repo. You can also run actions from the github web ui which makes them available even to non technical managers.

Our Action has a single step, it has an "if: false" declaration so it never runs, and no runners are engaged. This immediately completes and fires off a "workflow_job" webhook which triggers the build server to act.
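
For anyone wanting to try the receiving side, a rough sketch of it could look like this. It is not our actual server code: the `PIPELINES` table and its names are hypothetical, and the `workflow_job.workflow_name` payload field is an assumption you should check against a real delivery. The signature check follows GitHub's documented scheme of an HMAC-SHA256 of the raw body sent in the `X-Hub-Signature-256` header with a `sha256=` prefix.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical mapping from workflow name to a pipeline on the build server.
PIPELINES = {
    "release": "release-pipeline",
    "nightly": "nightly-pipeline",
}


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header value against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def select_pipeline(body: bytes) -> Optional[str]:
    """Pick a pipeline from the workflow name in the payload (assumed field name)."""
    payload = json.loads(body)
    workflow_name = payload.get("workflow_job", {}).get("workflow_name")
    return PIPELINES.get(workflow_name)
```

The HTTP handler should reject the request unless `verify_signature` returns True, and only then call `select_pipeline` on the same raw bytes.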

djoel

Do you mind linking the repo, if it's public? Thanks!

bdcravens

Github Actions is definitely a vector for abuse.

I was looking at Seleniumbase recently, and they tell you that you can use Github Actions for web scraping to bypass a lot of blocks (apparently Github Actions use a residential IP-space)

https://seleniumbase.com/new-video-unlimited-free-web-scrapi...

Marsymars

This seems like a wild thing for a third-party project to promote. The intention of GitHub Actions is to run CI/CD and other repository-related tasks. You’d never see, for instance, Adobe promoting, via YouTube, “unlimited free web OCR with Adobe CLI on GitHub Actions!”

I’ve never heard of Seleniumbase, but this makes them look like a rinky-dink project.

dylan604

That's the whole point of hacking: to use something in a way unintended by its maker. That could be for something cool/interesting, or it could be for something nefarious. Nobody ever thought a coffee maker should run Doom, but they do. Not sure if there's a morality clause type of disqualifier for a Show HN, but there's a lot of people that would be interested in seeing how something benign was used for a different purpose. Especially if it saved them money/compute/time/resources/etc.

Intralexical

Yep. And "hackers" who apply that mindset to abusing publicly shared resources are why the rest of us are going to get DRM on our own, private coffee makers.

bastardoperator

I've found this to be a problem that a lot of CI providers suffer from. They allow extension via third-party code, which is awesome: people write useful code, but a lot of that useful code doesn't get maintained properly or ever, rots, and eventually everyone has a security issue.

You can also see the GitHub IP space here, I don't think it's "residential", unless that terminology includes azure and aws?: https://api.github.com/meta

bdcravens

I'm not sure. Perhaps when you call a browser it's going through another network? I haven't tested this for myself, only going off of what was reported by that project.

bastardoperator

Maybe it was a self hosted runner? I run those locally all the time.

bri3d

They don't use a residential IP space; they use Azure Data Center (which, being less popular, isn't blocked as often as for example EC2).

meltyness

Network enabled compute is definitely an unusual free lunch, but I suppose the trade off is handing out free source code.

spaceywilly

This does not bode well for genetic AI

dqh

I've never liked the idea of community actions in the critical build path, so I use official actions/* when I can, and otherwise use actions/github-script to invoke the GitHub API via inline JavaScript when I can't.

Uvix

I agree on community ones, but I’m happy to use the official actions from vendors like the Terraform and Azure ones too.

donatj

> At a glance, this looks like an immutable reference to an already-released “version 2” of this action, but actually this is a mutable Git tag. If somebody changes the v2 tag in the tj-actions/changed-files repo to point to a different commit, this action will run different code the next time it runs.

The worst part of this is that this is BY DESIGN.

I maintain a small handful of actions. You are expected to, as an action maintainer DELETE and RETAG your major versions when you release a new minor or patch version. That is to say for instance your v2 tag should point to the same commit as your latest 2.x.x tag.
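
To make the rule concrete, here is a small sketch of how a release script could work out which tag the major tag is supposed to follow. It is only an illustration: `latest_for_major` is a hypothetical helper, and it assumes tags follow a strict `vMAJOR.MINOR.PATCH` pattern with no pre-release suffixes.

```python
import re
from typing import Iterable, Optional

# Assumption: release tags look exactly like v4.2.2 (no suffixes such as -rc1).
_SEMVER_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def latest_for_major(tags: Iterable[str], major: int) -> Optional[str]:
    """Return the highest vMAJOR.x.y tag, i.e. the one the moving vMAJOR tag should match."""
    candidates = []
    for tag in tags:
        match = _SEMVER_TAG.fullmatch(tag)
        if match and int(match.group(1)) == major:
            # Compare minor and patch numerically so that v4.10.0 beats v4.2.2.
            candidates.append((int(match.group(2)), int(match.group(3)), tag))
    return max(candidates)[2] if candidates else None
```

Given the tags `v4`, `v4.1.0`, `v4.2.2` and `v3.6.0`, asking for major version 4 yields `v4.2.2`, and the `v4` tag would then be moved to that commit.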

Not everyone does this mind you, but this is the default and the expected way of operating.

I was frankly kind of taken aback when I learned this. I know for a fact documentation of this used to exist, but I am failing to find it currently.

You can see GitHub themselves however doing exactly this here, with the v4 and v4.2.2 tags matching here (as of today, v4 will move in future)

https://github.com/actions/checkout/tags

sureIy

This is probably the dumbest design decision of GitHub Actions. You'd think that the biggest git platform would know better than asking you to force push every release. They should have just used branches, because that's exactly what they're for. Or they should have resolved the right tag themselves, like npm does.

CrimsonRain

I only use a handful of official github and docker actions. This behavior is something that I (and many others) want in that case.

Here @v4 is very similar to how you'd tag docker image v4-latest.

Ultimately, it should be a choice. Like you can do with package.json. Pin an exact, allow patch updates, allow minor updates or always latest etc.

dboreham

Oh. I always assumed that v2 here was a branch, not a moving tag.

password4321

This article appears to be in response to the linked Tj-actions/changed-files GitHub Action Compromised – used by over 23K repos discussed at https://news.ycombinator.com/item?id=43367987 10 days ago; not a duplicate as it discusses a detection tool but perhaps it rhymes.

TheRealPomax

Was this an auto-generated comment? Because yes, of course it's related, it's someone doing their own investigations based on the news.

password4321

I tried to point out that though the topic is a duplicate, the link does add additional value. Do you not think the previous discussion is worth linking? Yeesh

ohgr

This is a minor worry when the entire software ecosystem is based on “download any old shit off the Internet and run it”.

sureIy

My company went into full panic mode after this. 5 minutes later dependabot opens and auto merges a random patch from npm, but that's fine.

TheRealPomax

Say what you want about dependabot, and people who allow it to auto-merge changes, but at least NPM releases are not mutable (... anymore, at least. NPM had to learn that one the hard way, but unlike github it actually learned something).

jrochkind1

> Tags vs commit IDs is a tradeoff between convenience and security.

Well, it's also a trade-off between security and security. If you specify an immutable commit ID, then if the dependency releases a security update you won't get it until you notice and update the commit ID. If you specify a tag, you'll get it on next build.

I guess we need Dependabot updating commit IDs in workflows too? But then they'd update you to the vulnerable new @2, how would they know otherwise if it hadn't been reported yet?

mystified5016

Wait, people actually just blindly paste together calls into GitHub actions written by someone else who can change it at any time?

You know what, no this makes perfect sense. This is exactly, perfectly in line with the modern software ethos.

Jesus. I'm so glad that 100% of my GitLab pipelines is code I wrote. It's owned by the company and it lives in our source control and runs on our hardware. I think you'd be nuts to do anything else, honestly.

For entirely related reasons, I'm thrilled that my career is moving in a direction away from devops and software in general. I can't stomach it anymore.

jillesvangurp

Most people use several hundred million lines of code provided by somebody else on a daily basis (your laptop, your phone, your hair dryer, car, etc). Most of that stuff gets built using libraries, components, frameworks, etc. provided by third parties.

The whole system runs on trust that all those people do the right things. Sometimes that trust is broken. But mostly it's surprisingly fine. Part of the reason is that bad people are the exception and not the norm and all those other people react when we find one, some are mildly paranoid about this, and processes exist for flagging suspicious things (e.g. CVEs).

What we need is not to audit everything ourselves. Because that's humanly impossible. But better trust verification mechanisms and tools. Github has some mechanisms for actions but it still has some vulnerabilities. It's not perfect. But it's better than nothing. Replacing those by auditing/building yourself is going to either result in a lot of work or security with holes in it (i.e. you are moving the problem, not solving it).

You could argue that most GH Actions are simple enough that building yourself is not the end of the world. It depends on what you are doing.

I take the middle ground. I use GH actions but only with widely used actions maintained by GitHub. Actions are just docker containers. So, the advice can be generalized to those. Check where they come from; who is building them; what their release practices are. Etc.

re-thc

> just blindly paste together calls into GitHub actions written by someone else who can change it at any time?

That’s old news. Now they have the AI do it.

Gigachad

Before this article I had no idea this was even something you could do let alone something that's common.

Why would you not just copy the couple of lines into your own config? It's not like you need to subscribe to updates on a command to get changed files. You want the exact opposite so your CI doesn't randomly break due to external changes.

dboreham

And then 2 years later you discover there was a security bug in that code you copied..

Gigachad

A security bug on a one liner regex to select some files in my own repo that changed? Seems far fetched.

achierius

Where to (just out of curiosity)?

ph3t

GitHub’s dependency graph is supposed to give us this kind of visibility without any custom scripting, but from my experience it’s pretty spotty and often misses dependencies entirely.

Also, the script from the article doesn’t cover transitive GitHub Actions dependencies. So if a third-party action you’re using relies on a vulnerable action internally, it won’t catch that.
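
For the first-level case, a check along these lines is easy enough to write yourself. This is a minimal sketch and not the script from the article: `find_unpinned` is a hypothetical name, the regex assumes the usual `uses: owner/repo@ref` form, and, like the article's script, it does not follow transitive dependencies inside the actions it finds.

```python
import re
from typing import List, Tuple

# Matches lines such as "- uses: owner/repo@ref" or "uses: owner/repo/path@ref".
# Local actions ("./path") and docker:// images without "@" are not matched.
_USES = re.compile(
    r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s@'\"]+)@([^\s'\"#]+)",
    re.MULTILINE,
)
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def find_unpinned(workflow_text: str) -> List[Tuple[str, str]]:
    """Return (action, ref) pairs whose ref is not a full 40-character commit SHA."""
    unpinned = []
    for action, ref in _USES.findall(workflow_text):
        if not _FULL_SHA.fullmatch(ref):
            unpinned.append((action, ref))
    return unpinned
```

Running it over each file in `.github/workflows/` lists the references that can still move, such as `tj-actions/changed-files@v2`.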