The Pain That Is GitHub Actions
583 comments
· March 20, 2025 · deng
hi_hi
I came to the exact same conclusion accidentally in my first role as a Tech Lead a few years back.
It was a large enterprise CMS project. The client had previously told everyone they couldn't automate deployments due to the hosted platform security, so deployments of code and configs were all done manually by a specific support engineer following a complex multistep run sheet. That was going about as well as you'd expect.
I first solved my own headaches by creating a bunch of bash scripts to package and deploy to my local server. Then I shared that with the squads to solve their headaches. Once the bugs were ironed out, the scripts were updated to deploy from local to the dev instance. Jenkins was then brought in and quickly set up to use the same bash scripts, so now we had full CI/CD working to dev and test. Then the platform support guy got bored of manually following the run sheet and started using our (now mature) scripts to automate deployments to stage and prod.
By the time the client found out I'd completely ignored their direction they were over the moon because we had repeatable and error free automated deployments from local all the way up to prod. I was quite proud of that piece of gorilla consulting :-)
badloginagain
I hate the fact that CI peaked with Jenkins. I hate Jenkins, I hate Groovy, but for every company I've worked for there's been a 6-year-uptime Jenkins instance casually holding up the entire company.
There's probably a lesson in there.
mike_hearn
It peaked with Jenkins? I'm curious which CI platforms you've used.
I swear by TeamCity. It doesn't seem to have any of these problems other people are facing with GitHub Actions. You can configure it with a GUI, or in XML, or using a type safe Kotlin DSL. These all actually interact so you can 'patch' a config via the GUI even if the system is configured via code, and TeamCity knows how to store config in a git repository and make commits when changes are made, which is great for quick things where it's not worth looking up the DSL docs or for experimentation.
The UI is clean and intuitive. It has all the features you'd need. It scales. It isn't riddled with insecure patterns like GH Actions is.
finnthehuman
Jenkins is cron with bells and whistles. The result is a pile of plugins to capture all the dimensions of complexity you'd otherwise bury in the shell script but want to be easier to point and click at. I'll hate on Jenkins with the rest of them, but entropy is gonna grow and Jenkins isn't gonna say "no, you can't do that here". I deal with multiple tools where, if I tried to make fun of how low the Jenkins plugin install stats are, you'd know exactly where I work. Once I've calmed down from working on CI I can appreciate Jenkins' attempts to manage all of it.
Any CI product play has to differentiate in a way that makes you dependent on them. Sure it can be superficially nicer when staying inside the guard rails, but in the age of docker why has the number of ways I configure running boring shell scripts gone UP? Because they need me unable to use a lunch break to say "fuck you I don't need the integrations you reserve exclusively for your CI" and port all the jobs back to cron.
And that's why jenkins is king.
marcosdumay
And the lesson is that you want a simple UI to launch shell scripts, maybe with complex triggers but probably not.
If you make anything more than that, your CI will fail. And you can do that with Jenkins, so the people that did it saw it work. (But Jenkins can do so much more, which is the entire reason so many people have nightmares just from hearing that name.)
skor
well, I got tired of Groovy and found out that using Jenkins with plain bash under source control is just right for us. Runs everywhere, very fast to test/develop and it's all easy to change and improve.
We build Docker images mostly so ymmv.
I have a "port to github actions" ticket in the backlog but I think we're not going to go down that road now.
k4rli
It's feature complete. Anything more will just be bloat; if anything, probably at least 25% of it could be cut.
rrr_oh_man
> gorilla consulting
Probably 'guerilla', but I like your version more.
hi_hi
Haha, I'm gonna admit it, all these years and I thought gorilla/guerilla was one of those American/British spelling things, like cheque/check or gaol/jail. Boy do I feel stupid.
DonHopkins
That's when the devs all wear gorilla suits in Zoo meetings.
Wikipedia: Gorilla Suit: National Gorilla Suit Day:
https://en.wikipedia.org/wiki/Gorilla_suit#National_Gorilla_...
Put the Gorilla back in National Gorilla Suit Day:
https://www.instagram.com/mad.magazine/p/C2xgmVqOjL_/
Gorilla Suit Day – January 31, 2026:
https://nationaltoday.com/gorilla-suit-day/
noplacelikehome
There sure is a lot of chest beating
noplacelikehome
Nix is awesome for this -- write your entire series of CI tools in shell or Python and run them locally in the exact same environment as they will run in CI. Add SOPS to bring secrets along for the ride.
jimbokun
Would Nix work well with GitHub Actions? Or is it more of a replacement? How do you automate running tests and deploying to dev on every push, for example?
mikepurvis
Strongly isolated systems like Nix and Bazel are amazing for giving no-fuss local reproducibility.
Every CI "platform" is trying to seduce you into breaking things out into steps so that you can see their little visualizations of what's running in parallel or write special logic in groovy or JS to talk to an API and generate notifications or badges or whatever on the build page. All of that is cute, but it's ultimately the tail wagging the dog— the underlying build tool should be what is managing and ordering the build, not the GUI.
What I'd really like for next gen CI is a system that can get deep hooks into local-first tools. Don't make me define a bunch of "steps" for you to run, instead talk to my build tool and just display for me what the build tool is doing. Show me the order of things it built, show me the individual logs of everything it did.
Same thing with test runners. How are we still stuck in a world where the test runner has its own totally opaque parallelism regime and our only insight is whatever it chooses to dump into XML at the end, which will probably be nothing if the test executable crashes? Why can't the test runner tell the CI system what all the processes are that it forked off and where each one's respective log file and exit status is expected to be?
steeleduncan
> Write as much CI logic as possible in your own code
Nix really helps with this. It's not just that you do everything via a single script invocation, local or CI; you also do it in an identical environment, local or CI. You are not trying to debug the difference between Ubuntu as set up in GHA and Arch as it is on your laptop.
Setting up a nix build cache also means that any artefact built by your CI is instantly available locally which can speed up some workflows a lot.
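For the curious, here's a minimal sketch of what that can look like on the GitHub side; the action version pin, the devShell, and the ./ci/test.sh script are all assumptions, but the point is that the final run line is identical on a laptop:

    name: ci
    on: [push]
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: cachix/install-nix-action@v27   # version pin is illustrative
          # the exact same command works locally: nix develop --command ./ci/test.sh
          - run: nix develop --command ./ci/test.sh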
mikepurvis
Absolutely. Being able to have a single `nix build` line that gets all the way from source to your final asset (iso, ova, container image, whatever) with everything being aggressively cached all the way along is a game changer. I think it's worth the activation energy for a lot more organizations than realize it.
shykes
Dagger.io does this out of the box:
- Everything sandboxed in containers (works the same locally and in CI)
- Integrate your build tools by executing them in containers
- Send traces, metrics and logs for everything at full resolution, in the OTEL format. Visualize in our proprietary web UI, or in your favorite observability tool
nand_gate
It's doa, Sol.
nand_gate
Why would you need extra visualisation anyway, tooling like Nix is already what you see is what you get!
jkarni
It's still helpful to e.g. fold different phases in Nix, and different derivation outputs.
I work on garnix.io, which is exactly a Nix-based CI alternative for GitHub, and we had to build a lot of these small things to make the experience better.
squiggleblaz
Basically an online version of nix-output-monitor. Might be half an idea. But it doesn't get you 100%: you get CI, but not CD.
mikepurvis
Delivery meaning the deployment part? I think by necessity that does differ a bit from what happens locally, just because suddenly there's auth, inventory, maybe a staging target, whatever.
All of that is a lot more than what a local dev would want, deploying to their own private test instance, probably with a bunch of API keys that are read-only or able to write only to other areas meant for validation.
specialist
We used to just tail the build script's output.
Maybe add some semi-structured log/trace statements for the CI to scrape.
No hooks necessary.
mikepurvis
That works so long as the build script is just doing a linear series of things. But if it's anything remotely modern then a bunch of stuff is going on in parallel, and if all the output is being funneled to a single log, you can end up with a fair bit of wind-down spew you have to scroll through to find the real/initial failure.
How much better would it be if the CI web client could just say, here's everything the build tool built, with their individual logs, and here's a direct link to the one that failed, which canceled everything else?
teeray
> What I'd really like for next gen CI is a system that can get deep hooks into local-first tools.
But how do you get that sweet, sweet vendor-lock that way? /s
doix
I came from the semiconductor industry, where everything was locally hosted Jenkins + bash scripts. The Jenkins job would just launch the bash script that was stored in Perforce (the VCS), so all you had to do to run things locally was run the same bash script.
When I joined my first web SaaS startup I had a bit of a culture shock. Everything was running on 3rd party services with their own proprietary config/language/etc. The base knowledge of POSIX/Linux/whatever was almost completely useless.
I'm kinda used to it now, but I'm not convinced it's any better. There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
Xcelerate
Haha, I had the same experience going from scientific work in grad school to big tech. The phrase “a solution in search of a problem” comes to mind. The additional complexity does create new problems however, which is fine for devops, because now we have a recursive system of ensuring job security.
It blows my mind what is involved in creating a simple web app nowadays compared to when I was a kid in the mid-2000s. Do kids even do that nowadays? I’m not sure I’d even want to get started with all the complexity involved.
DrFalkyn
Creating a simple web app isn’t that hard.
If you want to use a framework, the React tutorials from Traversy Media are pretty good. You can even go cross-platform into mobile with frameworks like React Native or Flutter if you want iOS/Android native apps.
Vite has been a godsend for React/Vue. It's no longer the circus it was in the mid 2010s. Google's monopoly has made things easier for web devs. No more Babel or polyfills or create-react-app.
People do still avoid frameworks and use raw HTML/CSS/JavaScript. HTMX has made server fetches a lot easier.
You probably want a decent CSS framework for responsive design. Minimalist ones like Tailwind have become more popular than the heavyweight ones everyone used to use.
If you need a backend and want to do something simple you can use BaaS (Backend as a Service) platforms like Firebase. Otherwise setting up a NodeJS server with some SQL or KV store like SQLite or MongoDB isn't too difficult.
CI/CD systems exist to streamline testing and deployment for large complex apps. But for individual hobbyist projects it’s not worth it.
sgarland
> I'm kinda used to it now, but I'm not convinced it's any better.
It’s demonstrably worse.
> The base knowledge of POSIX/Linux/whatever was almost completely useless.
Guarantee you, 99% of the engineering team there doesn’t have that base knowledge to start with, because of:
> There are so many layers of abstraction now that I'm not sure anybody truly understands it all.
Everything is constantly on fire, because everything is a house of cards made up of a collection of XaaS, all of which are themselves houses of cards written by people similarly clueless about how computers actually operate.
I hate all of it.
zamalek
> I'm not convinced it's any better.
Your Jenkins experience is more valuable and worth replicating when you get the opportunity.
verdverm
We're doing the same, but replacing the bash script with Dagger.
Once you get on Dagger, you can turn your CI into minimal Dagger invocations and write the logic in the language of your choice. Runs the same locally and in automation
nsonha
it's just common sense, which is unfortunately lost with sloppy devs. People go straight from junior dev to SRE without learning engineering principles through building products first.
jimbokun
I feel like more time is spent getting CI working these days than on the actual applications.
Between that and upgrading for security patches. Developing user impacting code is becoming a smaller and smaller part of software development.
cookiengineer
This.
I heavily invested in a local runner based CI/CD workflow. First I was using gogs and drone, now the forgejo and woodpecker CI forks.
It runs with multiple redundancies because it's a pretty easy setup to replicate on decentralized hardware. The only thing that's a little painful is authentication and cross-system pull requests, so we still need our single point of failure to merge feature branches and do code reviews.
Because we build everything in Go, we also decided to always have a /toolchain/build.go so that we have everything in a single language, and don't even need bash in our CI/CD podman/docker images. We just use FROM scratch, with Go, and that's it. The only exception being when we need to compile/rebuild our eBPF kernel modules.
To me, personally, the GitHub Actions CVE from August 2024 was the final nail in the coffin. I blogged about it in more technical detail [1], and guess what the reason was that the tj-actions were compromised last week? Yep, you guessed right: the same attack surface that GitHub refuses to fix, a year later.
The only tool, as far as I know, that somehow validates against these kinds of vulnerabilities is zizmor [2]. All other tools validate schemas, not vulnerabilities and weaknesses.
[1] https://cookie.engineer/weblog/articles/malware-insights-git...
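For reference, a rough sketch of what wiring zizmor into a pipeline can look like; the pipx install route and the workflows path are just one way to do it:

    name: workflow-audit
    on: [pull_request]
    jobs:
      zizmor:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: pipx install zizmor           # zizmor audits workflow files for known weaknesses
          - run: zizmor .github/workflows/     # findings produce a non-zero exit and fail the job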
pcthrowaway
My years using Concourse were a dream compared to the CI/CD pains of trying to make github actions work (which I fortunately didn't have to do a lot of). Add that to the list of options for people who want open source and their own runners
regularfry
One of the very few CI platforms that I've heard spoken well of was a big shared Concourse instance where the entire pipeline was predefined. You added some scripts named by convention to your project to do the right thing at each step, and it all just worked for you. Keeping it running was the job of a specific team.
sleepybrett
Did they finally actually say how the tj-actions repo got compromised? When I was fixing that shit on Saturday it was still 'we don't know how they got access!?!?'
cookiengineer
(I'm assuming you read my technical article about the problem)
If you take a look at the pull requests in e.g. the changed-files repo, it's pretty obvious what happened. You can still see some of the malformed git branch names and other things that the bots tried out. There were lots of "fixes" afterwards that just changed environment variable names from PAT_TOKEN to GITHUB_TOKEN and similar things, which kind of just delays the problem until malware is executed with different code again.
As a snarky sidenote: The Wiz article about it is pretty useless as a forensics report, I expected much more from them. [1]
The conceptual issue is that this is not fixable unless github decides to rewrite their whole CI/CD pipeline, because of the arbitrary data sources that are exposed as variables in the yaml files.
The proper way to fix this (as GitHub) would be to implement a mandatory linter step or similar, and let a tool like zizmor check the workflow file. If it fails, refuse to run the workflow.
[1] https://www.wiz.io/blog/github-action-tj-actions-changed-fil...
JanMa
Whenever possible I now just use GitHub actions as a thin wrapper around a Makefile and this has improved my experience with it a lot. The Makefile takes care of installing all necessary dependencies and runs the relevant build/Test commands. This also enables me to test that stuff locally again without the long feedback loop mentioned in other comments in this thread.
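For anyone wondering what the "thin wrapper" looks like in practice, something along these lines (target names are made up); the YAML stays boring and everything interesting lives in the Makefile, so `make test` works on a laptop too:

    name: ci
    on: [push, pull_request]
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: make deps
          - run: make test
          - run: make build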
oulipo
mise (https://mise.jdx.dev/) and dagger (https://github.com/dagger/dagger) seem like nice candidates too!
Mise can install all your deps, and run tasks
jimmcslim
In addition to the other comments suggesting dagger is not the saviour due to being VC-funded, it seems like they have decided there's no money in CI, but AI... yes there's money there! And "something something agents".
From dagger.io...
"The open platform for agentic software.
Build powerful, controllable agents on an open ecosystem. Deploy agentic applications with complete visibility and cross-language capabilities in a modular, extensible platform.
Use Dagger to modernize your CI, customize AI workflows, build MCP servers, or create incredible agents."
lou1306
> * Don't bind yourself to some fancy new VC-financed thing that will solve CI once and for all but needs to get monetized eventually (see: earthly, dagger, etc.)
Literally from comment at the root of this thread.
fireflash38
I implemented a thing such that the makefiles locally use the same podman/docker images as the CI/CD uses. Every command looks something like:
    target:
        $(DOCKER_PREFIX) build
When run in gitlab, the DOCKER_PREFIX is a no-op (it's literally empty due to the CI=true var), and the 'build' command (whatever it is) runs in the CI/CD docker image. When run locally, it effectively is a `docker run -v $(pwd):$(pwd) build`.
It's really convenient for ensuring that if it builds locally, it can build in CI/CD.
akanapuli
I don't quite understand the benefit. How does running commands from the Makefile differ from running commands directly on the runner? What benefit does the Makefile bring here?
ZeWaka
You can't run GitHub Actions yml workflows locally (officially; there are tools like act).
fiddlerwoaroof
If you have your CI runner use the same commands as local dev, CI basically becomes an integration test for the dev workflow. This also solves the “broken setup instructions” problem.
mwenge
Do you have a public example of this? I'd love to see how to do this with Github Actions.
cmsj
I don't have a makefile example, but I do functionally the same thing with shell scripts.
I let GitHub actions do things like the initial environment configuration and the post-run formatting/annotation, but all of the actual work is done by my scripts:
https://github.com/Hammerspoon/hammerspoon/blob/master/.gith...
JanMa
Sure, here's one example: https://github.com/JanMa/nomad-driver-nspawn/blob/master/.gi...
williamcotton
It doesn't (perhaps yet?) install the dependencies from the Makefile, but it runs a number of commands from the Makefile, eg, make test-leaks:
https://github.com/williamcotton/webdsl/blob/main/.github/wo...
ehansdais
After years of trial and error our team has come to the same conclusion. I know some people might consider this insanity, but we actually run all of our scripts as a separate C# CLI application (the main application is a C# web server). Effectively no bash scripts, except as the entry point here and there. The build step and passing the executable around is a small price to pay for the gain in static type checking, being able to pull in libraries as needed, and knowing that our CI is not going to go down because someone made a dumb typo somewhere.
The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
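A sketch of what that can look like from the workflow side (the project path, subcommand, and flag names are hypothetical): the YAML is just a dispatcher, and everything the CLI depends on is visible at the call site.

    # a single workflow step; everything the C# CLI needs is passed explicitly
    - name: Deploy
      run: >
        dotnet run --project ./build --
        deploy
        --environment staging
        --api-key "${{ secrets.DEPLOY_API_KEY }}"

The usual caveat applies that secrets passed as argv can show up in process listings, so some teams keep those as env vars and pass everything else as args.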
baq
> I know some people might consider this insanity
Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane. You’re actually introducing much needed sanity into the process by admitting that a real programming language is the tool to use here.
I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.
TeMPOraL
> I can’t imagine the cognitive dissonance Lisp folks have when dealing with this madness, not being one myself.
After a decade trying to fight it, this one Lisper here just gave up. It was the only way to stay sane.
I remain hopeful that some day, maybe within our lifetimes, the rapid inflation phase of software industry will end, and we'll have time to rethink and redo the fundamentals properly. Until then, one can at least enjoy some shiny stuff, and stay away from the bleeding edge, aka. where sewage flows out of pipe and meets the sea.
(It's gotten a little easier now, as you can have LLMs deal with YAML-programming and other modern worse-is-better "wisdom" for you.)
no_wizard
I'm shocked there isn't a 'language for config' that has become the de facto standard; it's YAML all the way down, seemingly. I am with you 100%.
It would really benefit from a language that intrinsically understood it's being used to control a state machine. As it is, what nearly all folks want in practice is a way to run different things based on different states of CI.
A Lisp DSL would be perfect for this. Macros would make things a lot easier in many respects.
Unfortunately, there's no industry consensus and none of the big CI platforms have adopted support for anything like that, they all use variants of YAML (I always wondered who started it with YAML and why everyone copied that, if anyone knows I'd love to read about it).
Honestly, I can say the same complaints hold up against the cloud providers too. Those 'infrastructure as code' SDKs really don't lean into the 'as code' part very well
motorest
> Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines.
I've been using YAML for ages and I never had any issue with it. What do you think is wrong with YAML?
mschuster91
> Some people here still can’t believe YAML is used for not only configuration, but complex code like optimized CI pipelines. This is insane.
It's miles better than Jenkins and the horrors people created there. GitLab CI can at least be easily migrated to any other GitLab instance and stuff should Just Work because it is in the end not much more than self contained bash scripts, but Jenkins... is a clown show, especially for Ops people of larger instances. On one side, you got 50 plugins with CVEs but you can't update them because you need to find a slot that works for all development teams to have a week or two to fix their pipelines again, and on the other side you got a Jenkins instance for each project which lessens the coordination effort but you gotta worry about dozens of Jenkins instances. Oh and that doesn't include the fact many old pipelines aren't written in Groovy or, in fact, in any code at all but only in Jenkins's UI...
Github Actions however, I'd say for someone coming from GitLab, is even worse to work with than Jenkins.
robinwassen
Did a similar thing when we needed to do complex operations towards aws.
Instead of wrapping the aws cli command I wrote small Go applications using the boto3 library.
Removed the headaches when passing in complex params, parsing output, and also made the logic portable, as we need to do the builds on different platforms (Windows, Linux and macOS).
noworriesnate
I've used nuke.build for this in the past. This makes it nice for injecting environment variables into properties and for auto-generating CI YAML to wrap the main commands, but it is a bit of a pain when it comes to scaling the build. E.g. we did infrastructure as code using Pulumi, and that caused the build code to increase dramatically, to the point the Nuke script became unwieldy. I wish we had gone with a plain C# CLI app from the beginning.
ozim
I don't think it is insanity; quite the opposite - insanity is trying to force everything into YAML or the pipeline.
I have seen people doing absolutely insane setups because they thought they had to do it in YAML and the pipeline, that there was absolutely no other option, or that it was somehow wrong to drop some stuff into code.
motorest
> I don’t think it is insanity quite the opposite - insanity is trying to force everything in yaml or pipeline.
I'm not sure I understood what you're saying because it sounds too absurd to be real. The whole point of a CICD pipeline is that it automates all aspects of your CICD needs. All mainstream CICD systems support this as their happy path. You specify build stages and build jobs, you manage your build artifacts, you setup how things are tested, deployed and/or delivered.
That's their happy path.
And you're calling the most basic use cases of a standard class of tools "insanity"?
Please help me understand what point you are trying to make.
mst
Honestly, "using the same language as the application" is often a solid choice no matter what the application is written in. (and I suspect that for any given language somebody might propose as an exception to that rule, there's more than one team out there doing it anyway and finding it works better for them than everything else they've tried)
7bit
> The other thing I would add is consider passing in all environment variables as args. This makes it easy to see what dependencies the script actually needs, and has the bonus of being even more portable.
This is the dumbest thing I see installers do a lot lately.
no_wizard
Am I an outlier in that not only do I find GitHub Actions pleasant to use, but that most folks overcomplicate their CI/CD pipelines? I've had to re-write a lot of Actions configurations over the last few years, and in every case, the issue was simply not thinking through the limits of the platform, or when things would be better to run as custom docker images (which you can do via GitHub Actions), etc.
It tends to be that folks want to shoehorn some technology into the pipeline that doesn't really fit, or they make these giant one shot configurations instead of running multiple small parallel jobs by setting up different configurations for different concerns etc.
davidham
I'm with you! I kind of love GitHub Actions, and as long as I keep it to tools and actions I understand, I think it works great. It's super flexible and has many event hooks. It's reasonably easy to get it to do the things I want. And my current company has a pretty robust CI suite that catches most problems before they get merged in. It's my favorite of the CI platforms I have used.
gchamonlive
The way that GitLab does it just shines; it's fundamentally better than GitHub Actions.
It's really easy to extend and compose jobs, so it's simple to unit test your pipeline: https://gitlab.com/nunet/test-suite/-/tree/main/cicd/tests?r...
This way I can code my pipeline and use the same infrastructure to isolate groups of jobs that compose a relevant functionality and test it in isolation to the rest of the pipeline.
I just wish components didn't have such a rigid opinion on folder structure, because they are really powerful, but you have to adopt GitLab's prescription.
tobinfekkes
This is the joy of HN, for me, at least. I'm genuinely fascinated to read that both GitHub Actions and DevOps are (apparently) so universally hated. I've been using both for many years, with barely a hiccup, and I actually really enjoy and value what they do. It would never have dawned on me, outside this thread, to think that so many people dislike it. Nice to see a different perspective!
Are the Actions a little cumbersome to set up and test? Sure. Is it a little annoying to have to make somewhat-useless commits just to re-trigger an Action to see if it works? Absolutely. But once it works, I just set it and forget it. I've barely touched my workflows in ~4 years, outside of the Node version updates.
Otherwise, I'm very pleased with both. My needs must just be simple enough to not run into these more complicated issues, I guess?
dathinab
It really depends on what you do?
GitHub CI is designed in a way which tends to work well for
- languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)
- relatively well contained project (e.g. one JS library, no mono repo stuff)
- no complex needs for integration tests
- no need for compliance enforcement stuff, especially not if it has to actually be securely enforced instead of just making it easier to comply than not to comply
- all developers having roughly the same permissions (ignore that some admin has more)
- fast CI
but the moment you step away from this it just falls more and more apart, and every company I have seen so far that doesn't fit the constraints above has non-stop issues with GitHub Actions.
But the worst part, which is maybe where a lot of the hatred comes from, is that it's there cheap, maybe even free (if you pay for GitHub anyway), and it doesn't need an additional contract, billing, etc. No additional vetting of 3rd-party companies. No need to manage your own CI service. So while it does cause issues non-stop, it initially still seems like the "cheaper" solution for the company. And then when your company realizes it's not and has to set up its own GitHub runners etc., it probably isn't - that is, if you properly account for dev time spent on "fixing CI issues" - and even then there is the sunk cost fallacy, because you already spent so much time making GitHub Actions work and you would have to port everything over. Also, realistically speaking, a lot of other CI solutions are only marginally better.
voxic11
> no need for compliance enforcement stuff
I find github actions works very well for compliance. The ability to create attestations makes it easy to enforce policies about artifact provenance and integrity and was much easier to get working properly compared to my experience attempting to get jenkins to produce attestations.
https://docs.github.com/en/actions/security-for-github-actio...
https://docs.github.com/en/actions/security-for-github-actio...
What was your issue with it?
guappa
They also work very well to leak all your secrets and infect people who download your software from pypi :D
tasuki
> languages with no or very very cheap "compilation" steps (i.e. basically only scripting languages)
This is not true at all. It's fine with Haskell, just cache the dependencies to speed up the build...
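For example, something along these lines with actions/cache; the paths and key depend on whether you use stack or cabal, so treat these as placeholders:

    - uses: actions/cache@v4
      with:
        path: |
          ~/.stack
          .stack-work
        key: stack-${{ runner.os }}-${{ hashFiles('stack.yaml.lock', 'package.yaml') }}
        restore-keys: |
          stack-${{ runner.os }}-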
dathinab
except that
- GitHub Action cache and build artifact handling is a complete shit show (slow upload, slow download and a lot of practical subtle annoyances, finished off with sub-par integration in existing build systems)
- GitHub runners are comparatively small, so e.g. larger linker steps can already lead to pretty bad performance penalties
and sure, like I said, if your project is small it doesn't matter
Marsymars
> But the worst part, which maybe is where a lot of hatred comes from, is that it's there for cheap maybe even free (if you anyway pay for GitHub) and it doesn't need an additional contract, billing, etc.
Or even if you pay $$$ for big runners you can roll it onto your Azure bill rather than having to justify another SAAS service.
lolinder
> Also, realistically speaking, a lot of other CI solutions are only marginally better.
This is the key point. Every CI system falls apart when you get too far from the happy path that you lay out above. I don't know if there's an answer besides giving up on CI altogether.
jillesvangurp
I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers. If it gets complicated, dumb it down to "run this script". Scripts are a lot easier to write and debug than thousands of lines of yaml doing god knows what.
The problem isn't GitHub Actions but people overloading their build and CI system with all sorts of custom crap. You'd have had a hard time doing the same thing twenty years ago with Ant and Hudson (Jenkins before the fork, after Oracle inherited it from Sun). And for the same reason: these systems simply aren't very good as a bash replacement.
If you don't know what Ant is. That was a popular build system for Java before people moved the problem to Maven and then to Gradle (without solving it). I've dealt with Maven files that were trying to do all sorts of complicated things via plugins that would have amounted to two or three lines of bash. Gradle isn't any better. Ant at least used to have simple primitives for "doing" things. But you had to spell it out in XML form.
The point of all this, is that build & CI systems should mainly do simple things like building software. They shouldn't have a lot of conditional logic, custom side effects, and wonky things that may or may not happen depending on the alignment of the moon and stars. Debugging that stuff when it fails to work really sucks.
What helps with Yaml is using Yaml generators. I've used a Kotlin one for a while. Basically, you get auto complete, syntactical sanity, type checking and if it compiles it runs. Also makes it a lot easier to discover new parameters, plugin version updates, etc.
motorest
> I use GH actions. You should treat it like all build systems: let them do what they are good at and nothing else. The rest should be shell scripts or separate docker containers.
That's supposedly CICD 101. I don't understand why people in this thread seem to be missing this basic fact and instead they vent about irrelevant things like YAML.
You set your pipeline. You provide your own scripts. If a GitHub Action saves you time, you adopt it instead of reinventing the wheel. That's it.
This whole discussion reads like the bike fall meme.
int_19h
If the sole purpose of GitHub Actions is to run a few shell scripts in order, why does it have expression evaluation, conditions, and dozens of stock actions other than `run`?
pepoluan
People hate YAML because doing so makes them look cool and trendy. Just like Python-hating. Even if their 'hate' is misdirected.
I'm an experienced SaltStack user. If I find that something I need is too complex to be described in YAML, I'll just write a custom module and/or state. Use YAML just to inform Salt what should happen, and shove the logic into the Python files.
People really should become generalists if they handle the plumbing.
anonzzzies
We see inside quite a lot of organisations because of the business we're in, and, while this usually isn't our task, when I hear these stories and see people struggle with devops stuff in reality, the first thing we push for is to dumb it down and remove all the dependencies on 3rd-party providers, so we are back to having everything run again like, in this case, the hello world of GitHub Actions.
It is literally always the case that the people who complain have this (very HN, so funny you say that) thing of absolutely, grossly overarchitecting and writing things that are just there because they read about them on HN/some subreddits/Discord. We sometimes walk into struggling teams where we check the commits/setup only to find out they did things like switch package manager/bundler/etc 5x in the past year (this is definitely an HN thing, where a new package manager for JS pops up every 14 minutes). Another terrible thing, looking at 10+ year codebases: we see js, ts, py, go, rust, and when we ask wtf, they tell us something something performance. Of course the language was never the bottleneck of these (mostly LoB) apps; people here would be pretty scared to see how bad database setups are, even for multi-million-$ departmental or enterprise-wide projects; the DBAs in the basement know, but they are not consulted for various reasons.
And the same happens with devops. We only work for large companies, almost never startups, and these issues are usually departmental (because big bad Java/Oracle IT in the basement doesn't allow anything, so they have budgets to do their own), but still, it's scary how much money is being burnt on these lame new things that won't survive anyway.
IshKebab
Sounds like you have the same pain points as everyone else; you're just more willing to ignore them.
I am with the author - we can do better than the status quo!
tobinfekkes
I guess it's possible. But I also don't really have anything to ignore....? I genuinely never have an issue; it builds code, every time.
I commit code, push it, wait 45 seconds, it syncs to AWS, then all my sites periodically ping the S3 bucket for any changes, and download any new items. It's one of the most reliable pieces of my entire stack. It's comically consistent, compared to anything I try building for a mobile app or pushing to a mobile app store.
I look forward to opening my IDE to push code to the Actions for my web app, and I dread the build pipeline for a mobile app.
IshKebab
> I genuinely never have an issue; it builds code, every time.
Well yeah because nobody is saying it isn't reliable. It's the setup stage that is painful. Once you've done it you can just leave it mostly.
I guess if your CI is very simple and always the same you are exposed to these issues less.
michaelmior
> I dread the build pipeline for a mobile app.
I would recommend looking at Fastlane[0] if you haven't already.
dkdbejwi383
The pain points sound pretty trivial though.
You notice a deprecation warning in the logs, or an email from GitHub and you make a 1 line commit to bump the node version. Easy.
Sure you can make typos that you don’t spot until you’ve pushed and the action doesn’t run, but I quickly learned to stop being lazy and actually think about what I’m writing, and get someone else to do an actual review (not just scroll down and up and give it a LGTM).
My experience is the same as the commenter above: it's relatively set and forget. A few minutes of setup work for hours and hours of benefit over years of builds.
ironmagma
The non-solution solution, to simply downplay the issues instead of fixing them. You can solve almost anything this way, but also isn't it nice when things around you aren't universally slightly broken?
raffraffraff
It probably depends on your org size and how specialised you are. Right now I dislike GitHub Actions and think that GitLab CI is way better, but I also don't give it too much thought because it's a once-in-a-blue-moon task for me to mess with them. But I would absolutely hate to be a "100% DevOps guy" for a huge organisation that wants me to specialise in this stuff all the time. I think that by the end of week 1 I'd go mad.
Marsymars
I don't mind it per se; to me the problem is then that some devs don't bother with basic debugging steps of CI failures - if anything works locally and fails in CI, their first step is to message me - so instead of being "100% DevOps" I spend a pile of time debugging other devs' local environments.
thom
Unless I'm misunderstanding, you can use workflow_dispatch to avoid having to make useless commits to trigger actions.
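For reference, the trigger is a small addition to the workflow's `on:` block (the branch filter shown is just an example):

    on:
      push:
        branches: [main]
      workflow_dispatch: {}   # manual runs from the Actions tab, or: gh workflow run ci.yml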
duped
I have a small gripe that I think exemplifies a bigger problem. actions/upload-artifact strips executable permissions from binaries (1). The fact they fucked this up in the first place, and six years later haven't fixed it, gives me zero confidence in the team managing their platform. And when I'm picking a CI/CD service, I want reliability and correctness. GH has neither.
When it takes all of a day to self host your own task runner on a laptop in your office and have better uptime, lower cost, better performance, and more correct implementations, you have to ask why anyone chooses GHA. I guess the hello-world is convincing enough for some people.
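For what it's worth, the common workaround is to tar the outputs before handing them to the action, since the artifact zip round-trip is what drops the mode bits (names here are illustrative):

    - run: tar -cf dist.tar -C dist .        # tar preserves the executable bit
    - uses: actions/upload-artifact@v4
      with:
        name: dist
        path: dist.tar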
chanux
You must have simple, straightforward flow touched only by a handful of folks max.
The world is full of kafkaesque nightmares of DevOps pipelines "designed" and maintained by committees of people.
It's horrible.
That said, for some personal stuff I have Google Cloud Build, which has a very VERY simple flow. Fire, forget, and it's been good.
eru
You might like 'git commit --allow-empty' to make your somewhat-useless commits.
But honestly, doesn't github now have a button you can press to retrigger actions without a commit?
GitHub Actions are the least hassle when you don't care about how much compute time you are burning through. Either because you are using the free-for-open-source-repositories version, or because your company doesn't care about the cost.
If you care about the compute time you are burning, then you can configure them enough to help with that, but it quickly becomes a major hassle.
xlii
There is one thing that I haven’t seen mentioned: worst possible feedback loop.
I've noticed this phenomenon a few times already, and I think there's nothing worse than having a 30-60s feedback loop. The one that keeps you glued to the screen but otherwise is completely nonproductive.
I tried for many moons to replicate the GHA environment locally and it's impossible in my context. So every change is like „push, wait for GH to pick up, act on some stupid typo or inconsistency, rinse, repeat".
It’s like a slot machine „just one more time and it will run”, eating away focus and time.
It took me 25 minutes to get a 5s build process. A naive build with GHA? 3 minutes, because of dependencies et al. Ok, let's add caching. 10 hours fly by.
The cost of failure and focus drop is enormous.
kelseydh
Feel this pain so much. If you are debugging Github Action container builds, and each takes over ~40 minutes to build.. you can burn through a whole work day only testing six or seven changes.
There has to be a better way. How has nobody figured this out?
elAhmo
There is act, that allows you to run actions locally. Although not exactly the same as the real thing, it can save time.
mab122
In organization setting this is almost useless if you are (or forced to) use some pre-made actions and/or actions that are for your organization only (they cannot be downloaded) also useless if you are forced to use self hosted runner with image that you don't have access to. Not to mention env/secrets and networking...
terminalbraid
This is a great tool, but I always cringe when something so important comes from a third party
cantagi
act is brilliant - it really helps iterate on github or gitea actions locally.
esafak
There's dagger; CI as code. Test your pipeline locally, in your IDE.
hv42
With GitLab, I have found https://github.com/firecow/gitlab-ci-local to be an incredible time-saver when working with GitLab pipelines (similar to https://github.com/nektos/act for GitHub)
I wish GitLab/GitHub would provide a way to do this by default, though.
cantagi
act is great. I use it to iterate on actions locally (I self-host gitea actions, which uses act, so it's identical to github actions).
lsuresh
This is exactly a big piece of our frustration -- the terrible feedback loop and how much mental space it wastes. OP does talk about this at the end (babysitting the endless "wip" commits till something works).
figmert
Highly recommend nektos/act, and if it's something complex enough, you can SSH into the server to investigate. There are many actions that facilitate this.
tomjakubowski
I use LLMs for a lot of things these days, but maybe the most important one is as a focus-preserving mechanism for exactly these kinds of middle-ground async tasks that have a feedback loop measured in a handful of minutes.
If the process is longer than a few minutes, I can switch tasks while I wait for it. It's waiting for those things in the 3-10 minute range that is intolerable for me: long enough I will lose focus, not long enough for me to context switch.
Now I can bullshit with the LLM about something related to the task while I wait, which helps me to stay focused on it.
silisili
I worked at companies using Gitlab for a decade, and got familiar with runners.
Recently switched to a company using Github, and assumed I'd be blown away by their offering because of their size.
Well, I was, but not in the way I'd hoped. They're absolutely awful in comparison, and I'm beyond confused how it got to that state.
If I were running a company and had to choose between the two, I'd pick Gitlab every time just because of Github actions.
yoyohello13
Glad I’m not the only one. GitLab runners just make sense to me. A container you run scripts in.
I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.
briansmith
Actions have special integration with GitHub (e.g. they can annotate the pull request review UI) using an API. If you forgo that integration, then you can absolutely use GitHub Actions like "a container you run scripts in." This is the advice that is usually given in every thread about GitHub Actions.
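And even a plain script step can still surface annotations through workflow commands, so the "container you run scripts in" approach doesn't have to give all of that up. A tiny sketch (the lint script, file, and line are made up):

    - name: Lint
      run: |
        # a failing check can emit a PR annotation and still fail the step
        if ! ./ci/lint.sh; then
          echo "::error file=src/app.ts,line=1::lint failed"
          exit 1
        fi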
byroot
That helps a bit but doesn't solve everything.
If you want to make CI performant, you'll need to use some of its features like caches, parallel workers, etc. And GHA usability really falls short there.
The only reason I put up with it is that it's free for open source projects and integrated into GitHub, which is why it took over from Travis CI a few years ago.
mubou
Devil's advocate: They could make the github CLI capable of doing all of those things (if it's not already), and then the only thing the container needs is a token.
HdS84
There are lots of problems. Actions try to abstract the script away, give you a consistent experience and, most crucially, allow sharing. Because GitLab has no real way to share actions or workflows (I can do YAML includes, but come on, that sucks even harder than Actions) you are constantly reinventing the wheel. That's ok if all you do is "build folder", but if you need caching, reporting of issues, code coverage etc. pp. it gets real ugly really fast. Example: yesterday I tried services, i.e. starting up some DB and backend containers to run integration tests against. Unfortunately, you cannot expand dynamic variables (set by previous containers) but are limited to already-set vars. So back to docker compose... and the GitLab pipelines are chock full of such weird limitations.
kroolik
You can apply dynamic env to other jobs by exporting an env file as a dotenv artifact. So the first job creates a dotenv file and exports it as an artifact. The second depends on the first so it can consume the artifact. https://docs.gitlab.com/ci/yaml/artifacts_reports/#artifacts...
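Roughly like this (job names and the version variable are illustrative):

    build:
      stage: build
      script:
        - echo "APP_VERSION=$(git describe --tags --always)" >> build.env
      artifacts:
        reports:
          dotenv: build.env
    test:
      stage: test
      needs: [build]
      script:
        - echo "testing $APP_VERSION"   # injected from the dotenv report of the build job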
raffraffraff
I haven't looked too much into how sharing workflows works, but isn't the use of shared GitHub workflows (from outside your org) a little dangerous? I get it, we use other people's code all the time. Some we trust more (ISO of a Linux OS with SHA) and others we trust a little less even if it comes from a verified source with GPG, because we know that supply chain attacks can happen.
Every time someone introduced a new way to use someone else's shared magic I feel nervous about using it. Like GitHub Actions. Perhaps it's time for me to dig into them a bit more and try to understand if/how they're safe to use. But I seem to remember just a few days ago someone mentioning a GitHub action getting hijacked?
daveau
they have this now: https://docs.gitlab.com/ci/components/
usr1106
So GitHub was really the perfect acquisition for the Microsoft portfolio. Applications with a big market share that are technically inferior to the competition.
// Luckily still a gitlab user, but recently forced to Microsoft Teams and office.
out-of-ideas
> recently forced to Microsoft Teams
my condolences to you and your team for that switch; it's my 2nd used-and-disliked thing (right next to atlassian) - oh well
but one cool feature i found with ms teams that zoom did not have (some years ago - no clue now) is turning off incoming video so you dont have to be constantly distracted in meetings
edit: oh yeah, re github actions and the user that said: > Glad I’m not the only one
me too, me too; gh actions seem frustrating (from a user hardly using gh actions, and more gitlab things - even though gitlab seems pretty wonky at times, too)
rhubarbtree
Technical superiority is so irrelevant compared to distribution. Welcome to capitalism, where the market rewards marketing.
OJFord
> I have some GitHub actions for some side projects and it just seems so much more confusing to setup for some reason.
Because the docs are crap perhaps? I prefer it, having used both professionally (and Jenkins, Circle, Travis), but I do think the docs are really bad. Even just the nesting of pages once you have them open, where is the bit with the actual bloody syntax reference, functions, context, etc.
globular-toast
Same. I'd been using Gitlab for a few years when Actions came out. Looked at it and thought, wow that's weird, but gave it the benefit of the doubt as it's just different, surely it would make sense eventually. Well no, it doesn't make sense, and seeing all the shocked Pikachu at the action compromise the other day was amusing.
zamalek
> I'm beyond confused how it got to that state.
A few years back I wanted to throw in the towel and write a more minimal GHA-compatible agent. I couldn't even find where in the code they were calling out to GitHub APIs (one goal was to have that first party progress UI experience). I don't know where I heard this, so big hearsay warning, but apparently nobody at GitHub can figure it out either.
jalaziz
GitHub Actions started off great as they were quickly iterating, but it very much seems that GitHub has taken its eye off the ball and the improvements have all but halted.
It's really upsetting how little attention Actions is getting these days (<https://github.com/orgs/community/discussions/categories/act...> tells the story -- the most popular issues have gone completely unanswered).
Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.
On a related note, if you're considering https://www.blacksmith.sh/, you really should consider https://depot.dev/. We evaluated both but went with Depot because the team is insanely smart and they've solved some pretty neat challenges. One of the cooler features is that their caching works with the default actions/cache action. There's absolutely no need to switch out popular third party actions in favor of patched ones.
shykes
> Sad to see Earthly halting development and Dagger jumping on the AI train :(. Hopefully we'll get a proper alternative.
Hi, Dagger CEO here. We're advertising a new use case for Dagger (running AI agents) while continuing to support the original use case (running complex builds and tests). Dagger has always been a general purpose engine, and our community has always used it for more than just CI. It's still the exact same engine, CLI, SDKs and observability stack. It's not like we're discontinuing a product, to the contrary: we're getting more workloads on the platform, which benefits all our users.
jalaziz
Great to know. I think the fear is that so many companies are prioritizing AI workloads for the valuation bump rather than delivering actual meaningful value.
shykes
I completely understand that fear. I see lots of other tech companies making that mistake, throwing away a perfectly good product and market out of pure "FOMO". I really, really don't want us to be one of those companies.
I think what we're doing is different: we built a product that was always meant to be general purpose; encouraged our community to experiment with alternative use cases; and are now doubling down on a new use case, for the same product. We are still worried about the perception of a FOMO-driven AI pivot (and the reactions on this thread confirm that we still have work to do there); but we're confident that the product really is capable of supporting both.
Thank you for the thoughtful comments, I appreciate it.
SamuelAdams
A lot of GH actions teams were impacted by layoffs in November.
Example:
https://github.com/actions/runner/pull/2477#issuecomment-244...
mike_hearn
Presumably the issue is that GH underpriced Actions such that it's not worth improving because driving more usage won't drive revenue, and that then forced prices down for everyone else because everyone fixed on the Actions pricing.
pinkgolem
I might have missed the news, but I did not find anything in regards to earthly stopping development
What happened there?
jalaziz
I missed it too, but then found this: https://github.com/earthly/earthly/issues/4313
12_throw_away
Sigh, this is awful. Earthly is/was not perfect, but is basically the most capable build tool I've ever used. Fingers crossed there's enough enthusiasm in the community to fork it (I'd be organizing it myself if I had any experience with Go at all)
pimeys
We switched to Depot last week. Our Rust builds went down from 20+ minutes to 4-8 minutes. The easy setup and their docker builds with fast caching are really good.
lsuresh
This sounds promising. What made your Rust builds become that fast? Any repo you could point us to?
jalaziz
Check out this Dockerfile template if you're building Rust in Docker: https://depot.dev/docs/container-builds/how-to-guides/optima...
What makes Depot so fast is that they use NVMe drives for local caching and they guarantee that the cache will always be available for the same builders. So you don't suffer from the cold-start problem or having to load your cache from slow object storage.
suryao
If you're building rust containers, we have the world's fastest remote container builders with automated caching.
You wouldn't really have to change anything on your dockerfile to leverage this and see significant speed up.
The docs are here: https://docs.warpbuild.com/docker-builders#usage
solatic
> Trivial mistakes (formatting, unused deps, lint issues) should be fixed automatically, not cause failures.
Do people really consider this best practice? I disagree. I absolutely don't want CI touching my code. I don't want to have to remember to rebase on top of whatever CI may or may not have done to my code. Not all linters are auto-fixable, so some of the time I would need to fix it from my laptop regardless. If it's a trivial check it should run as a pre-commit hook anyway. What's next, CI should run an LLM to auto-fix failing test cases?
Do people actually prefer CI auto-fixing anything?
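For the "trivial checks belong in a pre-commit hook" point, a minimal .pre-commit-config.yaml sketch (the hooks and revs shown are illustrative; pick whatever matches your stack):

    repos:
      - repo: https://github.com/pre-commit/pre-commit-hooks
        rev: v4.6.0
        hooks:
          - id: trailing-whitespace
          - id: end-of-file-fixer
      - repo: https://github.com/astral-sh/ruff-pre-commit
        rev: v0.4.4
        hooks:
          - id: ruff
            args: [--fix]

Run `pre-commit install` once and the fixes happen on the developer's machine, before CI ever sees the commit.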
thedougd
I think this is where things went off the rails for him. Committing back to the same branch that is running CI has too many gotchas in any CI system. You touched on the first issue: the remote branch immediately deviates unexpectedly from the local branch. Care has to be taken not to trigger additional CI runs from that commit.
stared
I do such things with pre-commit.
Doing it in CI sounds like making things more complicated, with resets to remote branches after pushing commits. And, in the worst case, something that actually breaks code that works locally.
Marsymars
I have team members who complain that installing and running pre-commit is too much overhead, so instead I see them pushing commit after broken commit that tie up CI resources to fail on the pre-commit workflow. :(
michpoch
> I have team members who complain that installing and running pre-commit is too much overhead
Why do they have a say in this? This is up to tech leadership to set standards that need to be followed.
anbotero
I've had people like this. I'm with the other commenter: why do they have a say in this? No way I'm letting them decide each day when to format, or what style to format to... Meet, discuss, pick a style, enforce formatting, screw you if you don't follow.
I'm also with the other commenter about setting these things up at the editor level, but also at the pre-push level.
We benchmark how long it takes to format/lint only changed files, usually no more than a second, maybe two, but I admit for some languages this may take more. An editor with a language server properly set up would have helped you find issues earlier.
We also have reports for our CI pipeline linters, so if we see more than 1 report there, we sent a message to the team: It means someone didn’t setup their editors nor their git hooks.
If the checks take more than a second, yeah, probably pre-commit is not the place/moment. Reliability is important, but so is user experience. I had companies where they ran the unit test suite at the pre-commit level, alright? And that is NOT fine. While it sounds like it’ll find issues earlier, it’ll screw your developer time if they have to wait seconds/minutes each time they fix a comma.
tenacious_tuna
I'm one of these; I'm loath to put anything between me and making a commit, and most of our linters take several dozen seconds to run. That's unacceptable UX to me; I can skip the hooks with `--no-verify`, but it's always annoying to remember that when the thing I most want to do is save my working state.
I'd rather have linting pushed into the editing process, within my IDE/VS Code/vim plugins, whathaveyou, where it can feedback-loop with my actual writing process and not just be some ancillary command I run with lots of output I never read.
llm_nerd
That part immediately made me short circuit out of the piece. That sounds like a recipe for disaster and an unnecessary complexity that just brings loads of new failure modes. Not a best practice.
Trivial mistakes in PRs are almost always signs of larger errors.
ben_pfaff
I'm new to CI auto-fixes. My early experience with it is mixed. I find it annoying that it touches my code at all, but it does sometimes allow a PR to get further through the CI system to produce more useful feedback later on. And then a lot of the time I end up force-pushing a branch that is revised in other ways, in which case I fold in whatever the CI auto-fix did, either by squashing it in or by applying it in some other way.
(Most of the time, the auto-fix is just running "cargo fmt".)
kylegalbraith
This was an interesting read and highlighted some of the author's top-of-mind pain points and rough edges. However, in my experience, this is definitely not an exhaustive list, and there are actually many, many, many more.
Things like 10 GB cache limits in GitHub, concurrency limits based on runner type, the expensive price tag for larger GitHub runners, and that's before you even get to the security ones.
Having been building Depot[0] for the past 2.5 years, I can say there are so many foot guns in GitHub Actions that you don't realize until you start seeing how folks are bending YAML workflows to their will.
We've been quite surprised by the `container` job. Namely, folks want to try to use it to create a reproducible CI sandbox for their build to happen in. But it's surprisingly difficult to work with. Permissions are wonky, Docker layer caching is slow and limited, and paths don't quite work as you thought they did.
With Depot, we've been focusing on making GitHub Actions exponentially faster and removing as many of these rough edges as possible.
We started by making Docker image builds exponentially faster, but we have now brought that architecture and performance to our own GHA runners [1]. We've built up and optimized the compute and processes around the runner to make jobs extremely fast, for example making caching 2-10x faster without having to replace or use any special cache actions of ours. Our Docker image builders are right next door on dedicated compute with fast caching, which makes the `container` job a lot better: we can build the image quickly, and then you can use that image right from our registry in your build job.
All in all, GHA is wildly popular. But the sentiment, even among its biggest fans, is that it could be a lot better.
SkiFire13
By what measure is this "exponentially faster"? Surely GH doesn't take an exponential time in the number of steps of the workflow...
magicalhippo
Depot looks nice, but also looks fairly expensive to me. We're a small B2B company, just 10 devs, but we'd be looking at 200+500 = $700/mo just for building and CI.
I guess that would be reasonable if we really needed the speedup, but if you're also offering a better QoL GHA experience, then perhaps offer another tier for people like us who don't necessarily need the blazing speed?
suryao
You might want to check out my product, WarpBuild[0].
We are fully usage based, no minimums etc., and our container builders are faster than others on the market.
We also have a BYOC option that gives a 10x cost reduction and is used by many customers at scale.
kylegalbraith
We're rolling out new pricing in the next week or two that should likely cover your use case. Feel free to ping me directly, email in my bio, if you'd like to learn more.
axelfontaine
At https://sprinters.sh we offer AWS-hosted runners at a price point that will be much more suitable for a company like yours.
Aeolun
Depot is fantastic. Can heavily recommend it. It’s like magic when your builds suddenly take 1m instead of 5+ just by switching the runner.
tasuki
> Things like 10 GB cache limits in GitHub
10,000,000,000 bytes should be enough for anyone! It really is a lot of bytes...
hn_throwaway_99
> A few days ago, someone compromised a popular GitHub Action. The response? "Just pin your dependencies to a hash." Except as comments also pointed out, almost no one does.
I used GitHub actions when building a fin services app, so I absolutely used the hash to specify Action dependencies.
I agree that this should be the default, or even the required, way to pull in Action dependencies, but saying "almost no one does" is a pretty lame excuse when talking about your own risk. What other people do has no bearing on your options here.
Pin to hashes when pulling in Actions - it's much, much safer
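Concretely, pinning means putting the full commit SHA in `uses:` instead of a mutable tag; the SHA below is a made-up placeholder, and the trailing comment is just there so humans can still see which version it corresponds to:

    steps:
      - uses: actions/checkout@1111111111111111111111111111111111111111  # v4 (placeholder SHA)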
dijit
I think the HN community at large had a bit of a learning experience a couple of days ago.
"Defaults matter" is a common phrase, but equally true is: "the pattern everyone recommends including example documentation matters".
It is fair to criticise the usage of GH Actions, just like it's fair to criticise common usage patterns of MySQL that eat your data - even if smarter individuals (who learn from deep understanding, or from being burned) can effectively make correct decisions, since the population of users are so affected and have to learn the hard way or be educated.
hn_throwaway_99
I wholeheartedly agree, and perhaps it was just how I was interpreting the author's statement in the article. If it's saying that the "default" way of using GitHub Actions is dangerous and leads to subtle security footguns, I completely agree. But if you know the proper way to use and secure Actions, saying "everyone else does it a bad way" is irrelevant to your security posture.
gazereth
Pinning dependencies is trading one problem for another.
Yes, your builds will work as expected for a stretch of time, but that period will come to an end, eventually.
Then one day you will be forced to update those pinned dependencies and you might find yourself having to upgrade through several major versions, with breaking changes and knock-on effects to the rest of your pipelines.
Allowing rolling updates to dependencies helps keep these maintenance tasks small and manageable across the lifetime of the software.
StrLght
You don’t have to update them manually. Renovate supports pinned GitHub Actions dependencies [1]. Unfortunately, I don’t use Dependabot so can’t say whether it does the same.
Just make sure you don’t leak secrets to your PRs. Also, I usually review the changes in updated actions before merging them. It doesn’t take that much time; so far I’ve been perfectly fine with doing that.
[1]: https://docs.renovatebot.com/modules/manager/github-actions/...
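If I recall the preset names correctly, enabling this is roughly a one-liner in renovate.json (treat the exact preset names as something to double-check against the linked docs):

    {
      "extends": [
        "config:recommended",
        "helpers:pinGitHubActionDigests"
      ]
    }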
chuckadams
Dependabot does support pinned hashes, even adds the comment after them with the tag. Dependabot fatigue is a thing though, and blindly mashing "merge" doesn't do much for your security, but at least there's some delay between a compromise and your workflow being updated to include it.
baq
Not pinning dependencies is an existential risk to the business. Yes, it's a tradeoff; you must assign your own probability of any dependency being hijacked in your timeframe, but it is not zero.
tasuki
I don't think others were necessarily talking about "business".
Though, yes, I prefer pinning dependencies for my personal projects. I don't see why things should break when I explicitly keep them the same.
kevincox
That isn't even the biggest problem. That breaks, and breakage gets fixed. Other than some slight internal delays there is little harm done. (You have a backup emergency deploy process that doesn't depend on GitHub anyways right?)
The real problem is security vulnerabilities in these pinned dependencies. You end up making a choice between:
1. Pin and risk a malicious update.
2. Don't pin and have your dependencies get out of date and grow known security vulnerabilities.
progbits
But there is no transitive locking like package manager lockfiles. So if I depend on good/foo@hash, but it depends on bad/hacked@v1 and v1 gets moved to a malicious version, I get screwed.
That's for composite actions. For JS actions, what if they don't lock their dependencies but pull whatever the newest package is at action setup time? Same issue.
I would have to transitively fork everything, pin it myself, and then keep it updated.
smpretzer
I have been using renovate, which automatically pins, and updates, hashes. So I can stay lazy, and only review the new hash when a renovate PR gets opened: https://docs.renovatebot.com/modules/manager/github-actions/...
ruuda
To make sure that you can test CI locally, the best way I've found so far is to make sure the checks can run with Nix, and then keep the CI config itself as simple as possible and just call Nix.
As for reducing boilerplate in the CI configs, GitHub Actions is a programming language with support for functions! It's just that function calls can only appear in very limited places in the program (only inside `steps`), and to define a function, you have to create a Git repository. The function call syntax is also a bit unusual, it's written with the `uses` keyword. So there is a lot of boilerplate that you can't remove this way, though there are several other yaml eDSLs hidden in GitHub Actions that address some points of it. E.g. you can create loops with `matrix`, but again, not general-purpose loops, they can only appear in a very specific syntactic location.
To really deduplicate stuff, rather than copy-pasting blocks of yaml or using a mix of these special yaml eDSLs, in the past I've used Nix and Python to generate json. Now I'm using RCL for this (https://rcl-lang.org). All of them are general-purpose yaml deduplicators, where you can put loops or function calls anywhere you want.
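As a rough illustration of that generate-the-config approach (using Python only because it's mentioned above; the file name and the cargo commands are just examples). Since YAML is a superset of JSON, the generated JSON can be written straight into a workflow file:

    import json

    # Build one job per check instead of copy-pasting YAML blocks.
    checks = ["fmt", "clippy", "test"]
    workflow = {
        "on": ["push"],
        "jobs": {
            name: {
                "runs-on": "ubuntu-latest",
                "steps": [
                    {"uses": "actions/checkout@v4"},
                    {"run": f"cargo {name}"},
                ],
            }
            for name in checks
        },
    }

    with open(".github/workflows/checks.yml", "w") as f:
        json.dump(workflow, f, indent=2)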
duijf
> It's just that function calls can only appear in very limited places in the program (only inside `steps`), and to define a function, you have to create a Git repository.
FYI there is also `on: workflow_call` which you can use to define reusable jobs. You don't have to create a new repository for these.
https://docs.github.com/en/actions/writing-workflows/workflo...
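A rough sketch of what that looks like (file names are hypothetical): the callee declares `on: workflow_call`, and the caller references it with `uses:` at the job level rather than the step level.

    # .github/workflows/tests.yml (the reusable workflow)
    on:
      workflow_call:
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: cargo test

    # .github/workflows/ci.yml (the caller)
    on: [push]
    jobs:
      tests:
        uses: ./.github/workflows/tests.yml
        secrets: inherit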
mcqueenjordan
Usually if you’re using it, it’s because you’re forced to.
In my experience, the best strategy is to minimize your use of it — call out to binaries or shell scripts and minimize your dependence on any of the GHA world. Makes it easier to test locally too.
sepositus
This is what I do. I've written 90% of the logic into a Go binary and GitHub Actions just calls out to it at certain steps. It basically just leaves GHA doing the only thing it's decent at...providing a local UI for pipelines. The best part is you get unit tests, can dogfood the tool in its own pipeline, and can run stuff locally (by just having the CLI nearby).
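The workflow side of that ends up being almost trivial; something like this sketch (cmd/ci and its subcommands are made up for illustration):

    jobs:
      ci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-go@v5
            with:
              go-version: "1.22"
          - run: go run ./cmd/ci lint
          - run: go run ./cmd/ci test
          - run: go run ./cmd/ci build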
noisy_boy
Makes migrations easier too; better to let GitHub or GitLab etc. just be the platform that hosts source code and triggers events, which you decide how to handle. Your CI itself should be another source-controlled repo that provides the features for the application code's thin CI layer to invoke and use. That also allows you to run your CI locally in a pretty realistic manner.
I have done something similar with Jenkins and a Groovy CI library used by a Jenkins pipeline. But it wasn't super simple since a lot of it assumed Jenkins. I wonder if there is a cleaner open source option that doesn't assume any underlying platform.
raffraffraff
> Usually if you’re using it, it’s because you’re forced to.
Like teams.
0xbadcafebee
I have used Travis, CircleCI, GitHub Actions, GitLab Pipelines, AWS CodeBuild/CodeDeploy, Bazel, Drone, GoCD, and Jenkins. And I have used GitLab, GitHub, and Bitbucket for hosting VCS files. (I'm the guy who manages this crap for a living, so I have used it all extensively, from startups to enterprises)
GitHub Actions is the worst possible CI platform - except for all the others. Every single CI platform has weird limitations, missing features, gotchas, footguns, pain points. Every single one requires workarounds, leaves you tearing your hair out, banging the table trying to figure out how to do something that should be simple.
Of all of them I've tried, Drone is the platonic ideal of the best, simplest, most generally useful system. It is limited. But that limitation is usually easy to work around and doesn't impose artificial constrictions. However, you won't find nearly as many canned solutions or plugins as GitHub Marketplace, and the enterprise features are few.
GHA is great because of things like Dependabot, and the million canned Marketplace actions, and it's all tightly integrated with GH's features, so you don't have to work hard to get anything advanced or specific to work. Tight integration can save you weeks to months of development time on a CI solution. I've literally seen teams throw out versioning of dependencies entirely because they weren't updating their dependencies, because there's no Dependabot orb for CircleCI. If they had just been on GHA using Dependabot it would have saved them literal years of headaches.
Jenkins is, ironically, both the most full-featured, and the absolute worst to configure/maintain. Worst design, worst security, worst everything... except it does have a plugin for everything, and a UI for everything. I hate it with the fire of a million suns. But people won't stop using it, partially because it's so goddamn configurable, and they learned it years ago and won't stop using it. If anyone wants to write a replacement, I'm happy to help (I even wrote a design doc!).
tech_tuna
It's funny, I've used them all too. . . I like GHA overall but it sure has its quirks.
Anyone who claims that GHA is garbage and any of the others are amazing is either doing something very basic or is crazy, or lying.
At the end of the day, you run shell scripts and commands using a YAML based config language (except for Jenkins). Amazingly, it's hard to build something that does that with the right abstractions and compromises between flexibility and good hygiene.
Tainnor
> GHA is great because of things like Dependabot [...] so you don't have to work hard to get anything advanced or specific to work.
That may have been true before GitHub decided that PRs can't access repository secrets anymore. Apparently now you can at least add these secrets to Dependabot too (which is still duplicate effort for setup and any time you rotate secrets), but at the time when the change was introduced there were only weird workarounds.
ThomasRooney
> A few days ago, someone compromised a popular GitHub Action. The response? "Just pin your dependencies to a hash." Except as comments also pointed out, almost no one does.
I'm surprised nobody has mentioned dependabot yet. It automates this, keeping action dependencies pinned by hash automatically whilst also bringing in stable upgrades.
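For completeness, turning that on is a small `.github/dependabot.yml` (the standard github-actions ecosystem config; adjust the schedule to taste):

    version: 2
    updates:
      - package-ecosystem: "github-actions"
        directory: "/"
        schedule:
          interval: "weekly"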
huijzer
Well but that’s the problem. You cannot fully automate this. You have to manually check the diff of each dependency and only accept the dependabot PR if the changes are safe.
The only automation that I know of is cargo vet. Although it doesn’t work for GitHub Actions, the idea sounds useful. Basically, vet allows people who trust each other to vet updates. So one person verifies the diff and then approves the changes. Next, everyone who trusts this person can update the dependency automatically since it has been “vetted”.
hinkley
Dependabot is only approximately as good as your tests. If you have holes in your testing that you can drive a bus through, you're gonna have a bad time.
We also, to your point, need more labels than @latest. Most of the time I want to wait a few days before taking latest, and if there have been more updates since that version, I probably don't want to touch anything for a little bit.
Common reason for 2 releases in 2 days: version 1 has a terrible bug in it that version 2 tries to fix. But we won't be certain about that one either until it's been a few more days with no patch for the patch for the patch.
presentation
Wasn’t part of the problem though that renovate was automatically upgrading people to the compromised hash? Or is that just the fault of people configuring it to be too aggressive with upgrades?
Arbortheus
No, someone just impersonated renovate bot and the repo author got tricked
Already see people saying GitLab is better: yes it is, but it also sucks in different ways.
After years of dealing with this (first Jenkins, then GitLab, then GitHub), my takeaway is:
* Write as much CI logic as possible in your own code. Does not really matter what you use (shell scripts, make, just, doit, mage, whatever) as long as it is proper, maintainable code.
* Invest time so that your pipelines can also run locally on a developer machine (as much as possible, at least), otherwise testing/debugging pipelines becomes a nightmare (see the sketch after this list).
* Avoid YAML as much as possible, period.
* Don't bind yourself to some fancy new VC-financed thing that will solve CI once and for all but needs to get monetized eventually (see: earthly, dagger, etc.)
* Always use your own runners, on-premise if possible
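As a sketch of those points taken together (file names and commands are hypothetical): keep the logic in a script you can run identically on a laptop and in CI, and let the workflow be a thin trigger.

    #!/usr/bin/env bash
    # ci/test.sh -- runs the same way locally and in CI
    set -euo pipefail
    cargo fmt --check
    cargo clippy -- -D warnings
    cargo test

    # .github/workflows/ci.yml -- thin trigger, self-hosted runner per the last point
    on: [push]
    jobs:
      test:
        runs-on: [self-hosted]
        steps:
          - uses: actions/checkout@v4
          - run: ./ci/test.sh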