pipeline_peak
…yeah I’d be worried, except that 90% of us weren’t hired to write performant code. We were hired to perform well at writing good-enough code.
kadushka
This is a Codeflash ad.
csomar
tl;dr: LLMs struggle to write performant/correct code unless you use Codeflash. The "trick" is that you run tests/benchmarks against the generated code. It is not clear whether Codeflash does that for you, or whether you have to write your own tests/benchmarks and Codeflash then runs a loop trying different versions against them.
misrasaurabh1
Hi, I am the creator of Codeflash. The idea is to automate everything a professional performance engineer would do. Codeflash analyzes your code, creates correctness tests and performance benchmarks, and executes them to establish the behavior and performance of your code. Alongside this, it also profiles your original code and measures its line coverage. It then uses all this information to generate multiple candidate optimizations and applies them one by one to check whether each is a genuine optimization, i.e. both correct and more performant. If so, it opens a Pull Request with all the details for the developer's review.
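This is not Codeflash's actual implementation, but a minimal sketch of the accept-only-if-correct-and-faster loop described above: each candidate is checked against the reference function's behavior, timed, and kept only if it both matches and wins on wall-clock time. All names here (`pick_best`, the toy candidates) are illustrative, not part of any real API.

```python
import time
from typing import Callable

def is_correct(candidate: Callable, reference: Callable, test_inputs) -> bool:
    """Behavioral check: candidate must match the reference on every input."""
    return all(candidate(x) == reference(x) for x in test_inputs)

def benchmark(fn: Callable, test_inputs, repeats: int = 100) -> float:
    """Total wall-clock time over `repeats` passes of the inputs."""
    start = time.perf_counter()
    for _ in range(repeats):
        for x in test_inputs:
            fn(x)
    return time.perf_counter() - start

def pick_best(reference: Callable, candidates: list, test_inputs) -> Callable:
    """Keep the fastest candidate that still behaves like the reference."""
    best, best_time = reference, benchmark(reference, test_inputs)
    for cand in candidates:
        if not is_correct(cand, reference, test_inputs):
            continue  # reject: behavior differs from the original
        t = benchmark(cand, test_inputs)
        if t < best_time:
            best, best_time = cand, t
    return best

# Toy demo: summing 0..n-1 three ways.
naive = lambda n: sum(range(n))
wrong = lambda n: n * n // 2         # incorrect "optimization" -> rejected
closed = lambda n: n * (n - 1) // 2  # closed form -> correct, O(1)

inputs = [10, 100, 1000, 5000]
winner = pick_best(naive, [wrong, closed], inputs)
```

The real tool additionally generates the tests and benchmarks themselves and surfaces the result as a PR, but the gatekeeping shape is the same: an optimization only counts if it survives both checks.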
misrasaurabh1
Codeflash is still new, but it is very strong at optimizing general-purpose Python code, especially PyTorch, NumPy, or algorithmic code. If you want to try out optimizations, try optimizing this sample repo: https://github.com/codeflash-ai/optimize-me
Philpax
Well, they're being trained with reinforcement learning now, so I, for one, am excited to see what kinds of unique extremely performant insanity they'll produce in the future =D
misrasaurabh1
What we discovered is that LLMs alone aren't really the answer with respect to performance; the whole system around verifying correctness and performance is the key.
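One concrete piece of such a verification system is differential testing: running the original and the LLM-generated version on many randomized inputs and failing on the first mismatch. A hypothetical sketch (the seeded bug and all names are made up for illustration):

```python
import random

def equivalent(f, g, gen_input, trials=200, seed=0):
    """Probabilistic behavioral check: compare f and g on random inputs.
    Returns (True, None) if no mismatch was seen, else (False, counterexample)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = gen_input(rng)
        if f(x) != g(x):
            return False, x  # counterexample found
    return True, None

# Reference: reverse a list. The "optimized" version has a planted bug
# on single-element lists, the kind of edge case an LLM can silently break.
reference = lambda xs: list(reversed(xs))
buggy = lambda xs: xs + xs if len(xs) == 1 else xs[::-1]

gen = lambda rng: [rng.randint(0, 9) for _ in range(rng.randint(0, 5))]
ok, cex = equivalent(reference, buggy, gen)          # catches the bug
ok_fast, _ = equivalent(reference, lambda xs: xs[::-1], gen)  # genuine rewrite passes
```

Randomized checks like this are not a proof of equivalence, but they make it cheap to reject a large fraction of plausible-looking but wrong generations before any benchmarking happens.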
a couple days ago i asked a buddy (he's an exceptionally good SWE) who was raving about AI coding if he could ask it to write some code that would match or beat the 170 LoC uWrap [0] in size, output quality, or perf. i asked him to do it soonish, since i was sure the lib was likely to be in most training sets very soon, if not already.
all models but one failed completely. only grok produced a working version, and it was several orders of magnitude slower than the lib that uWrap itself bests by 10x, which he considered a small win. but when he checked what it had used as a reference, it turned out that uWrap was what it had pulled from the web. so it had very poorly regurgitated my lib.
i guess i don't need to look for a new career just yet.
[0] https://news.ycombinator.com/item?id=43583478