
AI isn't going to kill the software industry

simonw

I've been trying to put this effect into words for a while; now I don't have to, because this is really clearly stated:

"AI tools create a significant productivity boost for developers. Different folks report different gains, but most people who try AI code generation recognize its ability to increase velocity. Many people think that means we’re going to need fewer developers, and our industry is going to slowly circle the drain.

This view is based on a misunderstanding of why people pay for software. A business creates software because they think that it will give them some sort of economic advantage. The investment needs to pay for itself with interest. There are many software projects that would help a business, but businesses aren’t going to do them because the return on investment doesn’t make sense.

When software development becomes more efficient, the ROI of any given software project increases, which unlocks more projects. [...] Cheaper software means people are going to want more of it. More software means more jobs for increasingly efficient software developers."

wruza

I think this effect will be even greater this time (last time being higher-level “slow” languages like Python and JS), because AI will allow for a new wave of developers who won't care about the “right” code that much and will perceive it as a disposable resource rather than a form of art. This aligns well with many smaller businesses that are by nature temporary or very dynamic and have to actually fight developers' tendencies to create beautiful solutions after the ship has sailed.

paulryanrogers

IME software quality (including my own) is already on the edge of the unmaintainable slop threshold. I don't think it can slip much further without taking its hosts down too. Even Apple's software quality seems to be hit or miss lately.

wruza

That's really okay for most software that doesn't face clueless users. In-house software has worked like that forever: half-assed, barely working outside the happy path, dangerous if you tick that box, but doing its job anyway. There's still a large margin to drop from, to make software that isn't "boxed/SaaS" but tailored to very specific needs. Many devs may have felt that moment when some personal tool or script worked just fine, but any attempt to turn it into "an app for a user" resulted in a complexity mess. Maybe the answer is not better software, but less software and more code doing the job.

I don't think that's a bad thing, if it helps real people rather than the usual bigcorp/unicorn critters.

afavour

I feel like a lot of those problems are already addressed by no-code products. CRMs, project management tools, web site builders… those with relatively straightforward needs who aren’t that fussy about how it gets done are already served. I don’t doubt AI will help some people here but I’m not convinced it’ll be by an industry changing amount.

gsf_emergency_2

As it becomes easier and more profitable to accrue tech debt, more tech debt will be accrued.

See also: Jevons paradox

>[Tech companies], both historical and modern, typically expect that higher [revenues] will lower [tech debt], rather than expecting the Jevons paradox

https://en.wikipedia.org/wiki/Jevons_paradox

spamizbad

It depends on the definition of “right”

As software becomes more essential to a business its reliability becomes more important. If your customers can tolerate defects or downtime it’s a signal that:

A) You're not providing any real value

B) You provide so much additional value compared to your competition that you still come out ahead in the wash

C) Your customers are hostages via vendor lock-in

A and C are the most common cases of persistent bad software.


jamil7

I just witnessed an example of more or less this at a startup I've been contracting at. The engineering team on the core product is tiny, with no slack for extra projects. One non-technical founder and a few other non-technical people built a prototype piece of software using AI and low-code tools to facilitate another revenue stream. They started using it with a few customers and raised more money around it. The money they raised is going directly into expanding the engineering team to work on both products.

etrautmann

This is essentially Jevons paradox [1] applied to software development. As it becomes easier and more efficient to create software, more of it will be consumed/demanded/created.

[1] https://en.wikipedia.org/wiki/Jevons_paradox

edit: whoops that's the point of the original article

gsf_emergency_2

I'd go further than TFA and say it applies to tech debt:

As it becomes easier and more profitable to accrue tech debt, more tech debt will be accrued.

jameslk

> More software means more jobs for increasingly efficient software developers

This assumes software developers will be the ones needed to meet the demand. I think it will be more like technical product managers

swatcoder

You're splitting hairs over a title, but effectively talking about the same individuals.

Whatever juice there is to squeeze out of generative AI coding, the people who will squeeze the most from it in the near future (10-20 years) are current and incoming software developers. Some may adopt different titles as the role matures, especially if there's a pay differential. This already happened with data engineering, database administration, etc

So it's possible that the absolute number of people carrying a title like "software developer" will be smaller in 10 years than it is today, although I personally find that very unlikely. But the people leading and maturing whatever alternate path comes up, for the next generation or so, will more often than not have a professional or academic background in software development regardless.

The whole point of the article is that generative AI represents a lever that amplifies the capabilities of people who best know how to apply it. And for software development, that's ultimately going to be software developers.

jameslk

> You're splitting hairs over a title, but effectively talking about the same individuals.

I disagree. SWE skills are not the same as TPM skills. Those who like doing SWE work may not like TPM work, nor be good at it. And my point is that TPM skills are what will likely be more needed. Therefore they may not be the same individuals, though many will need to adapt. Or go into woodworking or something.

simonw

I think it will be the software developers that lean more into the kind of skills you see in technical product managers, or maybe vice-versa.

nicoburns

> I think it will be more like technical product managers

Even with traditional software development, you'll be a much more effective software developer with good product management skills in a lot of niches (basically anything that's building something user-facing).

QuiDortDine

But who will they blame when the solution inevitably fails to live up to the inherently contradictory demands of the different stakeholders?

nine_zeros

> A business creates software because they think that it will give them some sort of economic advantage. The investment needs to pay for itself with interest. There are many software projects that would help a business, but businesses aren’t going to do them because the return on investment doesn’t make sense

All this is fine but then they take this to the extreme by laying off developers when there are no more ROI projects. Then bit rot sets in, systems degrade, customers abandon ship, and suddenly there are no engineers to help the business. And no, hiring contractors last minute doesn't cut it because they don't know what pile of gunk has been running and what caused the degradation.

Software requires maintenance, much like a bespoke car. For every bespoke car you produce, you need to continuously service it with bespoke skills else it will stop working - and cause your revenue to drop.

paulryanrogers

Doesn't this assume there will always be significant business needs that can be met by software, but are not yet? Or at least not as efficiently as they could be?

I imagine there is an upper bound on how much of the world can be eaten by software, and the trend seems to be getting closer to that point. Unless there are some massive breakthroughs in robotics or cybernetics which open more of the physical world to software.

There's also a point where incremental software improvement requires bulldozing, paving, or burning so much nature that we'll be worse off in the end. Watching billionaires squabble about who gets the new AI data centers makes me wonder if we haven't crossed over that point a few funding rounds ago.

throwup238

I don't think we’re anywhere near the upper bound yet and IMO the prevalence of SaaS software that could be better done in house if resources permitted demonstrates that. The future will be a lot more bespoke.

Like Simon’s quote above says, software is a competitive advantage so when one company develops software that makes them more efficient or grows revenue, competitors have to follow suit or get left behind. It’s an economic arms race. That’s why the dreaded outsourcing wave of the 2000s never materialized: companies ended up hiring a bunch more software engineers in the US and outsourced a bunch of other engineering to India and other countries.

The interesting question is how this will interact with current interest rates and the end of ZIRP.

dutchbookmaker

I think it really depends how much improvement we get in AI from here.

I work at a small business and understand the entire system as a whole.

Most of the complexity we have is in the UI, so non-technical people can interact and work with company data/databases. Our SaaS is really the same thing, but stuff we didn't want to build in house. There is no rocket science being done here.

At some level of AI accuracy, though, it would seem there would be a phase transition where most of this UI goes away, along with the need for many of the employees.

Right now there is no risk of this happening but at some level of accuracy we go from 200 people to like 20 with the only future head count coming in sales. Most likely though I would expect a competitor to eat our lunch in the phase transition and we go from 200 to zero.

Pushing 50 here, I am trying to think right now of my AI hedge/career change to something more creative/artistic, where being a human itself has value. I managed to avoid working in a factory like my father by being a shitty computer, but there might be no market for shitty human computers in 10 years.

paulryanrogers

Software is being commoditized by SaaS. Generally that means it's table stakes, not necessarily a competitive advantage.

natemwilson

I actually strongly believe the universe has an effectively infinite carrying capacity for software. This is because all systems can be improved upon recursively

Izkata

> This is because all systems can be improved upon recursively

Until it becomes cosmic code.

( https://minimaxir.com/2025/01/write-better-code/ / https://news.ycombinator.com/item?id=42584400 )

franktankbank

It doesn't take infinite iterations to solve a problem.

andsoitis

> and the trend seems to be getting closer to that point.

What evidence suggests this to you?

paulryanrogers

Things that seem borderline worse when being done in software. Like touchscreens in car controls, beta self driving, shitty/broken websites that should be paper flyers on bulletin boards, a tablet at the barbers where a sign in sheet or paper numbers would do better, those "smart" doors some grocery stores tried, etc.

Nevermark

> AI tools create a significant productivity boost for developers.

I think the first step betrays the weakness in this reasoning.

We already know that more powerful software tools do not make all developers more efficient. So the reverse can be true.

For software jobs to remain, not only must the tools keep making human developers more efficient, but the human developers need to continue making the automated developers more efficient.

Both directions must be accounted for.

Question: what is something our best human developers can do that will always enhance the artificial developers' results, regardless of how far automated developers scale in quantity, quality, speed, economy, and complexity of work?

It is the same question for any new automation vs. previous process.

You need to identify that, for anything past your first reasoning step to have a chance.

But if you can definitively answer it, I don’t think you need any more steps.

streptomycin

AI has been drastically improving for the past couple decades, and if that continues then AI is obviously going to kill the software industry and many other industries.

Sure, maybe apparent exponential growth tails off into an S curve at some point, as often happens. But this blog post seems to assume that is guaranteed to happen, which is a big leap of faith, and also makes this not a very interesting blog post. Because it's basically just, "Well if I assume the best arguments against my position are wrong, then I don't even need to address them, and I can instead focus on lesser arguments that are easier for me to discuss".

65

It's beneficial for executives to say AI will kill the software industry: it's a way to stoke fear in workers, and a convenient way to say "with AI you could be X times more productive," which sets the expectation that the worker should be that many times more productive than they already are, with or without AI "help". This is an attempt to increase hours worked at the same wages, which is itself an attempt to lower wages.

andsoitis

While this behavior may exist in pockets, my experience suggests that this is not broadly the general mental model / approach among executives. I think this is a narrative in some people's heads, but it is neither an accurate reflection of the world nor a healthy way to approach employment.

65

You're probably right, though I speak from anecdotal experience, as the CTO of my company recently said that developers should be 5X more productive with AI. So what if I'm not 5X more productive with AI? It doesn't matter, because there are more tickets in the sprint that, in his head, I should theoretically be spending the same amount of time working on as "before AI."

lexandstuff

Sounds like your CTO is dangerously incompetent and isn't actively using the AI tooling they are endorsing.

tivert

> While this behavior may exist in pockets, my experience suggests that this is not broadly the general mental model / approach among executives. I think this is a narrative in some people's heads, but it is neither an accurate reflection of the world nor a healthy way to approach employment.

But it may be consistent with workers' experience of the effects of executives' actual decisions, and a better fit than the executives' actual mental model.

simonw

While I found myself in furious agreement with the section titled "Jevons Paradox", I'm less convinced by this argument from the "Comparative Advantage" section:

"While AI is powerful, it’s also computationally expensive. Unless someone decides to rewrite the laws of physics, there will always be a limit on how much artificial intelligence humanity can bring to bear."

The cost of running a prompt through the best-available model has collapsed over the past couple of years. GPT-4o is about 100x less expensive than GPT-3 was, and massively more capable.

DeepSeek v3 and R1 are priced at a fraction of OpenAI's current prices for the GPT-4 and o1 models, and appear to be highly competitive with them.

I don't think we've hit the end of that trend yet - my intuition is that there's a lot more performance gains still to be had.

I don't think LLM systems will be competitive with everything that humans can do for a long time, if ever. But for the things they CAN do the cost is rapidly dropping to almost nothing.
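
To put rough numbers on that collapse (list prices from memory, so treat every figure as approximate rather than authoritative):

    GPT-3 (text-davinci-003):   ~$20.00 per million tokens
    GPT-4o (late 2024, input):   ~$2.50 per million tokens
    GPT-4o mini (input):         ~$0.15 per million tokens

    $20.00 / $0.15 ~= 133x

Exact ratios depend on which model snapshot and price tier you compare, but the direction and rough magnitude of the trend are hard to dispute.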

thfuran

A human brain runs on about 20 W, and I see no reason to believe that that is the absolute limit of efficiency. It's probably quite efficient as far as mammalian meat goes, but evolution can only optimize somewhat locally.

mitthrowaway2

Indeed, the laws of physics don't put any limits on artificial intelligence that don't also apply to natural intelligence. It's a strange place to look for a comparative advantage argument.


TheDudeMan

> I don't think LLM systems will be competitive with everything that humans can do for a long time

It's looking like about 3 years. That will be another ~100x cost reduction, hundreds of billions of dollars more in infra, new training algorithms, new capabilities.

parpfish

I agree that the industry won't be killed, but I do have some worries about what the future will look like.

- If we keep making AI-assistance tools that make mid- and senior-level ICs more and more efficient, where does that leave entry-level junior positions? It's already tough enough for juniors to get a foot in the door, but will it get even harder as we continue to make the established older devs more and more efficient?

- The current crop of AI-assistance tools are being tailored to meet the needs of mid- and senior-level ICs that learned programming in a pre-AI world. But incoming junior devs are "AI native" and may approach software development in a very different way.

- I would wager that there will be substantial workplace/generational divides between devs that learned programming before using AI assistance later vs "AI native" devs that had AI assistance the whole time. I have no idea what these new ways of working will be, but I'm curious to see how it plays out.

carbocation

I remember when Hinton said that we should "stop training radiologists now" in 2016[1]. Meanwhile, radiologists are in high demand and are getting paid better than ever. I believe the same will be true for programmers in the future. Sure, some of the boilerplate will be handled for you, just like segmentation is for radiologists. That's great for everyone.

1 = https://www.youtube.com/watch?v=2HMPRXstSvQ&t=30s

aitchnyu

Optimistic me is waiting for all clinics to have MRIs that cost as much as a house, can be placed between steel racks, and plug into a wall socket. Then every doc will need to become a radiologist.

https://www.news-medical.net/news/20240510/Machine-learning-...

cameldrv

A radiologist is training for a 35-40 year career though. I definitely would not have wanted to have started that training in 2016.

selimnairb

The notion of no longer training radiologists because computer vision algorithms and deep learning are good at detecting cancers in imagery strikes me as troublingly naïve. Who fucking trained the AIs? Will the AIs magically be able to detect yet-to-be-discovered maladies?

dboreham

Fwiw this is not new. I worked on a machine vision project to evaluate x-rays for cancer...in 1985.

TheDudeMan

Did the radiologists suddenly get better than the AI at reading images? Or is the system simply unchangeable?

carbocation

A few responses:

- Since when have the radiologists ever been worse than AI at reading images?

- AI is doing mechanical tasks for the radiologists, so it stands to reason that this makes radiologists more efficient.

- The radiologist is a liability sponge, so if at all possible it of course makes sense to augment the radiologist with AI rather than to try to do away with them. (This roughly gets at your point about the system being unchangeable.)

aiiizzz

> Since when have the radiologists ever been worse than AI at reading images?

Since 2 years~ish?

nialv7

I am quite tired of seeing titles like this. No, you _don't know_. The vast majority of definitive statements like this are going to be meaningless. The whole point is that it's an uncertainty: the impact of AI on our society is unpredictable. You could be right, but you could be wrong too. And merely assigning a probability to this is going to be very non-trivial.

I just can't understand where people find the kind of confidence to say AI is (or is not) going to <insert your scenario here>.

rahimnathwani

AI is going to make building software way cheaper and more profitable, but that's actually bad news for a lot of developers out there. Think about how many people are only employed because they know the basics of React or Django or whatever. They can copy-paste code and tweak existing patterns, but that's exactly what AI is getting really good at.

The developers who are actually going to thrive are the ones who can architect complex systems and solve gnarly technical problems. That stuff is getting more valuable, not less.

But a lot of folks have built careers on pretty basic skills. They've gotten by because there just aren't many humans who can do even simple technical work. That advantage is disappearing fast.

65

The barrier to entry will be raised, though isn't that always happening in software, regardless of whether it's AI or not? Companies used to hire HTML email developers, for example. There are many HTML email builders out there that do the job for a marketing person.

Better tooling, whether it's AI tooling or a framework, continuously changes the job requirements. Even your average React developer still has to deal with plenty of things people in the past didn't have to think about, e.g. dependency management, responsive screen sizes for all screen widths, native apps, state management, etc.

rahimnathwani

"Better tooling, if it's AI tooling or a framework, continuously changes the job requirements."

This might be true if you work at a tech company, but it's not universally true. There are many people who are gainfully employed as software developers, based solely on technical knowledge they acquired years ago.

senordevnyc

Ironically though, if you do need to build raw HTML emails in 2025, it's still a huge PITA due to the horrible support for HTML / CSS across many email clients, and a dearth of good resources and information out there.

9rx

What we are currently calling AI is just a fancy programming language/REPL/compiler anyway, so obviously software developers aren't going away any time soon. You fundamentally must be a software developer to use these tools.

Elevator operators never went away either. In fact, there have never been more elevator operators in human history! Not a good career choice, though. That is what these warnings, realistic or not, are actually calling attention to.

Aperocky

> there have never been more elevator operators in human history

Press X to doubt

9rx

When was there more? We keep building more and more buildings with elevators in places where there are more and more people. With dedicated elevator attendants being almost unheard of nowadays, leaving elevator users to be the operators in nearly every case, anything else is mathematically unlikely.

Software developers aren't going anywhere, but, like the elevator operator, everyone might become a software developer. At least that is the theory the grifters are grifting on. They aren't literally saying software developers are going away. That couldn't work given that you become, if you weren't already, a software developer when you use these tools.

herval

That’s very easily googled: https://www.researchgate.net/figure/Number-of-Elevator-Opera...

There were more elevator operators before elevators became easily operated by the passengers - as expected

jeremyjh

Whoosh

Aperocky

good one. I get the point now.

Though not necessarily agreeing with it. Maybe if the AI is AGI, but then everything would be moot.

jeremyjh

This is a great analogy. Maybe someday, computers will work like they are supposed to. You tell it what you want, and it does the work of understanding you instead of vice-versa. And then it just actually does what you want. That would be amazing. Our world would change so much, so fast that we can't really predict whether we'll actually be better off or not.

apeace

The job of "software engineer" as we know it will end.

Before the industrial revolution, shoemakers would make shoes. It was a specialized skill, meaning shoes were very expensive, so most people couldn't afford them.

Then factories were invented. Now shoes could be made cheaply and quickly, by machines, so more people could afford them. This meant that far more people could be employed in the shoe industry.

But those people were no longer shoemakers. Shoemakers were wiped out overnight.

Think of how huge the shoe industry is now. There are jobs ranging from factory worker to marketing manager. But there are zero shoemakers.

AI writing software doesn't mean it's the end of the industry. Humanity will benefit greatly, just like we did from getting cheaper shoes.

But the software engineers are screwed.

bamboozled

> But those people were no longer shoemakers. Shoemakers were wiped out overnight.

Have you seen the cost and popularity of "Made in X" handmade boots though? Red Wing, Origin, Red Back. It's absolutely crazy

The difference is, all of a sudden we could make a lot of CHEAP shoes, and yes, I'm sure that wiped out a lot of shoemaker jobs, but there are still a lot of good shoemakers around, and there is still high demand for handmade shoes and boots.

senordevnyc

Shoemaking is an interesting analogy, and it brings to mind a few other facts that might be relevant. I have zero experience in the shoemaking industry, so these are my impressions; they could be wrong:

1. There are still many thousands of people in the US alone employed today as traditional shoemakers at boutique firms. It's a very niche career, but it does exist.

2. As the cost of shoes plummeted and our ability to create more complex designs exploded, we also got a huge proliferation of innovation and creativity in shoe design.

3. Yes, today's shoe industry has lots of factory workers and marketing managers...but it also has many tens of thousands of more specialized roles like shoe designers, materials specialists, process automation engineers, etc.

I can see a future where software is almost entirely created by AI, but we have many specialized roles of people who know how to apply AI tools to software creation, or who sit at the interface between the business and the AI in some way that is hard to foresee now.

On the flip side, if we truly get ASI, then it is hard to see what exactly those specialized roles represent that can't be replaced.

What does Sam Altman do that an ASI won't be able to?

QuiDortDine

The shoe comparison is ridiculous. For let's say 99% of people, shoe requirements are the same (in function), with almost all variations being purely esthetic. There are, let's say, 10 kinds of shoes, or perhaps 100. Make it a thousand, for argument's sake.

Meanwhile, every single business has different workflows and therefore different needs. The most common ones (browsers, etc.) are answered by traditional software. If you can write down in detail the business's needs as they pertain to workflows - business rules, let's call them - you've effectively made the software already. The only difference between telling ChatGPT to do something in English and telling the computer to do it in code is that one is non-deterministic.

Software is, primarily, a means to process information, which is to say reality (in a business setting). An AI that can replace software developers can, in effect, replace every job that happens on a computer, in every company on Earth. Apart from Jevons paradox (which is much more applicable to software than to shoes), this shift would be so gargantuan that it's barely worth thinking about, in the same way that it's not worth thinking about a supervolcanic eruption: the consequences would be earth-shattering, and finding employment would be the least of your worries.

apeace

To add to this: the author is missing a major aspect of the Jevons paradox.

They keep referencing "more efficient software developers," but the Jevons paradox isn't only about efficiency. The efficiency creates lower cost, which in turn increases demand.

The main cost of software is software engineers. It's a specialized skill, so it's a high-salary job.

With AI doing most of the work, salaries will begin to fall. It will no longer make sense to study computer science, or spend years learning to code, for such a low salary. There will no longer be people doing what we call software engineering today.

So the author is right, Jevons paradox will take effect. But like I said above, it will replace the current industry with a very different-looking industry.

squishington

I really don't see AI generating safe code for automotive embedded systems that is maintainable and MISRA and HIS compliant. And there will need to be software engineers who are trained to debug these systems.

nunez

Tons of shoemakers exist! And not unlike cheap AI swill, the best shoes are handmade by them.

iterateoften

Working with nontechnical people as they make prompts is interesting.

I'm seeing a lot of frustration from people dealing with Markdown. Even though it's free-form and not really like code at all, the hashes, dashes, etc. throw them off.

Also seeing a lot of people having a hard time expressing their desired behavior in a concrete way. It reminds me of 3rd grade, when we had to write recipes and the teacher had a classmate maliciously comply with only what was written.

Overall I think tools will improve and barriers will continue to disappear, but for the time being there's still big demand for people who can convert abstract intention into a concrete, machine-usable format. It's just that how those ideas are expressed gets more flexible with LLMs.
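
For anyone who hasn't watched this firsthand, the Markdown punctuation in question is tiny, but it's exactly the part that reads as "code" to a nontechnical eye. A generic sample (not from any particular tool) of the syntax that trips people up:

    # Release checklist      <- the hash marks a heading, not a hashtag
    - back up the database   <- the dash marks a bullet, not a minus
    **Do this step first.**  <- the asterisks mark bold text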

1shooner

"Cheaper software means people are going to want more of it. More software means more jobs for increasingly efficient software developers. Economists call this Jevons Paradox."

If we accept there will be increased demand for software, it's a big jump from that to concluding the efficiency of AI will be outpaced by the demand for software, specifically along the dimension of required developers.

Software isn't wheat or fuel; it can be reused and resold.

Kerrick

“Crack the books”—can anybody recommend good books for this shared future of ours? I’m tired of trying to piece it together from blog posts, READMEs, and short video tutorials.

simonw

I haven't read these all the way through myself but I've seen enough of them that I'm confident suggesting them:

- Prompt Engineering for LLMs by John Berryman and Albert Ziegler: https://www.amazon.com/Prompt-Engineering-LLMs-Model-Based-A...

- AI Engineering by Chip Huyen, which I recommend based on the strength of this extract about "agents": https://huyenchip.com/2025/01/07/agents.html

airstrike

I like hands-on learning more than I like textbooks, so in case that matches your requirements, maybe try training your own GPT to have a sense for how it works. I wrote a Rust version of the famous https://github.com/karpathy/nanoGPT (which is in Python) so that I could learn how it's built.

I wrote it in Rust because I wanted to improve my skills in that language, be forced to write code instead of just reading the existing implementation so that I would truly learn, and test the quality of the nascent Rust AI/ML ecosystem, but you could pick your own language
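
For anyone curious what the core of such a model actually computes, independent of burn or the Python original: below is a minimal, dependency-free Rust sketch of single-head scaled dot-product attention, the operation at the heart of nanoGPT's transformer blocks. It's an illustration under simplifying assumptions (no learned projections, no causal mask, no batching), not code from nanoGPT or from my port; all the names are my own.

    // Single-head scaled dot-product attention over plain Vec<f32> rows:
    // attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V

    fn dot(a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }

    // In-place softmax; subtracting the max keeps exp() from overflowing.
    fn softmax(xs: &mut [f32]) {
        let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0;
        for x in xs.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in xs.iter_mut() {
            *x /= sum;
        }
    }

    fn attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
        let scale = (q[0].len() as f32).sqrt();
        q.iter()
            .map(|qi| {
                // Score this query against every key, then normalize.
                let mut scores: Vec<f32> =
                    k.iter().map(|kj| dot(qi, kj) / scale).collect();
                softmax(&mut scores);
                // Output is the score-weighted sum of the value rows.
                let mut out = vec![0.0; v[0].len()];
                for (w, vj) in scores.iter().zip(v) {
                    for (o, x) in out.iter_mut().zip(vj) {
                        *o += w * x;
                    }
                }
                out
            })
            .collect()
    }

    fn main() {
        // Toy self-attention: three positions, dimension 2, Q = K = V.
        let x = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
        println!("{:?}", attention(&x, &x, &x));
    }

The real model wraps this in learned Q/K/V projections, multiple heads, a causal mask, and batched tensors, which is where a framework like burn (or PyTorch, in the original) earns its keep.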

brainbag

Would you say more about your experience writing it in Rust? What worked well, what didn't, anywhere you found that you struggled unexpectedly or that it was easier than you expected?

airstrike

Hey, thanks for asking. I'm the furthest from an authority in this so I encourage you to take everything I say with a grain of salt.

I was using the burn[0] crate which is pretty new but in active development and chock-full of features already. It comes with a lot of what you need out of the box including a TUI visualizer for the training and validation steps.

The fact that it's so full of features is a blessing and a curse. The code is very modular so you can use the pieces you want the way you want to use them, which is good, but the "flavor" of Rust in which it is written felt like a burden compared to the way I'm used to writing Rust (which, for context, is 99% using the glorious iced[1] GUI library). I can't fault burn entirely for this; after all, they are free to make their own design choices, and I was a beginner trying to do this in less than a week. I also think they are trying to solve for getting a practitioner up and going right away, whereas I was trying to build a modular configuration on top of the crate instead of a one-and-done type script.

But there were countless generic types, several traits to define and implement in order to make some generic parameter fit those bounds, and the crate has more proc_macro derives than I'd like (my target number is 0) such as `#[derive(Module, Config, new)]` because they obfuscate the code that I actually have to write and don't teach me anything.

TL;DR the crate felt super powerful but also very foreign. It didn't quite click to the point where I thought it was intuitive or I felt very fluent with it. But then again, I spent like 5 days with it.

One other minor annoying thing was that I couldn't download exactly what I wanted out of HuggingFace directly. I ended up having to use `HuggingfaceDatasetLoader::new("carlosejimenez/wikitext__wikitext-2-raw-v1")` instead of `HuggingfaceDatasetLoader::new("Salesforce/wikitext")` because the latter would get an auth error, but this may also be my ignorance about how HF is supposed to work...

Eventually, I got the whole thing to work quite neatly and was able to tweak hyperparameters and get my model to increasingly better perplexity. With more tweaks, a better tokenizer, possibly better data source, and an NVIDIA GPU rather than Apple Silicon, I could have squeezed even more out of it. My original goal was to try to slap an iced GUI on the project so that I could tweak the hyperparameters there, compare models, plot the training and inference, etc. with a GUI instead of code. Sort of a no-code approach to training models. I think it's an area worth exploring more, but I have a main quest I need to finish first so I just wrote down my findings in an unpublished "paper" and tabled it for now.

________

[0]: https://github.com/tracel-ai/burn

[1]: https://github.com/iced-rs/iced

rramadass

Some that I am looking into; these are "practical" books which do not focus on the theory/algorithms but rather, given that they are available (library/models/whatever), on how to build your apps using them:

1) Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications by Chip Huyen.

2) AI Engineering: Building Applications with Foundation Models by Chip Huyen (this is a very recent book).

3) Generative Deep Learning: Teaching Machines To Paint, Write, Compose, and Play by David Foster.

4) Building LLMs for Production: Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG by Bouchard & Peters.

senordevnyc

It honestly feels like books are a poor fit for this topic, because things are moving so much faster than the publishing industry. A book published in 2024 would have been written in 2023, and would give you a pretty skewed picture of the SOTA.