Libraries are under-used. LLMs make this problem worse

seunosewa

I disagree. Every Python package we install seems to install dozens of libraries, each of which could harbour malware. Many of them are only used for a single function within them. We have no idea what most of the packages are for. It's a lot.

aDyslecticCrow

https://en.m.wikipedia.org/wiki/Log4j https://en.m.wikipedia.org/wiki/Npm_left-pad_incident

Languages and domains that have leaned too far into package managers and small libraries are prone to fragility and security nightmares.

For any "serious" application of critical code, every library used needs to be vetted and verified to be maintained and secure.

I'd much rather deal with a bug in our code than a deprecated library or breaking version update.

If we are to use a library outside of standard Unix or stdlib within my field, we'd better expect a nightmarish code review and a meeting.

Besides being fun, implementing it ourselves improves our skill level for the future, which is something vibe coding itself goes against as well.

skydhash

> For any "serious" application of critical code, every library used needs to be vetted and verified to be maintained and secure.

A project only becomes serious once legal is breathing down engineering's neck. Before that, it's usually the Wild West. After, it becomes a security circus trying to patch the technology deficiency (custom registries, complex linting and other analysis tooling, ...).

giantg2

If it's open source, it may be possible to create your own fork to fix issues.

j-pb

This. We finally have a tool that can learn from all the libraries and abstractions that have to fit everybody's needs (and do so badly because there is no free lunch), and extract just the parts that are actually relevant to our problem and domain. This allows you to not only produce a much smaller attack surface, but also allows for domain specific optimisations and shortcuts.

It's kinda like project specific semantic monomorphization.

handfuloflight

Sure. Lots more debugging than using something battle tested, which is why I have this in my CLAUDE.MD:

> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.

lazide

This is hilarious.

closeparen

>This allows you to not only produce a much smaller attack surface

Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?

rjsw

Same with Ruby: I have to use a package with 230 dependencies.

ozim

This sounds exactly like under-utilization: if someone needs only a function or two from a library, I guess making yourself dependent on a third party for such a small gain doesn't make sense.

AlienRobot

LOL! I thought the article was going to be about reading books and ChatGPT!

And yes, I agree.

https://www.npmjs.com/package/boolean

>converts lots of things to boolean.

>3 million weekly downloads

This is insane.

what

3 million weekly downloads for a package that is “deprecated” and the source repo no longer exists. Truly insane.

AlienRobot

Even if it wasn't deprecated, this is literally:

    ['yes', 'y', '1'].indexOf(input.toLowerCase()) !== -1

People adding a dependency to avoid writing one line of code...
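
For illustration, here is a slightly fuller sketch of that one-liner. The name `toBoolean` is hypothetical, and the accepted strings are just the three from the line above; this is not a claim about what the `boolean` package actually accepts. The only addition is coercing the input first, since calling `toLowerCase()` on a non-string would throw.

```javascript
// Hypothetical helper; the accepted strings mirror the one-liner above,
// not the behaviour of any published package.
const TRUTHY_STRINGS = ['yes', 'y', '1'];

function toBoolean(input) {
  // null and undefined have no sensible string form here, so treat them as false.
  if (input === null || input === undefined) {
    return false;
  }
  // Coerce to a string so numbers such as 1 do not throw on toLowerCase().
  const normalized = String(input).trim().toLowerCase();
  return TRUTHY_STRINGS.includes(normalized);
}
```

Even with the guard it stays well under a dozen lines, which is roughly the point being made.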

lazide

This is the total leopards-eating-faces moment from all the greybeards.

PaulHoule

I dunno, often people say libraries are overused, at least in the JavaScript world.

My Celery/RabbitMQ-based web crawler failed because of the Cloudflare CAPTCHAs, so I figured it was best to empty out the queue and archive it. I asked Copilot what to do and it told me to use a CLI program. “Does that come with RabbitMQ?” “No, you download it from GitHub.” It offered to write me a Python script, but the CLI program did exactly what I needed. It got an option wrong, but I’d expect the same if I asked a friend for help.

tptacek

I don't know about vibe coding (I'm not a fan of vibe coding) but LLM agents make me more likely to use good libraries, not less, because they instantly know how to use them; there's less intellectual friction to breaking them out (don't have to find and add the dep, don't have to look for the example code). These kinds of things made me ultra-likely to just hand-code crappier versions of stuff libraries did, before I got LLM-assisted.

tedunangst

Is it your assessment or the LLM's that it's a good library? There have been many times I looked at the API for a library, said this is bonkers, and bailed. The weird contortions needed to use something should be a signal.

tptacek

It's mine. I've been shooting down LLM library picks semiregularly. That's kind of what motivated me to comment: it is not at all my experience that LLMs steer me away from libraries, and rather more my experience that they keep me on my toes by suggesting libraries I might not want to use.

corby

I'm having a problem like this now. I have a library that handles very complex hardware drivers and linkages.

I want people in the company to use it, but it's big and complicated (lots of chipsets and Bluetooth to boot).

I'm trying to design the library so the MCP can tell the LLM to pull it from our repo, read the prompt file for instructions and automatically integrate with the code.

I can't get it to do it consistently. There is a big gap in the current LLM tech where there is no standard/consistent way to tell an LLM how to interface with a library (C/Python/Java/etc.)

The LLM more often than not will read the library and then start writing duplicate code.

Maddening.

simonw

That's part of the idea behind https://llmstxt.org/ - even if you ignore the "/llms.txt" URL there's a bunch of thinking around that to help write explanations of things like libraries that can be used to "teach" a model to use it by injecting that into a prompt.

I'm still not clear on what the best patterns for this are myself. I've been experimenting with dumping my entire documentation into the model as a single file - see https://github.com/simonw/docs-for-llms and https://github.com/simonw/llm-docs - but I'd like to produce shorter, optimized documentation (probably with a whole bunch of illustrative examples) that uses fewer tokens and gets better results.
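
A minimal sketch of the "single file" idea, under stated assumptions: `combineDocs`, the `{ name, text }` page shape, and the `--- name ---` separator are all hypothetical choices for illustration, and this is not necessarily how the linked repositories build their files.

```javascript
// Hypothetical helper: joins documentation pages into one prompt-ready string.
// `pages` is an assumed shape: [{ name: 'intro.md', text: '...' }, ...].
function combineDocs(pages) {
  return pages
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    // Label each page so the model can tell where one document ends.
    .map((page) => `--- ${page.name} ---\n${page.text.trim()}`)
    .join('\n\n');
}
```

The resulting string can be pasted into a prompt as-is; producing the shorter, example-heavy version would still need someone to do the editing by hand.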

nimish

At this point it seems like just learning the library is easier than trying to cram the documentation into an LLM compatible format.

simonw

Doing the work to effectively prepare those docs for an LLM probably does involve "learning the library", but once one person has done that (and published the results) many other people can benefit from it.

I'm a library author myself, so publishing LLM-enhanced versions of the docs to help other people use my library more effectively feels like a sensible use of my time.

brikym

A bit of duplication is better than a lot of dependency.

giantg2

Kind of a false dichotomy. To avoid a lot of dependencies you generally need a lot of duplication.

cluckindan

“Dunning-Kruger effect leads us to underestimate the complexity of the problem solved by the library we're considering.”

Invoking the smarter-than-thou effect is not a great starting point.

See e.g. https://www.sciencedirect.com/science/article/abs/pii/S01602...

If we’re considering a library, it would be prudent of us to take a look at the source code to see what exactly we’re pulling in. In the process, we would learn about the lay of the land, the API and the internals, and get at least an overview of the complexity of the problem it solves.

fmbb

The Dunning-Kruger effect absolutely also leads to people releasing libraries they should not have and which nobody should use.

unclad5968

The DK effect only implies that people who know things underestimate their knowledge superiority and people who don't know things underestimate their knowledge inferiority. The popular interpretation that uninformed people think they're informed is not consistent with the DK research.

I don't think DK has anything to do with people releasing libraries that nobody should use.

briantakita

I learned to consider that if one brings up Dunning-Kruger...projection/irony may be at play.

Anyways...I've had a few recurring issues with libraries. Note that the language is framed on a case by case basis...not general rules.

1. The essential implementation is a small amount of code...wrapped in structures just for packaging essential code. The wrapping code can be larger & more complex than the essential code.

2. There are small differences between what's needed & what's provided. Which requires workarounds for the desired outcome. These workarounds muddy the logic & can be pervasive at scale.

3. There can be dissonance between the app architecture & the library api.

4. Popular libraries in particular...create a culture of thinking in terms of the library/framework. Leading to resource inefficiencies...And outright dismissing solutions that are a better match for the domain. In short, the library/framework api frames the problem & solution...Which may not match the actual problem & optimal solution.

5. The library/framework authors are concerned about promoting the library/framework. Not solving the actual problem. Many problems need to be solved. The library/framework can just be the "Golden Hammer" to pound in your screw.

With all that being said...there are many useful libraries that define & solve problems in their particular domain. Particularly with common, well defined, appropriately scoped requirements.

terribleperson

I imagine a good example for 4 would be the Tidyverse. It's very nice, but R with and without Tidyverse packages are very different experiences with different syntaxes, conventions, and even communities.

Though the addition of pipes to the base language is helping fix that.

giantg2

I'd much rather learn a library than create it from scratch. The two main issues are licensing concerns and being able to find ones that actually do what you need.

layer8

The third issue is avoiding “You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.”

(The quote comes from a different context, but works quite well here too.)

outside1234

I don't know if this is true. An LLM just today recommended a library I had never heard of, and after doing some due diligence, it looks solid.

This is analogous to folks who claim nobody is going to be able to learn software engineering any more. I think it is just the opposite. LLMs can be an awesome tool for learning.

msgodel

Totally disagree. I avoided Python for way too long because of how people were abusing pip/Anaconda. Especially with such a complete standard library, there's no reason to be dragging in external libraries most of the time (except NumPy and maybe PyTorch if you're doing ML).

AlienRobot

>Vibe coding is more fun than reading documentation. Shit, vibe-coding can be more fun than ordinary coding.

In my experience the big problem is that the documentation is always terrible, you can't ask open-ended questions on Stack Overflow, the library's subreddit (if any) has zero users, and anything asked on their Discord is not searchable.

It's incredible that we still don't have a Stack Overflow that is just a forum.

skydhash

There is some bad or missing documentation out there, but more often than not, people rush to use the library without first understanding the domain and learning the library's design. Once that's done, the generated API reference and the source code are more than enough to get going.

cat_plus_plus

If your vibe-coding prompt generated 1,000 lines of output, you should probably ask whether there is a library that would do that for you. If not, a library is not worth it just to shorten a one-pager.