Ollama Models Atom Feed
12 comments
· March 23, 2025 · mdp2021
simonw
Does this really matter?
That 100KB feed is served by GitHub Pages, which should be pretty efficient - it gets gzipped and comes with ETags/last-modified headers, so good feed readers should be able to access it efficiently.
As a consumer of Atom feeds I find whole content feeds like that quite convenient, provided they don't get into the thousands of entries.
(Fun aside: I got the new Gemini 2.5 Pro to write me a quick Python script to confirm that GitHub Pages does indeed support gzip and ETag/last-modified: https://gist.github.com/simonw/7f6b167bd8288dccfbebbaf7e143e... )
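For anyone curious what "access it efficiently" looks like from the reader side, here is a minimal sketch of a conditional GET using only the Python standard library. This is not the script from the gist; the function names `conditional_headers` and `fetch_if_changed` are made up for illustration, and it assumes the server honours `If-None-Match` / `If-Modified-Since` and may gzip the response.

```python
import gzip
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {"Accept-Encoding": "gzip"}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when nothing changed."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
            # Only decompress when the server actually gzipped the payload.
            if response.headers.get("Content-Encoding") == "gzip":
                body = gzip.decompress(body)
            return (
                body,
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        # urllib reports 304 as an HTTPError; treat it as "keep the cached copy".
        if error.code == 304:
            return None, etag, last_modified
        raise
```

A reader would store the returned `etag` and `last_modified` values and pass them back on the next poll, so an unchanged feed costs only a headers-sized round trip.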
mdp2021
> Does this really matter
Well, there is a matter of efficiency. Even when the feed reader uses the ETag/Last-Modified checks - which it should, but that may not be a given in practice after providers like Google ditched the technology (as some were apparently flooding them with "are we there yet" requests) - with an average for this year of 200 entries per XML you would be using about 1/200th of each download - and the waste keeps increasing...
At an average of 20 KB of gzipped material per XML (rough average for this year for that feed) it may not seem like much - but if you multiply that by all the feeds users subscribe to, the inefficiency starts adding up...
Optimally, I would say, a feed provider should have (1) the "whole collection" XML - "download once to get all the past records in your DB", and (2) a "latest" XML, which should probably be calibrated by time (e.g. all the records of the past 48 hours for a newspaper; all the records for the past month for an every-few-days blog...).
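A rough sketch of that split, under assumptions: entries are plain dicts with a timezone-aware `updated` datetime, and `split_feeds` / `fraction_new` are hypothetical helper names, not part of any feed library.

```python
from datetime import datetime, timedelta, timezone


def split_feeds(entries, now, window):
    """Return (full, latest): all entries newest-first, and only those
    updated within `window` of `now`."""
    full = sorted(entries, key=lambda e: e["updated"], reverse=True)
    cutoff = now - window
    latest = [e for e in full if e["updated"] >= cutoff]
    return full, latest


def fraction_new(new_entries, entries_in_feed):
    """Share of a downloaded feed that is actually new to the reader."""
    return new_entries / entries_in_feed
```

With one new entry in a 200-entry feed, `fraction_new(1, 200)` gives the 1/200th figure above; the window would be something like `timedelta(hours=48)` for a newspaper or `timedelta(days=30)` for an every-few-days blog.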
spondyl
> As a consumer of Atom feeds I find whole content feeds like that quite convenient, provided they don't get into the thousands of entries.
Atom does have an RFC for supporting pagination so it is possible (client support provided) to have both a short feed as well as full content history: https://www.rfc-editor.org/rfc/rfc5005#section-3
For WordPress sites, `?paged=X` can be used to paginate through RSS feeds, although that's more of a WordPress standard than an RSS/Atom standard: https://codex.wordpress.org/Pagination
My personal gripe is that clients checking for query params would seem to rule out pagination of static feeds (so you'd end up shipping a single huge response to the client) but now that I look closer, I guess you could achieve it via Atom, you'd just end up having to reshuffle every feed whenever pages get updated.
I went through some of this recently when standing up some home software to generate RSS feeds for sites that don't have them :)
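To make the RFC 5005 idea concrete, here is a minimal sketch of a client walking a paged feed by following `<link rel="next">`, which is the relation section 3 of the RFC defines for paged feeds. The names `next_link` and `walk_pages` are illustrative, `fetch` is any callable you supply that maps a URL to feed XML text, and the sketch assumes each `href` can be passed to `fetch` as-is (it does not resolve relative URLs).

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def next_link(feed_xml):
    """Return the href of the feed-level <link rel="next">, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def walk_pages(start_url, fetch, max_pages=50):
    """Collect entry ids across pages by following rel="next" links."""
    url = start_url
    seen = set()
    ids = []
    # Stop on a missing link, a loop, or the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        xml_text = fetch(url)
        root = ET.fromstring(xml_text)
        ids.extend(e.findtext(ATOM + "id") for e in root.findall(ATOM + "entry"))
        url = next_link(xml_text)
    return ids
```

Since the pages are just documents that link to each other, nothing here depends on query params, so the same approach should work against static files.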
simonw
... I decided to give people a choice, so I've updated the script to produce TWO atom feeds now, one of which is just the most recent 20 items:
https://simonw.github.io/ollama-models-atom-feed/atom-recent...
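For illustration only (this is not the actual script), producing a full feed and a "most recent 20" feed from the same data could look roughly like this. It assumes entries are dicts with `id`, `title` and an ISO 8601 UTC `updated` string (so string order matches time order); `build_feed` is a made-up name, and the output is a bare-bones feed that leaves out elements such as `author`.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_feed(title, feed_id, entries, limit=None):
    """Serialize entries newest-first as Atom XML, optionally keeping only
    the first `limit` of them."""
    ET.register_namespace("", ATOM_NS)
    newest_first = sorted(entries, key=lambda e: e["updated"], reverse=True)
    if limit is not None:
        newest_first = newest_first[:limit]

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = feed_id
    if newest_first:
        # The feed-level timestamp follows the newest entry that is included.
        ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = newest_first[0]["updated"]

    for item in newest_first:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = item["id"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = item["title"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = item["updated"]
    return ET.tostring(feed, encoding="unicode")
```

Calling it once with `limit=None` and once with `limit=20` gives the two files from the same list of entries.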
mdp2021
By the way: I am intrigued by that «I built the scraper by pasting example HTML into Claude» - Simon picked Claude for this specific task.
Is anyone maintaining benchmarked or informal (anecdotal, empirical) lists of what are the strong abilities (and maybe weak ones) per LLM?
simonw
I've been defaulting to Claude 3.7 Sonnet for a while now as it produces Python and JavaScript that fit my personal tastes, but to be honest I'd expect any of the leading models (GPT-4o/4.5/o1/o3, Gemini 1.5 or 2.0 or 2.5, probably the larger Llamas) to be able to do this well.
Dumping a chunk of HTML or JSON directly into a model and saying "write something that converts this to format X" tends to just work in my experience.
PeterStuer
Been using OpenAI for most of my API use in production and just started with Anthropic. Having to grind up customer tiers once again really makes it difficult to experiment, as 'Tier 1' hit its limit after 15 mins. And no, I do not want to 'contact sales' or wait weeks when I'm merely checking out initial viability. Just let me buy and use my available credits.
rspoerri
I assume you know the leaderboard, which at least shows the strongest ones in a specific area:
mdp2021
Thank you, I did not remember the "Category" selector ("Hard Prompt", "Instruction-Following", "Math"...).
Descriptive blog page at https://blog.lmarena.ai/blog/2024/arena-category/
RSS/Atom: https://simonw.github.io/ollama-models-atom-feed/atom.xml
@Simon: you should really fix the number of entries: now it's over 160 (from a ~100kb XML), but "ollama newest" updates on average every 4 days, so the window of the RSS should be much shorter (10 entries per XML is enough for many weeks)...
Ideally, you could have a "full" XML (full set) that you will download once for your local reader, and the "proper" XML feed (reasonable window) that will replace the first one locally after first use to only download the updates.
(Edit: nitpicking, but since I noticed it: the "pulls" value in the "proper" feed - to check for new models - makes no sense ;) )