Show HN: Superglue – open source API connector that writes its own code
21 comments
·February 27, 2025
promocha
Really nice idea and product. Does it update and cache changed schema for the target API? For example, if an app makes frequent GET calls to retrieve a list of houses and the API changes to a new schema, would Superglue figure it out at runtime, or does it regularly update the schema for the target API based on its API docs (assuming they exist)?
sfaist
Yes, it does update and cache changed schema for the target API, at runtime. The way it works is that every time you make a call to superglue, we get the data from the source and apply the jsonata (that's very fast). We then validate the result against the json schema that you gave us. If it doesn't match, e.g. because the source changed or a required field is missing, we rerun the jsonata generation and try to fix it.
I guess you could regularly run the api just to make sure the mapping is still up to date and there are no delays when you actually need the data, depending on how often the api changes.
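The validate-then-regenerate flow described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the mapping is a plain function standing in for a JSONata expression, `isValid` is a required-field check standing in for JSON Schema validation, and `regenerate` stands in for the LLM-backed generation step. None of these names come from superglue's codebase.

```typescript
// Hypothetical types and helpers for illustration only; not superglue's actual API.
type Row = Record<string, unknown>;
type Mapping = (source: Row) => Row;

// Minimal stand-in for JSON Schema validation: every required field must be present.
function isValid(result: Row, required: string[]): boolean {
  return required.every((key) => result[key] !== undefined && result[key] !== null);
}

// Apply the cached mapping; if the output no longer validates, regenerate once and retry.
function transformWithHealing(
  source: Row,
  required: string[],
  cache: { mapping: Mapping },
  regenerate: (source: Row) => Mapping, // stands in for the LLM-backed mapping generation
): Row {
  let result = cache.mapping(source);
  if (!isValid(result, required)) {
    cache.mapping = regenerate(source); // the repaired mapping is cached for later calls
    result = cache.mapping(source);
    if (!isValid(result, required)) {
      throw new Error("mapping still fails validation after regeneration");
    }
  }
  return result;
}

// The source API renamed `cost` to `price_usd`, so the cached mapping has gone stale.
const cache: { mapping: Mapping } = {
  mapping: (s) => ({ id: s["id"], price: s["cost"] }),
};
let regenerations = 0;
const regenerate = (_source: Row): Mapping => {
  regenerations += 1;
  return (s) => ({ id: s["id"], price: s["price_usd"] });
};

const healed = transformWithHealing({ id: 1, price_usd: 250000 }, ["id", "price"], cache, regenerate);
const again = transformWithHealing({ id: 2, price_usd: 180000 }, ["id", "price"], cache, regenerate);
```

In this sketch only the first call pays the regeneration cost; the second call reuses the repaired mapping from the cache, which is why a periodic warm-up call could hide the delay.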
DaiPlusPlus
> Automatically generates the API configuration by analyzing API docs.
The problem with a lot of (most?) integration work is that often there simply aren't any API docs - or the docs are outdated/obsolete (because they were written by hand in an MS Word doc and never kept up to date) - or sometimes there isn't an API in the first place (cf. screen-scraping, but also exfiltration via other means). Are these scenarios you expect or hope to accommodate?
sfaist
You can give it any context you have, worst case in text form, and the LLM will try to figure it out, call different endpoints etc. Recently someone mentioned to me the intern test by Hamel Husain: if an average college student can succeed with the given input (with a lot of trying and time), then LLMs should be able to do it too. So that's the bar we're aiming for.
No api at all is out of scope for now, there are other tools that are better suited for that.
AvImd
Access to XMLHttpRequest at 'https://graphql.superglue.cloud/' from origin 'https://app.superglue.cloud' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
sfaist
Thanks for flagging this. Odd. Did this happen on the website or in the actual app? Might be a server overload looking at our logs.
asdev
why would I use this when I can just add API docs to my LLM context and have it generate the integration code?
sfaist
depends on your usecase:
- this abstracts away a lot of the complexity, including pagination and format conversion. Also integrated logging and schema validation.
- this is self-healing, so when data comes through that you have never seen before or if the api changes it is a lot less likely to break.
- if you need to integrate a lot of APIs, or if you have multiple apps needing access to these apis, it is much easier to set up here than writing 1000s of lines of integration code.
If none of this is important / applies to you and the generated code works well, then you could also just do that.
nimar
because it's easier :) , see: https://news.ycombinator.com/item?id=9224
npollock
something like this that runs as a browser agent, allowing me to extract structured data from websites (whitelisted) using natural language queries
adinagoerres
huh interesting. we're exploring extraction from html
m0rde
Great idea, congrats. Can you speak a bit about the validation piece? Were LLM hallucinations an issue that required this? Are you using some kind of structured output feature?
sfaist
Sure! We use structured output for the endpoint, but not for the jsonata since it's hard to actually describe as a format. 3 big levers for accuracy / reducing hallucinations:
1. direct validation: we apply the jsonata that is generated and check if it really produces what we want (we have the schema after all). This way we can catch errors as they come up.
2. using a reasoning model: by switching to o3-mini, we were able to drastically improve the correctness of the jsonata. takes a bit longer, but better waiting a bit than incorrect mappings.
3. using a confidence score: still in development, but sometimes there are multiple options to map something (e.g. 3 types of prices in the source, but you only want one. Which one?). So we're working on showing the user how "certain" we are that a mapping is correct.
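To make the ambiguity behind the third lever concrete, here is a toy sketch. It is not superglue's scoring (that is described as still in development); the name-matching rule and the `1 / candidates` certainty are assumptions chosen only to show why three price fields in the source make a single `price` target uncertain.

```typescript
// Toy heuristic for illustration only; the matching rule and score are assumptions.
// A source field counts as a candidate if its name contains the target field name.
function candidateFields(source: Record<string, unknown>, target: string): string[] {
  return Object.keys(source).filter((name) =>
    name.toLowerCase().includes(target.toLowerCase()),
  );
}

// Certainty drops as more source fields compete for the same target field.
function certainty(candidates: string[]): number {
  return candidates.length === 0 ? 0 : 1 / candidates.length;
}

// Made-up source record with three kinds of price.
const listing = { id: 7, list_price: 300000, sale_price: 285000, price_per_sqft: 210 };

const priceCandidates = candidateFields(listing, "price");
const priceCertainty = certainty(priceCandidates); // three competing fields
const idCertainty = certainty(candidateFields(listing, "id")); // one unambiguous field
```

A low score like the one for `price` here is the kind of case where surfacing the uncertainty to the user helps more than silently picking one field.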
hoerzu
Love it, is there also a possibility for alarms if schema changes?
sfaist
working on it... ping me if you have a usecase in mind and I can set it up for you.
tayloramurphy
Does this have any connection to the previous "Supaglue" startup [0]? Similar problem space, slightly different/pre-llm solution.
adinagoerres
we're not affiliated
Hi HN, we’re Stefan and Adina, and we’re building superglue (https://superglue.cloud). superglue allows you to connect to any API/data source and get the data you want in the format you need. It’s an open-source proxy server which sits between you and your target APIs. Thus, you can easily deploy it into your own infra.
If you’re spending a lot of time writing code connecting to weird APIs, fumbling with custom fields in foreign language ERPs, mapping JSONs, extracting data from compressed CSVs sitting on FTP servers, and making sure your integrations don’t break when something unexpected comes through, superglue might be for you.
Here's how it works: You define your desired data schema and provide basic instructions about an API endpoint (like "get all issues from Jira"). superglue then does the following:
- Automatically generates the API configuration by analyzing API docs.
- Handles pagination, authentication, and error retries.
- Transforms response data into the exact schema you want using JSONata expressions.
- Validates that all data coming through follows that schema, and fixes transformations when they break.
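As a rough picture of the pagination and transformation steps in the list above, the sketch below follows cursors across pages and maps each raw item into a target shape. Everything in it is an assumption for illustration: the `Page` shape, the in-memory `fakeFetch`, and the hand-written `toIssue` function, which stands in for a generated JSONata expression. It does not use superglue's SDK.

```typescript
// Hypothetical page shape and fetcher; real APIs expose cursors in different ways.
type Raw = Record<string, unknown>;
type Page = { items: Raw[]; nextCursor: string | null };
type FetchPage = (cursor: string | null) => Page;

// The target schema the caller asked for.
type Issue = { key: string; title: string };

// Follow cursors until the API reports no further page, collecting every item.
// maxPages guards against an API that never stops returning a cursor.
function fetchAll(fetchPage: FetchPage, maxPages = 100): Raw[] {
  const all: Raw[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page = fetchPage(cursor);
    all.push(...page.items);
    if (page.nextCursor === null) break;
    cursor = page.nextCursor;
  }
  return all;
}

// Hand-written stand-in for a generated mapping: pick and rename nested fields.
function toIssue(raw: Raw): Issue {
  const fields = (raw["fields"] ?? {}) as Raw;
  return { key: String(raw["key"]), title: String(fields["summary"]) };
}

// In-memory fake of a two-page API response (made-up data).
const pages: Record<string, Page> = {
  start: { items: [{ key: "PROJ-1", fields: { summary: "Login fails" } }], nextCursor: "p2" },
  p2: { items: [{ key: "PROJ-2", fields: { summary: "Add export" } }], nextCursor: null },
};
const fakeFetch: FetchPage = (cursor) => {
  const page = pages[cursor ?? "start"];
  if (page === undefined) throw new Error(`unknown cursor: ${cursor}`);
  return page;
};

const issues: Issue[] = fetchAll(fakeFetch).map(toIssue);
```

The caller only ever sees `Issue` objects; page boundaries and the nested source layout stay behind the proxy.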
We built this after noticing how much of our team's time was spent building and maintaining data integration code. Our approach is a bit different to other solutions out there because we (1) use LLMs to generate mapping code, so you can basically build your own universal API with the exact fields that you need, and (2) validate that what you get is what you’re supposed to get, with the ability to “self-heal” if anything goes wrong.
You can run superglue yourself (https://github.com/superglue-ai/superglue - license is GPL), or you can use our hosted version (https://app.superglue.cloud) and our TS SDK (npm i @superglue/client).
Here’s a quick demo: https://www.youtube.com/watch?v=A1gv6P-fas4
You can also try out Jira and Shopify demos on our website (https://superglue.cloud).
Excited to share superglue with everyone here—it's early so you'll probably find bugs, but we'd love to get your thoughts and see if others find this approach useful!