Sj.h: A tiny little JSON parsing library in ~150 lines of C99
52 comments
September 21, 2025
layer8
The library doesn’t check for signed integer overflow here:
https://github.com/rxi/sj.h/blob/eb725e0858877e86932128836c1...
https://github.com/rxi/sj.h/blob/eb725e0858877e86932128836c1...
https://github.com/rxi/sj.h/blob/eb725e0858877e86932128836c1...
Certain inputs can therefore trigger UB.
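A minimal sketch of the kind of guard that would avoid this, assuming the counters in question are plain `int` values bumped one step at a time (the links above are truncated, so the exact lines are not reproduced here). `checked_increment` is a hypothetical helper, not part of sj.h:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of sj.h: bump an int counter only if doing
 * so cannot overflow. Returns false and leaves *counter unchanged otherwise. */
static bool checked_increment(int *counter) {
    assert(counter != NULL);
    if (*counter == INT_MAX) {
        return false;  /* incrementing would be signed overflow, which is UB */
    }
    *counter += 1;
    return true;
}
```

A parser using this would turn a `false` return into its normal error path instead of continuing with a wrapped or undefined value.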
hypeatei
You're not aware of the simplistic, single header C library culture that some developers like to partake in. Tsoding (a streamer) is a prime example of someone who likes developing/using these types of libraries. They acknowledge that these things aren't focused on "security" or "features" and that's okay. Not everything is a super serious business project exposed to thousands of paying customers.
layer8
Hobby projects that prove useful have a tendency of starting to be used in production code, and then turning into CVEs down the road.
If there is a conscious intent of disregarding safety as you say, the Readme should have a prominent warning about that.
hypeatei
> Hobby projects that prove useful have a tendency of starting to be used in production code
Even if that is true, how is that the author's problem? The license clearly states that they're not responsible for damages. If you were developing such a serious project then you need the appropriate vetting process and/or support contracts for your dependencies.
vrighter
then that is their problem, not the code author's. If you use a hobby project in production, that's on you
zwnow
So if it's a hobby project designed for just a handful of people, it's suddenly okay to endanger them due to being sloppy?
hypeatei
This is an open source project that you're not obligated to use nor did you pay for it. Who is it endangering?
The license also makes it clear that the authors aren't liable for any damages.
skydhash
There was a nice article [0] about bloated edge cases libraries (discussion [1]).
Sometimes, it's just not the responsibility of the library. Trying to handle every possible error is a quick way to complexity.
[0]: https://43081j.com/2025/09/bloat-of-edge-case-libraries
klysm
Strongly disagree here because JSON can come from untrusted sources and this has security implications. It's not the same kind of problem that the bloat article discusses where you just have bad contracts on interfaces.
layer8
The problem in the present case is that the caller is not made aware of the limitation, so can’t be expected to prevent passing unsupported input, and has no way to handle the overflow case after the fact.
skydhash
Do you not review libraries you add to your project? A quick scan of the issues page if it's on a forge? Or just reading through the code if it's small enough (or select functions)?
Code is the ultimate specification. I don't trust the docs if the behavior is different from what they say (or, more often, fail to mention). And anything that deals with recursive structures (or looping without a clear counter and checks) is one of my first candidates for checks.
> has no way to handle the overflow case after the fact.
Fork/Vendor the code and add your assertions.
flykespice
There is no easy way out when you're working with C: either you handle all possible UB cases with exhaustive checks, or you move on to another language.
(TIP: choose the latter)
odie5533
Can't use this library in production that's for sure.
ricardobeat
An int will be 32 bits on any non-ancient platform, so this means:
- a JSON file with nested values exceeding 2 billion depth
- a file with more than 2 billion lines
- a line with more than 2 billion characters
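One caller-side way to rule out all three cases at once is to refuse oversized input before parsing. This is only a sketch. It assumes each of those counters starts at 0 or 1 and advances at most once per input byte, which is an assumption about the library, not something confirmed here. `input_fits_int_counters` is a hypothetical name:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-check, not part of sj.h. If a counter starts at 1 and
 * grows by at most 1 per byte, it stays <= INT_MAX whenever len < INT_MAX. */
static bool input_fits_int_counters(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    (void)data;  /* only the length matters for this check */
    return len < (size_t)INT_MAX;
}
```

Inputs that fail the check would have to be rejected or handed to a different parser.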
klysm
2 billion characters seems fairly plausible to hit in the real world
layer8
All very possible on modern platforms.
Maybe more importantly, I won’t trust the rest of the code if the author doesn’t seem to have the finite range of integer types in mind.
johnisgood
Personally, all my C code is written with SEI C Coding Standard in mind.
EmilStenstrom
Submit a PR!
codr7
JSON parser libraries in general are a black hole of suffering imo.
They're either written with a different use case in mind, or a complex mess of abstractions; often both.
It's not a very difficult problem to solve if you only write exactly what you need for your specific use case.
mbac32768
It's astonishing how involved a fucking modern JSON library becomes.
The once "very simple" C++ single-header JSON library by nlohmann is now
* 13 years old
* is still actively merging PRs (last one 5 hours ago)
* has 122 __million__ unit tests
Despite all this, it's self-admittedly still not the fastest possible way to parse JSON in C++. For that you might want to look into simdjson.
Don't start your own JSON parser library. Just don't. Yes you can whiteboard one that's 90% good enough in 45 minutes but that last 10% takes ten thousand man hours.
forty
Parsing JSON is a Minefield (2016)
flohofwoe
You can't get much more 'opinion-less' than this library though. Iterate over keys and array items, identify the value type and return string-slices.
IshKebab
It also feels like only half the job to me. Reminds me of SAX "parsers" that were barely more than lexers.
flohofwoe
I mean, what else is there to do when iterating over a JSON file? Delegating number parsing and UNICODE handling to the user can be considered a feature (since I can decide on my own how expensive/robust I want this to be).
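As an illustration of what that delegated number parsing can look like, here is a sketch that converts a string slice to a `long` without allocating or needing a NUL terminator. It assumes the slice is given as a `[start, end)` pointer pair. `slice_to_long` is a hypothetical helper, it handles integers only (no fractions or exponents), and it does not reject leading zeros:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Parse the bytes in [start, end) as a decimal integer with an optional
 * leading '-'. Returns false on empty input, a non-digit byte, or a value
 * that does not fit in long. */
static bool slice_to_long(const char *start, const char *end, long *out) {
    assert(start != NULL && end != NULL && out != NULL && start <= end);
    const char *p = start;
    bool negative = false;
    if (p < end && *p == '-') { negative = true; p++; }
    if (p == end) return false;
    long value = 0;  /* accumulated as a non-positive number so LONG_MIN fits */
    for (; p < end; p++) {
        if (*p < '0' || *p > '9') return false;
        int digit = *p - '0';
        /* Reject before value * 10 - digit could drop below LONG_MIN. */
        if (value < (LONG_MIN + digit) / 10) return false;
        value = value * 10 - digit;
    }
    if (!negative) {
        if (value == LONG_MIN) return false;  /* magnitude exceeds LONG_MAX */
        value = -value;
    }
    *out = value;
    return true;
}
```

How strict to be (floats, big integers, leading zeros) is exactly the cost/robustness decision that ends up with the caller.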
nicce
The project advertises zero allocations with minimal state. I don’t think that is a fair claim, or else our problems are very different. A single string (the most used type), and you need an allocation.
EE84M3i
This is interesting, but how does this do on the conformance tests?
LegionMammal978
It doesn't seem to have much in the way of validation, e.g., it will indiscriminately let you use either ] or } to terminate an object or array. Also, it's more lenient than RFC or json.org JSON in allowing '\v' for whitespace. I'd treat it more as a "data extractor for known-correct JSON". But even then, rolling your own string or number parser could get annoying, unless the producer agrees on a subset of JSON syntax.
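For input that is not known to be correct, a caller could run a separate structural pre-check. The sketch below verifies only that `{`/`}` and `[`/`]` pair up outside of strings. It is not a full JSON validator, `brackets_balanced` is a hypothetical helper, and `MAX_DEPTH` is an arbitrary limit picked for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64  /* arbitrary nesting limit chosen for this sketch */

/* Returns true if every '{' or '[' outside a string is closed by the
 * matching '}' or ']'. Checks bracket structure only. */
static bool brackets_balanced(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    char stack[MAX_DEPTH];
    size_t depth = 0;
    bool in_string = false;
    for (size_t i = 0; i < len; i++) {
        char c = data[i];
        if (in_string) {
            if (c == '\\') { i++; }                 /* skip the escaped byte */
            else if (c == '"') { in_string = false; }
            continue;
        }
        switch (c) {
        case '"': in_string = true; break;
        case '{': case '[':
            if (depth == MAX_DEPTH) return false;   /* nested too deeply */
            stack[depth++] = c;
            break;
        case '}': case ']':
            if (depth == 0) return false;           /* stray closer */
            if (stack[--depth] != (c == '}' ? '{' : '[')) return false;
            break;
        default: break;
        }
    }
    return depth == 0 && !in_string;
}
```

The fixed-size stack keeps the check allocation-free, at the price of rejecting documents nested deeper than `MAX_DEPTH`.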
catlifeonmars
You know what would really be useful is a conformance test based on a particular real implementation.
What I mean by this is a subset (superset?) that exactly matches the parsing behavior of a specific target parsing library. Why is this useful? To avoid the class of vulnerabilities that rely on the same JSON being handled differently by two different parsers (you can exploit this to get around an authorization layer, for example).
Lucas_Marchetti
Real question, does it manage nested objects ?
morcus
Lucas_Marchetti
yep but how deep can you parse nested into nested etc
mbel
It feels like a stretch to call this a parser. It looks like a typical lexer?
adrianN
What’s the usecase for something like this? There are lots of excellent libraries for json available. Is this a teaching tool?
elcapitan
Being able to parse without a lot of overhead and without allocations is quite interesting. E.g. when you process some massive json dump to just extract some properties (the Wikidata dumps come to mind).
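For that kind of property extraction, key matching can also stay allocation-free. A sketch, assuming keys are handed back as `[start, end)` slices into the original buffer with the surrounding quotes excluded; `slice_equals` is a hypothetical helper and it compares raw bytes, so keys containing escape sequences would not match:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare the bytes in [start, end) against a NUL-terminated literal
 * without copying or allocating. Escapes in the slice are not decoded. */
static bool slice_equals(const char *start, const char *end,
                         const char *literal) {
    assert(start != NULL && end != NULL && literal != NULL && start <= end);
    size_t n = strlen(literal);
    return (size_t)(end - start) == n && memcmp(start, literal, n) == 0;
}
```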
flohofwoe
Trivial to integrate into an existing code base, minimal size overhead, no heap allocations, no stdlib usage (only stdbool.h and stddef.h included for type definitions), no C++ template shenanigans and very simple and straightforward API. C libraries which tick all those boxes are actually quite rare, and C++ libraries are much rarer.
bb88
Embedded CPUs are an easy one. You could maybe run an API server on a vape now.
Snild
Small code is easier to review, so projects with strict security requirements might be one?
Also, license compliance is very easy (no notice required).
CyberDildonics
A small single file, pure C dependency that doesn't allocate memory can be a universal solution to a common problem if it works well.
binary132
the more the merrier
p2detar
> Zero-allocations with minimal state
adsan
I suppose it's intended as a minimal library, meant to be modded for the specific use case.
sim7c00
this is really nice. i also _must_ use it because my initials are S.J H.. :').
on the more code side, love this, been looking to implement a simple json parser for some projects but this is small enough i can study it and either learn what i need or even use it. lovely!
fnord77
I can see one bug just glancing at the code - feeding a stray '}' at the top level can result in depth becoming negative
flohofwoe
That's detected as an error though?
https://github.com/rxi/sj.h/blob/eb725e0858877e86932128836c1...
What I love about this author's work is that they're usually single-file libraries in ANSI C or Lua with focused scope, easy-to-use interface, and good documentation. And free software license. Aside from the posted project, some I like are:
- log.c - A simple logging library implemented in C99
- microui - A tiny immediate-mode UI library
- fe - A tiny, embeddable language implemented in ANSI C
- microtar - A lightweight tar library written in ANSI C
- cembed - A small utility for embedding files in a C header
- ini - A tiny ANSI C library for loading .ini config files
- json.lua - A lightweight JSON library for Lua
- lite - A lightweight text editor written in Lua
- cmixer - Portable ANSI C audio mixer for games
- uuid4 - A tiny C library for generating uuid4 strings