
Robust Wavefront OBJ model parsing in C

virtualritz

Curiously, what people commonly refer to as 'Wavefront OBJ' is merely a tiny subset of that format, i.e. the part dealing with polygons.

The format also supports e.g. higher-order curves and surfaces, and apps like Maya or Rhino3D can read and write OBJ files containing such data. [1]

Writing a parser for the polygon subset also comes with some caveats.

If your target is a GPU you probably need to care for robust triangulation of n-gons and making per-face-per-vertex data per-vertex on disconnected triangles.

Vice versa, if you are feeding data to an offline renderer, you absolutely want to preserve such information.

I believe the tobj Rust crate is one of the few OBJ importers that handles all edge cases. [2] If you think it doesn't, let me know and I will fix that.

This is surprising for people familiar with one but not the other of the requirements of offline- or GPU rendering.

I.e. if you write an OBJ reader this can become a challenge; see e.g. an issue I opened here [3].

1. https://paulbourke.net/dataformats/obj/

2. https://docs.rs/tobj/latest/tobj/struct.LoadOptions.html

3. https://github.com/assimp/assimp/issues/3677

the__alchemist

How does this compare to the `obj` crate? I'm assuming that doesn't handle cases beyond the common one well? I ask because I have a 3D rendering/GUI application lib in rust (`graphics` crate), and for OBJ files, I thinly wrap that to turn it into a mesh.

In my own applications, it hasn't come up, as I've been mostly using primitives and dynamically-generated meshes, but am wondering if I should switch.

grandempire

> robust triangulation of n-gons and making per-face-per-vertex data per-vertex on disconnected triangles.

This is a simple post-process step after parsing.
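For convex faces, a minimal sketch of that post-process could be a triangle fan. The `Corner` type and `fan_triangulate` name are hypothetical and made up for illustration, not taken from the article or any library. A fan is only valid for convex, planar n-gons; concave faces would need something like ear clipping, which this does not attempt.

```c
#include <stddef.h>

// Hypothetical type for illustration: one face corner holding the
// position / texcoord / normal indices parsed from an "f" line.
typedef struct { int v, vt, vn; } Corner;

// Fan-triangulate one n-gon into n-2 disconnected triangles.
// Writes 3*(n-2) corners to out and returns the number of corners written
// (0 when n < 3). Assumes the face is convex and planar.
static size_t fan_triangulate(const Corner *face, size_t n, Corner *out)
{
    size_t k = 0;
    for (size_t i = 1; i + 1 < n; i++) {
        out[k++] = face[0];      // every triangle shares the first corner
        out[k++] = face[i];
        out[k++] = face[i + 1];
    }
    return k;
}
```

Because each output corner carries its own index triple, the per-face-per-vertex data ends up per-vertex on the resulting triangles; deduplicating identical triples into an index buffer would be a separate step.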

suspended_state

That's very nice work, and many interesting concepts introduced in the post (for example, arenas, length bounded strings, the Cut struct).

One caveat though:

> If the OBJ source cannot fit in memory, then the model won’t fit in memory.

I don't think that this is true: the textual representation of a (single-precision) float is typically as large as or larger than its 4-byte binary representation, which is the float type used in the renderer given later in the post. The numbers in the cube example are unlikely to occur in real-world models, where one would expect more than 2 digits of decimal precision. That being said, for double-precision floats it might be true in many scenarios, but I would not make it a cardinal rule anyway.
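To put rough numbers on it, here is a small sketch that measures how many bytes one `v x y z` line takes as text. The six decimal places are an assumption about typical exporter output, and `vertex_text_size` is a hypothetical helper name.

```c
#include <stdio.h>

// Number of bytes needed to write one "v x y z\n" line with six decimal
// places (an assumed precision). snprintf with a null buffer and size 0
// only computes the length, without writing anything.
static int vertex_text_size(float x, float y, float z)
{
    return snprintf(0, 0, "v %.6f %.6f %.6f\n", x, y, z);
}
```

Under that assumption, a vertex such as (0.123456, -1.234567, 2.345678) takes 30 bytes as text versus 3 * sizeof(float) = 12 bytes once parsed.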

This corner cut fits within the objective of the post, which, imho, isn't to make the most efficient program, but provide a great foundation in C to build upon.

dahart

The sentence you quoted must be true because the input file and the output binary model both need to fit in memory at the same time.

grandempire

The technique shown can be easily adapted to mmap.
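A minimal POSIX sketch of what I mean, assuming the parser takes its input as a pointer-plus-length string. The `Str` layout and the `map_file`/`unmap_file` names are assumptions for illustration, not the article's code.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Assumed pointer-plus-length string type.
typedef struct { char *data; ptrdiff_t len; } Str;

// Map a whole file read-only. Returns a zeroed Str on failure or when
// the file is empty (mmap rejects zero-length mappings).
static Str map_file(const char *path)
{
    Str r = {0};
    int fd = open(path, O_RDONLY);
    if (fd < 0) return r;
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void *p = mmap(0, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            r.data = p;
            r.len  = (ptrdiff_t)st.st_size;
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return r;
}

// Release a mapping obtained from map_file.
static void unmap_file(Str s)
{
    if (s.data) munmap(s.data, (size_t)s.len);
}
```

The parser then walks the mapped bytes exactly as it would a buffer read into memory, and the kernel pages the file in on demand.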

pixelesque

As someone who has written multiple OBJ readers over the years, this is interesting, but it notably seems to ignore texture coords (UV coords) and doesn't support object groups.

Also, OBJ material support is an absolute nightmare if you ever try to support it: there's technically a sort of original standard (made around 30 years ago, so understandably somewhat messy given how far materials and physically-based shading have come in the meantime), but different DCCs do vastly different things, especially for things like texture paths and specular/roughness...

milesrout

I think it doesn't support 'vt' because the techniques are adequately demonstrated just with faces and normals, so it would be more code without serving any pedagogical purpose. The author would, I think, not suggest you copy this code and try to use it as a library or something, but that you should develop the skillset to be able to write code like this when you need it.

tylermw

Note that there's a great C99/C++ single header library, tinyobjloader, that provides robust (in my experience) and feature-full OBJ loading, including triangulation of n-gons and full material parsing.

https://github.com/tinyobjloader/tinyobjloader

It's fairly mature and handles many of the parsing footguns you'll inevitably run into trying to write your own OBJ parser.

pjmlp

The usual rite of passage into 3D programming in the old days, adding all the things that OpenGL doesn't do out of the box like in other 3D frameworks, naturally the 3D assets loading code was OBJ based.

Nowadays you can have the same fun by rewriting the previous sentence using Vulkan instead of OpenGL, and glTF instead of OBJ.

rossant

"The same fun" but also likely orders of magnitude more effort (and headaches).

pjmlp

Indeed, at least now there is an SDK as starting point.

smcameron

Heh. I feel that I might be somewhat responsible for the existence of this article. Perhaps merely a coincidence, but this happened a few days ago: https://old.reddit.com/r/C_Programming/comments/1itrhd9/blat... in which the author schools me on the topic of American Fuzzy Lop which he applied to my home made OBJ parser.

animal531

This is one of those things where, for literally every 3D tool you test it against, you're going to find new edge cases that break the code.

bvrmn

> Str substring(Str s, ptrdiff_t i)

The function has quite questionable implementation. It fails miserably for strings with length < i.

Joker_vD

Only because every other Str-accepting function uses "s.len" instead of "s.len > 0" as the "is s non-empty" test.

Still, this function is called only once, and in that call, its i argument is always <= length, so it's perfectly fine (it's only UB if you actually pass it a bad argument).

bvrmn

> Still, this function is called only once, and in that call, its i argument is always <= length, so it's perfectly fine (it's only UB if you actually pass it a bad argument).

This very mindset is a source of bugs and vulnerabilities. The author has high marks from me on safety and "make it hard to use wrong" and it's quite surprising to see such code.

grandempire

Satisfying preconditions is a requirement to make functioning programs.

The insanity would be assuming that every function is valid for the Cartesian product of all possible values of its arguments.

What he probably needs is an assert.
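Something along these lines, as a sketch only. The `Str` layout is assumed from the quoted signature, and the body is my guess at a reasonable implementation rather than the article's actual code.

```c
#include <assert.h>
#include <stddef.h>

// Assumed layout, inferred from the quoted signature.
typedef struct { char *data; ptrdiff_t len; } Str;

// Return the suffix of s starting at index i.
// Precondition: 0 <= i <= s.len, checked by the assert in debug builds.
static Str substring(Str s, ptrdiff_t i)
{
    assert(i >= 0 && i <= s.len);
    if (i) {            // skip pointer arithmetic on a null data pointer
        s.data += i;
        s.len  -= i;
    }
    return s;
}
```

The assert documents the precondition and traps violations while testing; with NDEBUG defined it compiles away, so it costs nothing in release builds.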

UncleEntity

Reminds me of the time I was chastised for adding a NULL check to keep <program> from segfaulting by the dev responsible for said segfault because crashing without even as much as a warning was "intended behavior". IIRC this was over reading a file from disk and just assuming it existed.

writebetterc

This way of writing programs is also quite a lot faster than depending on fgetline and the like. The integer and float parsing is probably slow, though.

My question is: Does the author actually use Windows XP?

claytonaalves

> Does the author actually use Windows XP?

I've switched to XP (from Windows 7, on a VM) and the performance is astounding even on limited hardware settings. No bloatware, just good old Win32 x86.

MSFT_Edging

I recently pulled an old laptop out of the closet with a mostly stock image of XP to play with an old device and it felt so snappy.

It's sad how bloated things have gotten.

1f60c

Is it connected to the network?

oguz-ismail

Mine is. Why?

kilpikaarna

> My question is: Does the author actually use Windows XP?

Significant overlap between the types of people who use WinXP and write 3D file format importers in C, I think! Though I prefer 7 myself.

kleiba

Who can do it as a one liner with a regex?

creaktive

Been there, done that... Not worth it

turnsout

This sent me down a rabbit hole reading about the author's style of having an "Arena allocator," [0] which was fascinating. I often did something similar when writing ANSI C back in the day—allocate a big enough chunk of memory to operate, and do your own bookkeeping. But his Arena implementation looks more flexible and robust.

  [0]: https://nullprogram.com/blog/2023/09/27/
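For anyone who hasn't seen the idea, here is a minimal bump-allocator sketch of "allocate one big chunk and do your own bookkeeping". This is not the author's implementation (see [0] for that); the `Arena` layout and `arena_alloc` name are hypothetical, and this version returns a null pointer when the arena is exhausted.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical arena: a region [beg, end) handed out front to back.
typedef struct { char *beg, *end; } Arena;

// Allocate size bytes aligned to align (which must be a power of two).
// Returns a null pointer if the request does not fit.
static void *arena_alloc(Arena *a, ptrdiff_t size, ptrdiff_t align)
{
    // Padding needed to round beg up to the next multiple of align.
    ptrdiff_t pad = -(uintptr_t)a->beg & (align - 1);
    if (size < 0 || size > (a->end - a->beg) - pad) return 0;
    void *p = a->beg + pad;
    a->beg += pad + size;   // bump the start pointer past this allocation
    return p;
}
```

There is no per-object free: everything is released at once by resetting `beg` or dropping the whole region, which is what makes the bookkeeping so small.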