Rethinking DOM from first principles
75 comments
· August 6, 2025 · strogonoff
ale
While the web has grown complex in line with increasingly complex applications, the platform is also undeniably bloated, precisely because every new feature (like the HTML-in-Canvas proposal) has to be shoehorned into an already very fragmented puzzle. Backwards compatibility has become an idealistic badge of honor rather than a technical feat. I believe the article does a good job of getting into the technical parts that are in fact not so remarkably robust, despite the web's organic growth. Even a bonsai tree needs to be pruned every once in a while. And while there will never be One Correct Way, the way we engineer interfaces has converged a lot since the days of Flash, meaning we can at least move the conversation forward in the Mostly Agreed Upon Way.
dleeftink
I'm reminded of how in the modular synthesis world Eurorack has a standard 3U height but still allows other unit sizes to be fitted (e.g. 1U, 5U). Similarly, voltages and connections can be tweaked to your heart's content, as long as there exist appropriate adapters and converters in between.
We already have similar software development patterns, but I wonder what a Web API surface would look like when fully embracing a similar modular mindset.
strogonoff
I wouldn’t say that it’s not bloated or that it does not deserve to shed some legacy functionality, but I’m impressed at how it’s not more bloated and dysfunctional, given the circumstances.
ezst
> I struggle to imagine how it could realistically be not complex.
Pretty easy: we should have had 2 standards, one being "Web for applications", built on a VM, stdlib, bytecode, RPC, UI framework and standard library of controls, ... And "Web for web pages", which was a solved problem in the pre-HTML5 days.
Java and Flash (although very problematic from a security point of view) were probably better bases on which to build "Web for applications" than HTML5+/CSS3+/JS/WASM will ever be, but it was intolerable for Google/Microsoft/Mozilla to hand the keys to Oracle/Adobe for that. It's all politics, and it's all worse and more complicated and more inefficient as a result.
davidmurdoch
You'd still have all the same backwards compatibility problems with the "Web for applications" stdlib and UI framework as we do now.
dvh
Nobody would be using the web pages version and everybody would be using the app version.
strogonoff
I would argue that a good layout engine for pages could see good use. See Gemini/Gopher—people like minimalism, but having to use a separate browser makes it probably too niche.
Contrary to what was stated in the suggestion, it’s hardly a solved problem—improvements are being made at a steady pace (grids or { text-wrap: pretty } come to mind)—but it is interesting to imagine, especially if there were extra compelling reasons for engineers to restrict themselves to purer hypertext document API if possible (for example, it could be much more stable, while the app part could be unlocked to evolve more quickly but would be more demanding to developers keeping their webapps up-to-date; search engines could prioritize it; there could be hosting services specializing in it; and so on).
One counter-argument is that you can’t neatly separate the two and engineers will definitely want to use both; one way around it I can see is if hypertext document functionality was possible to use from within the webapp somehow, but I haven’t given it that much thought obviously.
7bit
As for vertical centering, trying this 20 years ago was a pain in the ass. This, and layout in general, got massively simplified in the past 20 years. Remember that back then, everything was crammed into tables? I do. So if someone tells me, "DOM, HTML and CSS are becoming more and more difficult", it just tells me to ignore their amateurish and unfounded opinion... Hah!
benoau
20 years ago it was much easier, you could ethically use tables for layouts which semantically did not make much sense, but in terms of code structure it was very simple: this table is the full size of the page, this cell is the header with this height, this cell is the sidebar with this width, this cell occupies the rest of the space with content positioned in the middle, all expressible without any styling at all.
strogonoff
Even if using { display: grid }, or a combination of flex rules, hardly seems like an ideal solution, considering there does exist { text-align: center } (which, incidentally, doesn’t only align the text!), I do agree with the general point. Today it is reasonably easy to make quite complex grid-driven layouts; not easy—often, if not always, making a complex thing easy requires also making it restrictive and opinionated—but perhaps easy enough considering the end result.
troupo
> Yes, some applications tend to have a large amount of markup for what seems like simple features (the Slack’s input box example). However, the alternative is that browser vendors bake it all in, and then every app is stuck with the opinionated way they think is right. Perhaps some amount of chaos is healthy.
Or, you know, provide a set of usable controls that provide useful functionality out of the box and provide a set of useful APIs so that people can either extend those controls or create their own.
The Web Platform provides neither. Compare that to literally every other UI toolkit under the sun. Turbo Vision from the 1990s was a better toolkit than anything the web has to offer.
vlindhol
I'm tempted to take the opposite stance to the author. The web as a platform is wildly successful, and it's interesting to think about why.
Surely the "loose" standards encouraged neat hacks that at some point were encoded as best practices and then standardized. Maybe that would tempt us to want to "cut the cruft" but a) people probably thought that many times previously and b) backwards compatibility is probably more valuable than one would think.
austin-cheney
Uggghhh, the article states correct facts about the DOM but grossly incorrect conclusions. Most developers have always feared working with the DOM. This irrationality is not new. I have no idea why, but tree models scare the shit out of college educated developers. That’s supremely weird because computer science education spends so much energy on data structures and tree models.
It also makes the conversation about WASM even more bizarre. Most college educated developers are scared of the DOM. Yes, it’s fear the emotion and it’s completely irrational. Trust me on this as I have watched it as a prior full time JS dev for over 15 years. Developers are continuously trying to hide from the thing with layers of unnecessary abstractions and often don’t know why because they have invested so much energy in masking their irrational nonsense.
Other developers that have not embraced this nightmare of emotions just simply wish WASM would replace JS so they don’t have to touch any of this. This is problematic because you don’t need anything to do with JS or the DOM to deploy WASM, but it’s a sandbox that ignores the containing web page, which is absolutely not a replacement. For WASM to become a replacement it would have to gain full DOM access to the containing page. Browser makers have refused to do that for clear security reasons.
So you get people investing their entire careers trying to hide from the DOM with unnecessary abstractions and then other developers that want to bypass the nonsense by embracing the thing they don’t yet know they are afraid of.
That is super fucking weird, but it makes for fun stories to nondevelopers that wonder why software is the way it is.
assimpleaspossi
I closed my web dev business just three years ago. I found that many people who work with the web don't want to do the work to understand how it all works. They think there must be a library somewhere to do "that" while doing "that" is simple enough using standard components and features.
Another issue is people basing their fears on things in the past. Yes, it used to be more difficult to do fancy things on the web, but often they were trying to push the web to do things it just couldn't do back then. Now you can, using basic, built-in functionality, and it's often easier that way.
afiori
The reason WASM does not have DOM access is that many recent DOM APIs return and expect JavaScript objects and classes like iterators, so you would still need some thin JS glue wrapper between the DOM and WASM. Security has nothing to do with it, as (performance aside) WASM plus minimal JS glue can already do anything JS can do.
chrismorgan
> For WASM to become a replacement it would have to gain full DOM access to the containing page.
To become a total replacement, as in no-JavaScript-at-all-needed, sure, WASM would need to be able to access the DOM. But to replace JavaScript as the language you’re writing, you can easily generate DOM bindings so you trampoline via JavaScript, and people have been doing this for as long as WASM has been around.
Giving WASM direct DOM access does not enable anything new: it merely lets you slim down your JS bindings and potentially improve time or memory performance.
> Browser makers have refused to do that for clear security reasons.
Actually, they’re a long way down the path of doing it. They’ve just been taking their time to make sure it’s done right—they’ve headed in at least three different directions so far. But it’s been clear from just about the start that it was an eventual goal.
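To make the "trampoline via JavaScript" idea concrete, here is a minimal sketch, assuming a Node-like runtime with WebAssembly available. The module bytes are hand-assembled for illustration, the import name env.set is made up, and fakeElement is a hypothetical stand-in for a real DOM node so the sketch runs outside a browser. Real toolchains generate this kind of glue for you.

```typescript
// Hypothetical stand-in for a DOM node, so the sketch runs without a browser.
const fakeElement = { textContent: "" };

// Looked up through globalThis so the sketch does not assume DOM typings are loaded.
const WA = (globalThis as any).WebAssembly;

// Hand-assembled module (illustrative): imports env.set(i32) and
// exports run(), whose body is `i32.const 42; call 0`.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import module "env"
  0x03, 0x73, 0x65, 0x74, 0x00, 0x00,                         // field "set", func, type 0
  0x03, 0x02, 0x01, 0x01,                                     // one local function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export "run" = func index 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0; end
]);

// The JS glue: WASM cannot touch the element itself, so it calls this import,
// and the import performs the "DOM" write on its behalf.
const glue = {
  env: {
    set: (value: number): void => {
      fakeElement.textContent = String(value);
    },
  },
};

const wasmInstance = new WA.Instance(new WA.Module(wasmBytes), glue);
wasmInstance.exports.run();

console.log(fakeElement.textContent);
```

The WASM side only ever sees numbers and imported functions. Everything that needs a JavaScript object goes through the glue, which is the part direct DOM access would let you slim down.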
continuational
The reason working with the DOM directly is hard is that you have to implement arbitrary patching to go from one state to another.
The entire point of frameworks like React is to avoid the problem, by automatically creating and applying the patch for you.
It's not irrational; quite the contrary.
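As a rough illustration of what "creating the patch for you" means, here is a minimal sketch of a tree diff, assuming a made-up VNode shape and a made-up patch format. It is not how React or any particular framework actually implements reconciliation (there are no keys, attributes, or moves here), only the general idea of deriving edits from two states.

```typescript
// Illustrative virtual node: a tag, optional text, optional children.
type VNode = { tag: string; text?: string; children?: VNode[] };

// Illustrative patch operations, addressed by a path of child indices.
type Patch =
  | { op: "replace"; path: number[]; node: VNode }
  | { op: "setText"; path: number[]; text: string }
  | { op: "append"; path: number[]; node: VNode }
  | { op: "remove"; path: number[] };

function diff(prev: VNode, next: VNode, path: number[] = []): Patch[] {
  // Different tags: give up on fine-grained patching and replace the subtree.
  if (prev.tag !== next.tag) return [{ op: "replace", path, node: next }];

  const patches: Patch[] = [];
  if ((prev.text ?? "") !== (next.text ?? "")) {
    patches.push({ op: "setText", path, text: next.text ?? "" });
  }

  const a = prev.children ?? [];
  const b = next.children ?? [];
  const shared = Math.min(a.length, b.length);

  // Children present in both trees are compared position by position.
  for (let i = 0; i < shared; i++) {
    patches.push(...diff(a[i], b[i], [...path, i]));
  }
  // Extra children in the new tree are appended to this node.
  for (let i = shared; i < b.length; i++) {
    patches.push({ op: "append", path, node: b[i] });
  }
  // Surplus old children are removed from the end so earlier indices stay valid.
  for (let i = a.length - 1; i >= shared; i--) {
    patches.push({ op: "remove", path: [...path, i] });
  }
  return patches;
}

const prevTree: VNode = {
  tag: "ul",
  children: [{ tag: "li", text: "one" }, { tag: "li", text: "two" }],
};
const nextTree: VNode = {
  tag: "ul",
  children: [{ tag: "li", text: "one" }, { tag: "li", text: "2" }, { tag: "li", text: "three" }],
};

const patches = diff(prevTree, nextTree);
console.log(JSON.stringify(patches));
```

Without something like this, every state transition in the app needs its own hand-written sequence of DOM mutations, which is the "arbitrary patching" problem described above.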
alpha_squared
Svelte seems to do this just fine. It's much simpler to work with, doesn't introduce too much proprietary code, and is both lightweight and incredibly fast.
troupo
> I have no idea why, but tree models scare the shit out of college educated developers.
Very few people are "scared" of tree models.
The problem of working with the DOM is that it's:
- A 90s Java-like, verbose, unwieldy API that requires tons of boilerplate to do the simplest things
- Extremely anemic API that is neither low-level enough to let you do your own stuff easily, nor high-level enough to just create what you need out of existing building blocks
- An API that is completely non-composable
- A rendering system that is actively getting in the way of doing things, and where you have to be acutely aware of all the hundreds of pitfalls and corner cases when you so much as change an element border (which may trigger a full re-layout of the entire page)
- A rendering system which is extremely non-performant for anything more complex than a static web page (and it barely manages to do even that). Any "amazing feats of performance" that people may demonstrate are either very carefully coded, use the exact same techniques as other toolkits (e.g. canvas or webgl), or are absolute table stakes for anything else under the sun. I mean, a front-page article last week was about how it took 60% CPU and 25% GPU to animate three rectangles: https://www.granola.ai/blog/dont-animate-height
> So you get people investing their entire careers trying to hide from the DOM with unnecessary abstractions
The abstractions of the past 15 or so years have been trying to hide from the DOM only because the DOM is both extremely non-performant and has an API even a mother wouldn't love.
austin-cheney
This is exactly what I am talking about. All these excuses, especially about vanity, are masking behaviors.
DOM access is not quite as fast now as it was 10 years ago. In Firefox I was getting just under a billion operations per second when perf testing on hardware with slow DDR3 memory. People with more modern hardware were getting closer to 5 billion ops/second. That isn’t slow.
Chrome has always been much slower. Back then I was getting closer to a max of 50 million ops/second perf testing the DOM. Now Chrome is about half that fast, but their string interpolation of query strings is about 10x faster.
The only real performance problem is the JS developer doing stupid shit.
troupo
> All these excuses, especially about vanity, are masking behaviors.
1. These are not excuses, these are facts of life
2. No idea where you got vanity from
> DOM access is not quite as fast now as it was 10 years ago. I was getting just under a billion operations per second
Who said anything about DOM access?
> The only real performance problem is the JS developer doing stupid shit.
Ah yes. I didn't know that "animating a simple rectangle requires 60% CPU" is "developers doing stupid shit" and not the DOM being slow, because you could do meaningless "DOM access" billions of times a second.
Please re-read what I wrote and make a good faith attempt to understand it. Overcome your bias and foregone conclusions.
worthless-trash
> Browser makers have refused to do that for clear security reasons.
Because only javascript should be allowed to screw up that badly.
afiori
Some of the worst DOM APIs were designed with Java compatibility in mind.
LauraMedia
I think for a system that can basically do EVERYTHING, HTML is quite well designed. And I think keeping backwards compatibility for SO long is a big achievement and a good thing.
I also think that if we could roll back time and had the knowledge of today, instead of fixed elements with user-agent styling and hard-coded restrictions, I would've crafted a system of arbitrary nodes that can have modifiers stacked on them.
So instead of <ul> you could use <Features list>. This would minimize the very different but basically same CSS properties as well and trim out A LOT of HTML tags. Think <Comment collapsible link> instead of wrapping a <details> in an <a>.
That's basically how React and Vue started out with the component system, but I'm thinking more of a GameObject & Component system like with Unity.
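A minimal sketch of what "arbitrary nodes with stackable modifiers" could look like, loosely in the spirit of a GameObject and Component system. All names here (UiNode, Modifier, list, collapsible, link) are invented for illustration and do not correspond to any existing web or Unity API.

```typescript
// What a node looks like once all of its modifiers have been applied.
interface Resolved {
  name: string;
  traits: string[];
  attrs: Record<string, string>;
}

// A modifier is a small unit of behaviour that can be stacked on any node.
type Modifier = (node: Resolved) => void;

// Hypothetical built-in modifiers.
const list: Modifier = (n) => {
  n.traits.push("list");
};
const collapsible: Modifier = (n) => {
  n.traits.push("collapsible");
  n.attrs["open"] = "true";
};
const link = (href: string): Modifier => (n) => {
  n.traits.push("link");
  n.attrs["href"] = href;
};

// An arbitrary node: the name carries no built-in semantics, the modifiers do.
class UiNode {
  readonly name: string;
  private readonly modifiers: Modifier[] = [];

  constructor(name: string) {
    this.name = name;
  }

  add(modifier: Modifier): this {
    this.modifiers.push(modifier);
    return this;
  }

  resolve(): Resolved {
    const out: Resolved = { name: this.name, traits: [], attrs: {} };
    for (const apply of this.modifiers) apply(out);
    return out;
  }
}

// Roughly <Features list> and <Comment collapsible link> from the comment above.
const features = new UiNode("Features").add(list).resolve();
const comment = new UiNode("Comment").add(collapsible).add(link("/item?id=1")).resolve();

console.log(features.traits, comment.traits, comment.attrs);
```

The point of the design is that "collapsible" or "link" is defined once and applies to any node, rather than being tied to a specific tag such as <details> or <a>.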
parasti
I feel there's space for brainstorming and creating new ways of making web apps without having to take a stand against the status quo. It's a fun thought exercise, naming all the things you think are wrong with the web, but I had to scroll real far to see that this is a post about Use.GPU: "Use.GPU is a set of declarative, reactive WebGPU legos. Compose live graphs, layouts, meshes and shaders, on the fly." So it felt like a missed opportunity to me not to highlight that more instead of going through the list of annoyances.
AshleysBrain
It's easy to say "XYZ is dead, time to replace it with something better". Another example is the Win32 APIs are hideous (look up everything SetWindowPos does) and need replacing.
In the real world though, backwards compatibility reigns supreme. Even if you do go and make a better thing, nobody will use it until it can do the vast majority of what the old thing did. Even then, switching is costly, so a huge chunk of people just won't. Now you have two systems to maintain and arguably an even bigger mess. See Win32 vs. WinRT vs. Windows App SDK or however many else there are now.
So if you're serious about improving big mature platforms, you need a very good plan for how you will handle the transition. Perhaps a new API with a compatibility layer on top is a good approach, but the compatibility layer has to have exactly 100% fidelity, and you can never get rid of it. At the scale of these platforms, that is extremely hard. And even then, at the end of the day, with a huge compatibility layer like that, have you really made a better and less bloated system? This is why we tend to just muddle along - as much as we all like to dream, it's probably actually the best approach.
lhmiles
Very nice post. Maybe the best micro CSS basics explanation I've ever seen
b_e_n_t_o_n
Good article. It kind of makes me question how long we can go down this path though. Like surely we can't keep adding to css and the dom api's for 20 more years? How much bloat will we accumulate before we start over?
I hate to say it, but perhaps the browser needs a completely new standard designed for shipping applications? Something akin to what's discussed in the article - a simple but robust layout system built with a flexbox-like API and let us bind shaders to elements. We don't need css if we have shaders. And I don't think adding more and more features to current api's is gonna solve problems long term.
LorenDB
I'd vote for something along the lines of QML or Slint for defining UIs; business logic should be purely done in WASM.
fiedzia
> surely we can't keep adding to css and the dom api's for 20 more years?
We can. Just every now and then some new way of working becomes popular, and at some point combining them with older ones will become undefined or unsupported.
simonask
I don't know exactly what system you have in mind, but writing a performant shader is hard. Requiring that designers attach arbitrary shader code to HTML elements is an easy way to absolutely tank the performance of the web.
Flexbox also isn't great at all for many, many use cases - its performance absolutely tanks outside of the use case it was designed for, specifically a 1D flow of blocks along an axis. If you want a grid layout, choose the grid layout algorithm.
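For a sense of what "a 1D flow of blocks along an axis" means in practice, here is a heavily simplified sketch of distributing free space along a single axis in proportion to grow factors. It is an assumption-laden toy, not the CSS flexbox algorithm: it ignores shrinking, min/max constraints, wrapping, gaps, and the special case where grow factors sum to less than 1. The names FlexItem and layoutRow are made up.

```typescript
// Illustrative item: a starting size and a grow factor, loosely like flex-basis and flex-grow.
type FlexItem = { basis: number; grow: number };

// Returns the final main-axis size of each item for a container of the given size.
function layoutRow(items: FlexItem[], containerSize: number): number[] {
  const used = items.reduce((sum, item) => sum + item.basis, 0);
  // Only positive free space is handled here; overflow is left as-is.
  const free = Math.max(0, containerSize - used);
  const totalGrow = items.reduce((sum, item) => sum + item.grow, 0);

  return items.map((item) =>
    item.basis + (totalGrow > 0 ? (free * item.grow) / totalGrow : 0)
  );
}

const widths = layoutRow(
  [
    { basis: 100, grow: 1 },
    { basis: 100, grow: 3 },
    { basis: 50, grow: 0 },
  ],
  450
);

console.log(widths);
```

Everything here happens along one axis, one row at a time. A 2D arrangement, where rows and columns constrain each other, is a different problem, which is why it has its own layout algorithm.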
Any system here must accommodate extremely heterogeneous requirements, so it will inevitably become "bloat". One alternative future you could envision is based on WASM and WebGPU, where each site is essentially an app that pulls in whatever libraries and frameworks it needs to do its work, but that's also pretty far off, since there is not sufficient standardization of the protocols used by WASM UI frameworks.
b_e_n_t_o_n
I wouldn't expect devs to be writing their own shaders all the time - the browser could have standard shaders, and no doubt libraries would crop up that offer more.
gherkinnn
I smell second system syndrome.
Not only that, a new system will get completely coopted by the likes of Google for their own purposes. The result of what is built is in large parts a function of the culture that builds it. And I for one have zero interest in the current tech culture building a DOM 2.
Yes, I am generally wary of rewrites.
skeezyboy
> I hate to say it, but perhaps the browser needs a completely new standard designed for shipping applications?
In all seriousness, isn't this what Java is for? Why would you need to treat a web browser like a virtual machine?
fiedzia
> Why would you need to treat a web browser like a virtual machine?
There are many reasons: performance, the ability to bring in concepts from other domains, the ability to do things the browser has no API for, and the ability to provide a controlled experience and behaviour that goes beyond common browser usage.
superkuh
Glad the title was changed because this article isn't about HTML at all. Instead it seems to be about corporate/for-profit needs for their web applications that happen to touch HTML in some parts. All about throwing away the good parts of HTML to make laying out applications easier and prettier.
To this I say: go away and leave HTML alone if you want to build some application from first principles. The web's first principle is that HTML should have text. Hyper TEXT MARK-UP language.
chrismorgan
> SVG can e.g. do polygonal hit-testing for mouse events, which CSS cannot
Yes it can. clip-path does just that.
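For readers unfamiliar with the term, polygonal hit-testing boils down to asking whether the pointer position falls inside a polygon. Below is a minimal sketch using the classic ray-casting test. This is not the browser's implementation, only an illustration of the idea. The triangle corresponds to a hypothetical clip-path: polygon(0 0, 100px 0, 50px 100px), and the names Point and pointInPolygon are made up.

```typescript
type Point = { x: number; y: number };

// Ray casting: count how many polygon edges a horizontal ray from p crosses.
// An odd number of crossings means the point is inside.
function pointInPolygon(p: Point, polygon: Point[]): boolean {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const a = polygon[i];
    const b = polygon[j];
    // The edge a-b matters only if it straddles the ray's y coordinate
    // and the crossing lies to the right of the point.
    const crosses =
      (a.y > p.y) !== (b.y > p.y) &&
      p.x < ((b.x - a.x) * (p.y - a.y)) / (b.y - a.y) + a.x;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Element-local coordinates of the hypothetical clipped triangle.
const triangle: Point[] = [
  { x: 0, y: 0 },
  { x: 100, y: 0 },
  { x: 50, y: 100 },
];

const hitCenter = pointInPolygon({ x: 50, y: 50 }, triangle);
const hitCorner = pointInPolygon({ x: 5, y: 90 }, triangle);

console.log(hitCenter, hitCorner);
```

In this sketch a click at (50, 50) lands inside the triangle, while one at (5, 90) is inside the element's bounding box but outside the clipped shape, so it would not count as a hit.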
skeezyboy
I remember hearing HTML was dead in 2001.
People often lament how DOM, HTML and CSS are becoming more and more complicated: the difficulty with simple and/or common tasks like vertical centering or virtualization, 600+ CSS properties, so many JavaScript methods, leaky abstractions, { contain: size }. I agree on many issues, but equally I struggle to imagine how it could realistically be not complex.
If it was a result of a single very well thought through vision and developers were expected to be committed to conforming to the latest API (think Apple’s iOS runtime or the like), we could maybe expect the <thread> and <comment> tags, we could demand there to be The One Correct Way of doing anything, that the “fat” is trimmed quickly and features go from deprecated to gone in a year. However, it is a product designed by committee (in fact, by multitudes of various committees) that has largely maintained backwards compatibility for decades, it is a free runtime that grew organically from what was supposed to be a basic hyperlinked document layout engine but now powers fully dynamic applications rivaling their native equivalents yet still has a pretty low barrier to entry for new developers, and as such it’s remarkably robust.
Yes, some applications tend to have a large amount of markup for what seems like simple features (the Slack’s input box example). However, the alternative is that browser vendors bake it all in, and then every app is stuck with the opinionated way they think is right. Perhaps some amount of chaos is healthy.