Better typography with text-wrap pretty
156 comments
April 8, 2025 · taeric
queuebert
Text wrapping is actually a difficult optimization problem. That's part of the reason LaTeX has such good text wrapping -- it can spend serious CPU cycles on the problem because it doesn't do it in real time.
taeric
You aren't wrong; but I stand by my claim. For one, plenty of things are actually difficult optimization problems that people don't give any passing thought to.
But, more importantly, the amount of cycles that would be needed to text-wrap most websites is effectively zero. Most websites are simply not typesetting the volumes of text that would be needed for this to be a concern.
Happy to be shown I'm flat wrong on that. What sites are you envisioning this will take a lot of time for?
pcwalton
> But, more importantly, the amount of cycles that would be needed to text-wrap most websites is effectively zero.
I've measured this, and no, it's not. What you're missing is the complexities of typesetting Unicode and OpenType, where GSUB/GPOS tables, bidi, ruby text, etc. combine to make typesetting quite complex and expensive. HarfBuzz is 290,000 lines of code for a reason. Typesetting Latin-only text in Times New Roman is quick, sure, but that doesn't cut it nowadays.
binaryturtle
With the current state of websites and how many resources they waste, any text wrapping is probably not an issue at all. :)
I can hardly open any website without some anti-bot check burning my CPU to the ground for half a minute or so (if it doesn't crash my Firefox entirely in the process, like Cloudflare's does). I'd rather wait 0.2s for text wrapping than that, that's for sure. :)
cobertos
Any page with dynamic text. If the calculation takes a moderate amount of time, that will accumulate if the page layout reflows a lot.
contact9879
Quick example would be https://standardebooks.org/ebooks/mark-twain/the-innocents-a...
Try zooming in and out with text-wrap: pretty vs text-wrap: wrap
NoMoreNicksLeft
Won't this end up in Apple iBooks or whatever it's called now? Most novels can be a megabyte or more of text, pretty much all of it needing to be wrapped.
porphyra
But with modern hardware, running the dynamic programming solution to this optimization problem takes a trivial amount of cycles* compared to rendering your typical React webapp.
* for most webpages. Of course you can come up with giant ebooks or other lengthy content for which this will be more challenging.
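For the curious, the core of that dynamic-programming approach fits in a few lines. This is only a sketch (not any browser's actual implementation): it minimizes the sum of squared leftover space per line, the same idea Knuth–Plass builds on, and measures words by character count for simplicity where a real engine would use shaped glyph widths.

```python
# Minimum-raggedness line breaking via dynamic programming.
# Words are measured by character count; real engines measure
# shaped glyph widths, but the DP structure is the same.

def wrap_pretty(words, width):
    n = len(words)
    INF = float("inf")
    # best[i] = minimal total badness for words[i:]
    best = [INF] * n + [0.0]
    choice = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        line_len = -1  # offsets the leading space added below
        for j in range(i, n):
            line_len += len(words[j]) + 1
            if line_len > width:
                break
            slack = width - line_len
            # The last line incurs no penalty for being short.
            badness = 0.0 if j == n - 1 else slack ** 2
            if badness + best[j + 1] < best[i]:
                best[i] = badness + best[j + 1]
                choice[i] = j + 1
    # Reconstruct the chosen break points.
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:choice[i]]))
        i = choice[i]
    return lines
```

Unlike greedy wrapping, which would pack "aaa bb" onto the first line and leave a very short second line, the DP accepts a slightly looser first line to even out the whole paragraph. It runs in O(n · width) time, which for a typical web paragraph is indeed a trivial number of cycles.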
ta988
even on 5-page documents LaTeX can spend a surprising amount of time
throw0101d
> That's part of the reason LaTeX has such good text wrapping -- it can spend serious CPU cycles on the problem because it doesn't do it in real time.
Is that the reason the Microsoft Word team tells themselves as well?
We have multi-core, multi-gigahertz CPUs these days: are there no cycles to spare to do this?
queuebert
You would think, but Word has some serious efficiency problems that I can't explain. For one, it is an order of magnitude slower at simply performing word counts than tools like wc or awk. Besides that, the problem does not parallelize well, due to the long-range dependency of line breaks.
Zooming in a bit, Word also does not kern fonts as well as LaTeX, so it might be missing some logic there that would trickle down into more beautiful word spacing and text flow.
toomim
It's an O(n^2) to O(n!) problem, not O(n), so it doesn't scale linearly with CPU cores.
jcelerier
To be honest, as a LaTeX user on a very beefy CPU, I regularly see 30s+ build times for larger documents. I doubt Word users would want that. Even a simple 3-page letter without any graphics already takes a couple of seconds.
int_19h
Lest we forget, TeX is almost 50 years old now, so what constitutes "serious CPU cycles" has to be understood in the context of hardware available at the time.
setopt
TeX is still slow to compile documents on my current device (MacBook M1), especially when compared to e.g. Typst. I can only imagine how slow it would have been on a 40-year-old computer.
jgalt212
Computerphile did a nice video on this.
https://www.youtube.com/watch?v=kzdugwr4Fgk
The Kindle Text Problem - Computerphile
jkmcf
That's true, but I think the OP is commenting on the state of FE development :)
taeric
Largely, yes. I also challenge if it would be measurable for the size of most sites.
Typesetting all of wikipedia? Probably measurable. Typesetting a single article of wikipedia? Probably not. And I'd wager most sites would be even easier than wikipedia.
frereubu
You've phrased your comment as if it's a counterpoint to OP, but it's not - both can be true (and from personal experience OP is absolutely right).
watersb
The best study on the optimization of line breaking algorithms is now only on the Internet Archive. Lots of examples.
"Line Breaking", xxyxyz.org
https://web.archive.org/web/20171021044009/http://xxyxyz.org...
_moof
Same. I read that and think, oh NOW you all are worried about performance?
zigzag312
I once did a naive text wrapping implementation for a game and with a longer text it caused performance to drop way below 60 FPS.
This was on a 4.5 GHz quad-core CPU. Single-threaded performance of today's top CPUs is only 2-3x faster, but many gamers now have 144Hz+ displays.
pcwalton
Remember the days of Zuck saying "going with HTML5 instead of native was our biggest mistake"? Though hardware improvements have done a lot to reduce the perceptible performance gap between native and the Web, browser developers haven't forgotten those days, and layout is often high in the profile.
jcelerier
I have to consider the performance of rendering text literally all the time, even without wrapping. This is one of the most gluttonous operations when rendering a UI if you want, say, 60 fps on a Raspberry Pi Zero.
dominicrose
Basically everything that comes built into a browser has to perform well in most use cases and on most devices. We don't want an extra 5% quality at the cost of degraded performance.
0cf8612b2e1e
Open any random site without an ad blocker and it is clear that nobody cares about well optimized sites.
Telemakhos
Very likely the site is well optimized. It's optimized for search engines, which is why we found the site in the first place, which is in turn the reason I said "very likely" in the first sentence: we come upon web sites not truly randomly but because someone optimized them for search ranking. It also appears from your "without an ad blocker" that the site may be optimized for ad revenue, monetizing our visit as much as possible. There's probably optimization of tracking you in order to support those ads, too.
What you're complaining about is that the site is not optimized for your reading enjoyment. The site is probably quite well optimized, but your reading enjoyment was not one of the optimizer's priorities. I think we agree about how annoying that is and how prevalent, so the news that new typographical features are coming seems to me like good news for those of us who would appreciate more sites that prioritize us the readers over other possible optimization strategies.
taeric
I want to believe you. I just can't bring myself to agree anymore. Most sites are flat out not optimized at all. Worse, many of them have instrumentation bolted on to interface with several different analytics tools.
And to be clear, most sites flat out don't need to be optimized. Laying out the content of a single site's page is not something that needs a ton of effort put into it. At least, not a ton in comparison to the power of most machines, nowadays.
This is why, if I open up GMail in the inspector tab, I see upwards of 500+ requests in less than 10 seconds. All to load my inbox, whose content is almost certainly smaller than the 5 megs that were transferred. And I'd assume GMail has to be one of the more optimized sites out there.
Now, to your point, I do think a lot of the discussion around web technologies is akin to low level assembly discussions. The markup and script layout of most sites is optimized for development of the site and the creation of the content far more than it is for display. That we have moved to "webpack" tricks to optimize rendering speaks to that.
lcnPylGDnU4H9OF
The developer in such a case is only allowed to care as much as the PM.
tiltowait
I’m pretty excited for this to be added to ereaders, which notoriously (among people who care about this kind of thing) have terrible layout engines.
velcrovan
Better ways of laying out digital text have existed since before ereaders existed. Even this one CSS directive has already been supported by Chrome for two years. What's missing is Amazon & co. giving a shit about it. That needle shows no signs of moving.
MBCook
> Even this one CSS directive has already been supported by Chrome for two years
The article says what chrome does is only support the “no super short lines” bit.
So while you won’t end up with one word on its own line at the end of a paragraph, it’s not trying to prevent rivers of text or keep a relatively clean ragged right or anything else.
That’s allowed by spec, but it’s not quite the same thing.
Cthulhu_
I was about to ask about that: how are / were traditional paper books laid out to prevent this? Surely not by hand. Proprietary software maybe?
Sharlin
Well, before desktop publishing, by phototypesetting [1], before that by hot-metal typesetting [2], and before that, by hand. Nowadays, with software like Adobe InDesign, or if you happen to be a CS/math/physics nerd, with LaTeX, which has a famously high-quality layout engine that utilizes the Knuth–Plass line-breaking algorithm [3]. Indeed, it's fairly well known that Donald Knuth created TeX because he wasn't happy with the phototypeset proofs he received in 1977 for the second edition of The Art of Computer Programming, finding them inferior to the hot-metal-typeset first edition.
[1] https://en.wikipedia.org/wiki/Phototypesetting
[2] https://en.wikipedia.org/wiki/Hot_metal_typesetting
[3] https://en.wikipedia.org/wiki/Knuth%E2%80%93Plass_line-break...
Telemakhos
The old linotype machines had visual indicators of minimum and maximum line width, and the operator would make a judgement call with each line. Spacers would then automatically justify the letters. It was all mechanical and amazing.
See http://widespacer.blogspot.com/2014/01/two-spaces-old-typist... for many details.
aardvark179
Books were produced before computers, and with very good typesetting. One difference between websites and books is that there is a feedback loop with books, where somebody is looking at the layout and either adjusting the spacing subtly or even editing the text to avoid problems. Sometimes this is just to ensure that left-justified text isn't too ragged on the right edge, sometimes it's to avoid a river of space running through a paragraph, and sometimes it's editing or typesetting to avoid orphans.
But text on a page is set for a set layout, and that’s where the web really differs.
Finnucane
In ye olde dayes, indeed by hand. That's why there was often extra space after punctuation. In more mechanized times, the operator has to watch for it. Proofreaders are trained to watch for loose lines, rivers, widows, hyphenation errors, and other spacing problems. Those things will be marked as errors in proof. Even with modern DTP tools, typesetters still have to make a lot of manual corrections. Of course, for print, you're setting for a fixed format. You can do a lot of fine-tuning that a browser can't do on the fly.
omnimus
Nowadays basically any professionally produced book is made in InDesign, and text wrapping is semi-automated: it's automated but checked for issues and fixed manually. InDesign has two text-wrapping algorithms: the paragraph composer, which tries to balance whole paragraphs, and the line composer, which only checks line by line.
Surprisingly, at the high end the less automated line composer is used a lot more. It requires more work, but human decisions lead to the best results if done properly.
TiredOfLife
Currently on Android I use Moon+ Reader, which has hyphenation and hanging punctuation. Before that (2008-2013) I used an eInk reader that came with CoolReader (its layout engine, "crengine", is the base for KOReader), which also had good hyphenation, hanging punctuation, and nice footnotes.
So in my experience ereaders have had great layout engines.
Finnucane
Given the way ebook software is developed, it'll be years before this makes it to a device near you.
archagon
I dunno, I imagine Apple Books would be eager to implement this as soon as possible.
MBCook
Note that it’s up to the browser to do whatever it thinks is best. They didn’t lay down specific rules.
So unless the e-reader uses an engine that already has good rules there will be no real change unless the manufacturer does what it should have already.
taeric
Is this where folks will get it for ereaders? Naively, I hadn't realized ereaders were glorified webkit displays.
spookie
Sometimes I like to think Knuth may have intrusive violent thoughts about how shit programmers make text look.
numbers
yeah, I agree. Sometimes reading an ebook feels very off because the lines just look way too justified.
Sloppy
Far too little effort and attention has been devoted to creating beautiful text online. The web set text back centuries; in some ways it was never this bad, except for monospaced typewriters. This is welcome indeed.
accrual
This made me think of one person who cares about it: Matthew Butterick. His Practical Typography appears to have spent quite a bit of effort bringing typeset-like text to the web.
ashton314
MB over-engineered that book. Example: look at the paragraph that starts "But I don't have visual skills" on this page: https://practicaltypography.com/why-does-typography-matter.h...
Notice how the open quotation marks hang into the left margin. There's been some recent work with CSS to make this automatic, but that's newer than this book and support is spotty. MB made it happen with (IIRC) a custom filter inside the Pollen setup he built for this book. Wild. And beautiful.
flobosg
See also the added soft hyphens within each word for hyphenation: https://practicaltypography.com/optional-hyphens.html#:~:tex...
crazygringo
This is fantastic. I'm not surprised they focus on short last lines and on rag, since it's easy to imagine defining metrics for them to then minimize.
But they say:
> We are not yet making adjustments to prevent rivers, although we’d love to in the future.
And indeed, I couldn't even begin to guess how to define a metric for rivers, that can occur at different angles, with different variation, being interrupted by various degrees... I'm curious if there's a clever metric anybody has invented that actually works? Or does it basically require some level of neural-network pattern recognition that is way too expensive to calculate 1,000 variations of for each paragraph?
ameliaquining
There's a TeX package that, among many other features, detects rivers: https://mirrors.ibiblio.org/pub/mirrors/CTAN/macros/latex/co...
The intent here is that the document author is informed that their text contains rivers, and responds by tweaking the wording until they land on something that doesn't have them.
Of course, for a browser engine this is a complete nonstarter; a useful feature for dealing with rivers would require not just detecting them but automatically removing them, without changing the text content. I'm not aware of any existing software that does this, but I've found one proposed list of exploratory directions that could make a decent starting point for anyone who wanted to build this: https://tex.stackexchange.com/a/736578
taeric
I think the main difficulty is that it is a paragraph level optimization and not a line one. Right? Otherwise, it seems like you can probably get pretty far by defining a metric that looks at connected whitespace sections between lines? With higher penalty for connected space between words that has been stretched. (That is, if you have space between some words expanded to make them pretty at the edge, those will be more visible as rivers if they are stacked?)
And, yes, there are some concerns that are handled at the line level that could lead to a paragraph getting reworked. Ending with a single word is an easy example. That is still something you can evaluate at the line level easily.
crazygringo
I think the difficulties are, how close do spaces need to be to be considered connected? Rivers aren't only perfectly vertical. And to what degree do they need to maintain the same angle across consecutive lines? How much can they wiggle? And a river is still visible across 10 lines even if one line in the middle doesn't have the space, so it needs to be able to handle breaks in contiguity.
There's no problem with paragraph-level optimizations inherently. Reducing raggedness is paragraph-level and that's comparatively easy. The problem is the metric in the first place.
taeric
I wouldn't try and consider spaces individually, I don't think? Rather, I'd consider the amount of space being considered. We aren't talking about fixed width typesetting, after all. To that end, you will have more space after punctuation and such. Rather than try to enumerate the different options, though, you almost certainly have some model of how much "space" is in a section. Try different model weights for how much to penalize different amounts of connected space and see how well different models optimize.
Or, maybe not? I'll note that the vast majority of "rivers" I've seen in texts coincide with punctuation quite heavily. Even the example in this article has 5/8 lines using a comma to show the river. With the other lines having what seems to be obvious stretched space between words to use more of the line? Maybe enumerating the different reasons for space would be enough?
Granted, this also calls out how dependent you almost certainly are on the font being used, as well?
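One toy way to operationalize that "stacked gaps" idea (purely a sketch of my own, not from any real layout engine): treat each inter-word gap as an x-interval, and penalize gaps on consecutive lines whose intervals overlap. For simplicity this assumes monospaced text, so a gap's position is just its character index; real layout would use glyph metrics from shaping.

```python
# Toy river metric: penalize inter-word gaps that vertically
# align across consecutive lines. Gaps are (start_x, end_x)
# intervals in a shared unit (character cells here; pixels in
# a real engine).

def gap_intervals(line, char_width=1.0):
    """Return the x-intervals of the spaces in a line of
    monospaced text."""
    gaps, x = [], 0.0
    for ch in line:
        if ch == " ":
            gaps.append((x, x + char_width))
        x += char_width
    return gaps

def river_score(lines):
    """Sum, over consecutive line pairs, the overlap width of
    vertically aligned gaps. Higher means more river-like."""
    score = 0.0
    for above, below in zip(lines, lines[1:]):
        for a0, a1 in gap_intervals(above):
            for b0, b1 in gap_intervals(below):
                overlap = min(a1, b1) - max(a0, b0)
                if overlap > 0:
                    score += overlap
    return score
```

This only catches perfectly stacked gaps between adjacent lines; handling slanted rivers or a river that skips a line would need a fuzzier overlap test and a window wider than two lines, which is exactly where defining the metric gets hard.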
6510
I have no idea, but it looks something like this
https://www.loc.gov/resource/gdcwdl.wdl_03042/?sp=5&r=-0.122...
fngjdflmdflg
>The purpose of pretty, as designed by the CSS Working Group, is for each browser to do what it can to improve how text wraps. [...] There is no mandate for every browser to make the same choices. In fact, a browser team might decide in 2025 to handle some aspects of improving these qualities, and then change what their implementation does in the future. [...] Because of the way Chrome’s implementation of pretty has been taught, a lot of web developers expect this value is only supposed to prevent short last lines. But that was never the intention.
Why did they even design it like this in the first place? This seems like it is counter to much of what browsers have been doing recently like being able to customize select, the browser interop and baseline projects, web platform test etc. I would rather move away from these types of features in favor of more explicit ones. I understand that this isn't a serious issue and is unlikely to cause bugs compared to other interop issues which are true deviations from the spec. It just seems counterintuitive to do this though.
giraffe_lady
They point to the reason in the intro but don't make it explicit: it's because this is at the intersection of computing with a much older tradition, typesetting.
There's no "correct" way to typeset a document, there wouldn't even be a consensus among typesetters on what the implementation specifics look like. Rather than turn the spec committee into a decades-long ecumenical council of typographers they just left the specifics up to each individual "shop" as it always has been. Except now instead of printers it's the browser vendors needing to make the final call.
fngjdflmdflg
>There's no "correct" way to typeset a document
They can add multiple typesetting properties and allow the developer to decide which one to use. Besides, letting each browser decide what the "best" line break looks like doesn't solve the problem of there not being a definitive answer to that question. Even here, I don't think the Chrome developers have a vastly different opinion on what a good line break looks like. It's possible they didn't like the performance implications of WebKit's version or had some other tangential reason, although the blog says performance is not an issue.
giraffe_lady
ok. you should tell them.
mrandish
> Rather than turn the spec committee into a decades-long ecumenical council of typographers...
Having worked with passionate (aka opinionated) typographers, that phrasing earned a well-deserved chuckle. Leaving implementation choices up to each browser was certainly the only way to get it into CSS. Hopefully the various implementations will evolve over time and coalesce into a fairly consistent baseline.
moralestapia
While this is valuable work, leaving it implementation-dependent is a terrible mistake.
The whole point of CSS is/was to standardize presentation across browsers.
alwillis
The whole point of CSS is/was to standardize presentation across browsers.
CSS was created to standardize how to deal with presentation, but that doesn't mean every website should look exactly the same on every device or in every browser. The era of attempting to do that is over.
text-wrap: pretty is a great example of progressive enhancement [1]: it's a great way to add some polish to a website but if the user's device doesn't support it, they can still access all of the content on the site and be none the wiser.
If you read the CSS specifications, browser makers are in some cases allowed to use platform-specific heuristics to determine whether or not to execute certain features. Downloading web fonts works like this: browsers fall back to system fonts if a webfont doesn't download within 3 seconds.
It makes sense that text-wrap: pretty should be one of those. If your smartphone is low on power and the signal isn't that great, you can forgo expertly wrapped text and elegant hyphenation in order to view the webpage as quickly as possible.
fngjdflmdflg
>CSS was created to standardize how to deal with presentation, but that doesn't mean every website should look exactly the same on every device or in every browser. The era of attempting to do that is over.
For every device I agree, but that was never the goal of CSS. It is meant to respond to the device's constraints, such as screen dimensions and device type (desktop, mobile, print), using e.g. media queries. In every browser I do think they should try to accomplish the same thing. Even if the exact algorithms used are different, the intended result should be agreed upon.
moralestapia
>but that doesn't mean every website should look exactly the same on every device or in every browser
That was the point. Maybe Gen Z changed its meaning now, but that was the main premise. There was even the Acid3 test and similar stuff.
fngjdflmdflg
It seems that the original spec had a more explicit intention:
>The pretty value is intended for body text, where the last line is expected to be a bit shorter than the average line.[0]
which seems to mainly be about avoiding short last lines. That is from a note. The actual value "specifies the UA should bias for better layout over speed, and is expected to consider multiple lines, when making break decisions," which is more broad. But the intent is clearly specified in the note. This is also how chrome described the feature as mentioned in the article. But it does say that the effect would change in the future:
>The feature does a little more than just ensure paragraphs don't end with a single word, it also adjusts hyphenation if consecutive hyphenated lines appear at the end of a paragraph or adjusts previous lines to make room. It will also appropriately adjust for text justification. text-wrap: pretty is for generally better line wrapping and text breaking, currently focused on orphans. In the future, text-wrap: pretty may offer more improvements.[1]
The design doc linked in [1] says this about it:
>The `text-wrap: pretty` is the property to minimize typographic orphans without such side effects.
>There are other possible advantages for paragraph-level line breaking, such as minimizing rivers. The csswg/#672 describes such other possible advantages. But the initial implementation focuses on typographic orphans, as it’s the most visible benefit, and to minimize the performance impacts.
>Because paragraph-level algorithms are slow, there are multiple variants to mitigate the performance impacts.[2]
The new draft[3] changed it to the current definition. What's also interesting from that new draft is this new note:
>The necessary computations may be expensive, especially when applied to large amounts of text. Authors are encouraged to assess the impact on performance when using text-wrap-style: pretty, and possibly use it selectively where it matters most.
which seems to go against what was written in the webkit blog. If developers start using this value everywhere expecting that it will be fast then that effectively stops future implementations from using a slower but better algorithm (assuming one exists).
[0] https://www.w3.org/TR/css-text-4/#propdef-text-wrap-style
[1] https://developer.chrome.com/blog/css-text-wrap-pretty
[2] https://docs.google.com/document/d/1jJFD8nAUuiUX6ArFZQqQo8yT...
mac3n
> The demo has content in English
strange English.
> It's far text
> this text has short a lot of words all in a row
not relevant to the subject, unless you want to consider improving line breaks by rearranging words
throw0101d
(La)Tex still seems to have the 'best' results for line breaking:
* https://en.wikipedia.org/wiki/Knuth–Plass_line-breaking_algo...
robszumski
Really excited for text-wrap: balance. This will prevent a ton of breakpointing or manual line breaks for web headers.
qingcharles
balance has been working for a couple of years on everything except Safari (about a year on that I think). I've been using it on my headlines for that long.
IshKebab
Ironically the letter height for the monospace text on this website is all over the place for me. I'm using Chrome on Windows so you'd think it would work fine... Seems to be an issue with SF Mono.
janalsncm
I’d love to learn more about the pretty algorithm and the optimizations that have been tried out so far.
It seems like a pretty straightforward machine learning regression problem: given a sequence of word widths, find a set of line breaks which, when applied to the sequence, satisfies the constraint of being "pretty".
Using a model would allow computing the answer in constant time.
deredede
Machine learning is not magic -- no algorithm, machine learning or otherwise, will be able to treat an arbitrary-length sequence in constant time.
The actual problem is also more complex than fixed word widths due to hyphenation and justification - from what I recall, Knuth's paper (IIRC there's two and the second one is the one to read) on TeX's layout gives a good overview of the problem and an algorithm that's still state of the art. I think the Typst people might have a blog post about their modern take on it, but I'm not sure.
accrual
Could it be linear time? I thought larger inputs take longer for models to process. I suppose it depends on the length of each line and the number of lines.
ezfe
It can't be linear time. At best it would be log(n) but that would require storing all the possible inputs in a lookup table.
OisinMoran
This is excellent, thank you! I hadn't heard of "balance" before either, so definitely going to experiment with that now too. Anything that can improve typography on the web, even a little bit is a big win in my opinion. I'm also stealing that 1lh tip they link to!
If you like this and are interested in something closer to TeX, why not the TeX algorithm itself!? There's this lovely package that works wonderfully on the web https://github.com/robertknight/tex-linebreak?tab=readme-ov-... And if you want to play around with a live example to check the performance, I use it on my site's about page: https://lynkmi.com/about
Been on a big LaTeX buzz recently and even added support for it in comments there too!
rambambram
After a quick glance this looks pretty useful for me. I'm the kind of guy who is willing to change words or sentences (up to a point) to make the overall text look prettier.
wruza
I also do this for monospace text, but with proportional fonts there's perfect "fill" justification, which for some reason gets ignored.
Here's how you do it in advanced layout systems like CSS: https://stackoverflow.com/questions/6205109/justify-text-to-...
I find myself laughing at "Many developers are understandably concerned about the performance of text-wrap: pretty." I just can't bring myself to believe there is a meaningfully sized group of developers that have considered the performance of text-wrapping.