There isn't much point to HTTP/2 past the load balancer
269 comments · February 25, 2025 · hiAndrewQuinn
hinkley
And if memory serves, if you care about minimizing latency you want all of your workers running at an average of 60% occupancy. (Which is also pretty close to where I saw P95 times dog-leg on the last cluster I worked on.)
Queuing theory is really weird.
atombender
Most analyses I've read say the threshold is around the 80% mark [1], although it depends on how you model the distribution, and there's nothing magical about the number. The main thing is to avoid getting close to 100%, because wait times grow without bound as you approach the max.
Little's Law is fundamental to queueing theory, but there's also the less well-known Kingman's formula, which incorporates variability of arrival rate and task size [2].
[1] https://www.johndcook.com/blog/2009/01/30/server-utilization...
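For a rough feel of the shape of the curve, here's a back-of-the-envelope sketch (M/M/1 plus Kingman's approximation; the function names and numbers are just illustrative):

    import math

    def mm1_time_in_system(rho, service_time=1.0):
        # M/M/1: mean time in system = service_time / (1 - rho)
        return service_time / (1.0 - rho)

    def kingman_queue_wait(rho, service_time=1.0, ca2=1.0, cs2=1.0):
        # Kingman's (VUT) approximation for mean queueing delay in a G/G/1 queue:
        # Wq ~= (rho / (1 - rho)) * ((ca^2 + cs^2) / 2) * service_time,
        # where ca^2 and cs^2 are the squared coefficients of variation of
        # inter-arrival times and service times.
        return (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) * service_time

    for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
        print(f"util={rho:.2f}  time_in_system={mm1_time_in_system(rho):5.1f}  "
              f"queue_wait={kingman_queue_wait(rho):5.1f}")
    # util=0.60 -> 2.5x the bare service time, 0.80 -> 5x, 0.90 -> 10x, 0.95 -> 20x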
dude187
Really, both of those models show 60% as about the limit where you're still effectively at the baseline for latency. 80% is about the limit where you're already into the exponential rise; any higher and things become unusable.
0-60 and you're still at minimum latency. 60-80 you're at twice the latency, but that's probably worth the cost savings of the extra compute density since it's still pretty low. Higher than 80 and things are already slowing down and getting worse with every additional request.
hinkley
If you look at the chart in the second link, where does the wait time leave the origin? Around 60%.
The first one is even worse; by 80% you're already seeing twice the delay of 70%.
If I were to describe the second chart I'd say 80% is when you start to get into trouble, not just noticing a slowdown.
I said minimize latency, not optimize latency.
winrid
Somehow I was building distributed systems earlier in my career before I learned about queuing theory and learned this the hard way.
Nowadays with DB stuff I tend to get assigned new infra leads who see a DB cluster at 50% CPU utilization and think they can go down two instance sizes without severely impacting latency.
hinkley
For me it was seeing a machine at 70% utilization and thinking they could squeeze another small service onto the box. The first time it didn’t sound right, and after that I knew it was a bad idea. By the third time I was willing to throw a veto if they wouldn’t listen to reason.
And the thing is, even if you stuff a low priority service onto a bunch of boxes, and convince the OS to honor that priority fairly, the fact that the service runs reasonably at all gets baked in as an expectation. Maybe it’s the hedonic treadmill: one kid expects dessert with every meal because they’ve always gotten it, and another knows it’s a special occasion. But anything given is jealously guarded when you have to take it away. Even a “best effort” batch process that is supposed to finish on some interval is missed when it no longer does. And somehow it’s always your fault.
I’m sure the grocery store employees who are assigned as backup tellers constantly get grief for not getting their other tasks done “on time”.
lurking_swe
Any resources you’d recommend to learn more about this?
for context: 10 yrs experience as a software engineer and only a couple on high traffic products. Some things (like this) continue to surprise me.
hiAndrewQuinn
Peep https://yzr95924.github.io/pdf/book/Basic-Queueing-Theory.pd... , honestly just being aware of it is already half the battle.
tome
Can't it cut wait times by infinity? For example, if arrivals are at 1.1 per minute and a teller processes 1 per minute, one teller's queue grows without bound, while two tellers keep it finite.
SJC_Hacker
Could be that. It could also be that with a second teller, the people taking a long time at least aren't causing a bottleneck anymore (assuming there aren't two of them at the same time). So you have a situation like this: the first person takes 10 minutes, while there are 9 waiting in line who take only one minute apiece. With one teller, the average wait time is ~15 minutes. With two tellers, it's now ~5 minutes.
Which is why it is highly annoying when there's only one worker at the coffee stand, and there's always this one jerk at the front of the queue who orders a latte when you just want a coffee. With two workers, the people who just want coffee won't have to wait 15 minutes for the latte people.
And I've also noticed a social effect: when people wait a long time it seems to change how they perceive the eventual service, that is, they want more out of the interaction, so they take longer. Which makes the situation even worse.
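A quick way to sanity-check those numbers is a toy calculation: one 10-minute customer plus nine 1-minute customers, all already in line (the function name is mine, this is just a sketch):

    import heapq

    def avg_time_in_system(service_times, tellers):
        # Everyone is already in line at t=0; return the average time until done.
        free_at = [0.0] * tellers            # when each teller next becomes free
        heapq.heapify(free_at)
        finish = []
        for s in service_times:
            start = heapq.heappop(free_at)   # grab the earliest-available teller
            done = start + s
            finish.append(done)
            heapq.heappush(free_at, done)
        return sum(finish) / len(finish)

    jobs = [10] + [1] * 9                    # the slow customer is first in line
    print(avg_time_in_system(jobs, tellers=1))   # ~14.5 minutes
    print(avg_time_in_system(jobs, tellers=2))   # ~5.5 minutes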
lostlogin
> there's always this one jerk at the front of the queue
Here in the espresso world, that’s not so bad. But the ‘vanilla oat milk decaf, and also a hot muffin with butter’ is tedious.
There is a roaster in Auckland that’s been there since the ‘80s. On the counter it says ‘espresso, flat white or fuck off’. Clear and concise. I like it. https://millerscoffee.co.nz/
dylan604
Luckily, we don't get stuck behind someone using a check any more.
hiAndrewQuinn
The Poisson process is a random distribution. Just because its average is 1.1 per minute doesn't mean you can't have 2 people show up at virtually the same time, it's just pretty rare. Or there's like a 10^-100 chance a million people show up at the same second, etc etc. You'll always have some nonzero average queuing time if you calculate it across the whole distribution for that reason.
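To put a rough number on it: with λ = 1.1 per minute, the chance of two or more arrivals landing in the same one-second window is small but nonzero (illustrative calculation only):

    import math

    lam = 1.1 / 60                  # expected arrivals per second at 1.1/minute

    # P(k arrivals in one second) = e^-lam * lam^k / k!
    p0 = math.exp(-lam)
    p1 = math.exp(-lam) * lam
    print(1 - p0 - p1)              # ~1.7e-4: rare, but far from impossible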
hinkley
You've forgotten about banker's hours :)
dylan604
There was a discussion not too long ago about modern banks still having archaic practices. I have accounts at two different banks, and if I make a transfer request before 1:45 PT, it is counted as same day. It makes no damn sense to me why that's a limitation at all today. It's not like a human needs to look at it, but even so, why the 1:45 PT cutoff? Is it because it is 4:45 ET? Then why not list it as that? And why does a banking computer system care about timezones or bankers' hours at all? It's all just mind-bogglingly lame.
amluto
[flagged]
dgfitz
I find a lot of value in being able to get a water or a coffee, use the restroom, have sidebar conversations with fellow employees, begrudgingly attend meetings, or take a walk to stretch my legs for a minute and think, personally.
timewizard
[flagged]
arjie
Almost every web forum enters a phase where participants bring in their pet politics into unrelated discussions. Whether they last or not depends entirely on whether the flamebait/troll creates a large reply structure or a non-existent one. This is why shadowbans are more effective than large groups of people responding angrily. Or, to cite the Deep Magic: "don't feed the trolls".
efdee
[flagged]
vasilvv
The article seems to make an assumption that the application backend is in the same datacenter as the load balancer, which is not necessarily true: people often put their load balancers at the network edge (which helps reduce latency when the response is cached), or just outsource those to a CDN vendor.
> In addition to the low roundtrip time, the connections between your load balancer and application server likely have a very long lifetime, hence don’t suffer from TCP slow start as much, and that’s assuming your operating system hasn’t been tuned to disable slow start entirely, which is very common on servers.
A single HTTP/1.1 connection can only process one request at a time (unless you attempt HTTP pipelining), so if you have N persistent TCP connections to the backend, you can only handle N concurrent requests. Since all of those connections are long-lived and are sending at the same time, if you make N very large, you will eventually run into TCP congestion control convergence issues.
Also, I don't understand why the author believes HTTP/2 is less debuggable than HTTP/1; curl and Wireshark work equally well with both.
dgoldstein0
I think the more common architecture is for the edge network to terminate SSL, and then forward to the load balancer which is actually in the final data center? In which case you can use HTTP/2 or 3 on both of those hops without requiring it on the application server.
That said, I still disagree with the article's conclusion: more connections mean more memory, so even within the same DC there should be benefits to HTTP/2. And if the app server supports async processing, there's value in hitting it with concurrent requests to make the most of its hardware, and HTTP/1.1 head-of-line blocking really destroys a lot of possible perf gains when the response time is variable.
I suppose I haven't had a true bake off here though - so it's possible the effect of http2 in the data center is a bit more marginal than I'm imagining.
hansvm
HTTP2 isn't free though. You don't have as many connections, but you do have to track each stream of data, making RAM a wash if TLS is non-existent or terminated outside your application. Moreover, on top of the branches the kernel is doing to route traffic to the right connections, you need an extra layer of branching in your application code and have to apply it per-frame since request fragments can be interleaved.
littlecranky67
TCP slow start is not an issue for load balancers, as operating systems cache the congestion window (cwnd) on a per-host basis, even after termination of all connections to that host. That is, the next time a connection to the same backend host is created, the OS uses a higher initial congestion window (initcwnd) during slow start based on the previously cached value. It does not matter if the target backend host is in the same datacenter or not.
jchw
Personally, I'd like to see more HTTP/2 support. I think HTTP/2's duplex streams would be useful, just like SSE. In theory, WebSockets do cover the same ground, and there's also a way to use WebSockets over HTTP/2 although I'm not 100% sure how that works. HTTP/2 though, elegantly handles all of it, and although it's a bit complicated compared to HTTP/1.1, it's actually simpler than WebSockets, at least in some ways, and follows the usual conventions for CORS/etc.
The problem? Well, browsers don't have a JS API for bidirectional HTTP/2 streaming, and many don't see the point, like this article expresses. NGINX doesn't support end-to-end HTTP/2. Feels like a bit of a shame, as the streaming aspect of HTTP/2 is a more natural evolution of the HTTP/1 request/response cycle versus things like WebSockets and WebRTC data channels. Oh well.
Matthias247
Duplex streams are not really an HTTP/2-only feature. You can do the same bidirectional streaming with HTTP/1.1 too. The flow is always:
1. The client sends a header set.
2. It can then start to stream data in the form of an unlimited-length byte-stream to the server.
3. The server starts to send a header set back to the client.
4. The server can then start to stream data in the form of an unlimited-length byte-stream to the client.
There is not even a fixed order between 2) and 3). The server can start sending headers or body data before the client sent any body byte.
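A minimal illustration of that flow over a raw socket (the host, port, and path are made up, and whether the peer actually answers before the body is finished is entirely up to the server):

    import socket

    sock = socket.create_connection(("backend.internal", 8080))  # made-up host
    sock.sendall(
        b"POST /stream HTTP/1.1\r\n"
        b"Host: backend.internal\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"\r\n"
    )
    sock.sendall(b"5\r\nhello\r\n")   # first chunk of the request body
    # Nothing in HTTP/1.1 forbids the server from sending its status line and
    # response bytes right now, before the terminating chunk has been sent...
    print(sock.recv(4096))
    sock.sendall(b"0\r\n\r\n")        # ...which the client only sends later
    sock.close()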
What is correct is that a lot of servers and clients (including javascript in browsers) don't support this and make stricter assumptions regarding how HTTP requests are used - e.g. that the request bytes are fully sent before the response happens. I think ReadableStream/WritableStream APIs on browsers were supposed to change that, but I haven't followed the progress in the last few years.
NGINX falls into the same category. Its HTTP/2 support (and gRPC support) was built with a very limited use case in mind. That's also why various CDNs and service meshes use different kinds of HTTP proxies, so that various streaming workloads don't break in case the way the protocol is used is not strictly request->response.
jchw
No browser I'm aware of is planning on allowing the request and response bodies to be streamed simultaneously for the same request using ReadableStream and WritableStream. When using streaming request bodies, you have to explicitly set the request to half-duplex.
Anyways, yes, this is technically true, but the streaming semantics are not really that well-defined for HTTP/1.1, probably because it was simply never envisioned. The HTTP/1.1 request and response were viewed as unary entities and the fact that their contents were streamed was mostly an implementation detail. Most HTTP/1.1 software, not just browsers, ultimately treats the requests and responses of HTTP as different and distinct phases. For most uses of HTTP, this makes sense, e.g. for a form post, the entire request entity is going to need to be read before the status can possibly be known.
Even if we do allow bidirectional full-duplex streaming over HTTP/1.1, it will block an entire TCP connection for a given hostname, since HTTP/1.1 is not multiplexed. This is true even if the connection isn't particularly busy. Obviously, this is still an issue even with long-polling, but that's all the more reason why HTTP/2 is simply nicer.
NGINX may always be stuck in an old school HTTP/1 mindset, but modern software like Envoy shows a lot of promise for how architecting around HTTP/2 can work and bring advantages while remaining fully backwards compatible with HTTP/1 software.
nitely
> I think ReadableStream/WritableStream APIs on browsers were supposed to change that, but I haven't followed the progress in the last few years.
There has been a lot of pushback against supporting full-duplex streams[0].
rixed
> a lot of servers and clients (including javascript in browsers) don't support this
To say nothing about the many http proxies in between.
KaiserPro
HTTP2 works great on the LAN, or if you have really good network.
It starts to really perform badly when you have dropped packets. So any kind of medium quality wifi or 4/5g kneecaps performance.
It was always going to do this, and as webpages get bigger, the performance degradation increases.
HTTP2 fundamentally underperforms in the real world, and noticeably so on mobile. (My company enthusiastically rolled out http2 support when akamai enabled it.)
Personally I feel that websockets are a hack, and frankly HTTP/3 should have been split into three: a file access protocol, an arbitrary TCP-like pipe, and a metadata channel. But web people love hammering workarounds onto workarounds, so we are left with HTTP/3.
jchw
HTTP/2, in my experience, still works fine on decent connections, but the advantages definitely start to level out as the connection gets worse. HTTP/2 definitely has some inherent disadvantages over HTTP/1 in those regards. (Though it depends on how much you are constrained by bandwidth vs latency, to be sure.)
However, HTTP/3 solves that problem and performs very well on both poor quality and good quality networks.
Typically, I use HTTP/2 to refer to both HTTP/2 and HTTP/3 since they are basically the same protocol with different transports. Most people don't really need to care about the distinction, although I guess since it doesn't use TCP there are cases where someone may not be able to establish an HTTP/3 connection to a server. Still, I think the forward looking way to go is to try to push towards HTTP/3, then fall back to HTTP/2, and still support HTTP/1.1 indefinitely for simple and legacy clients. Some clients may get less than ideal performance, but you get the other benefits of HTTP/2 on as many devices as possible.
ninkendo
> HTTP 3 should have been split into three: a file access protocol, a arbitrary TCP like pipe and a metadata channel
HTTP3 is basically just HTTP2 on top of QUIC… so you already have the tcp-like pipe, it’s called QUIC. And there’s no reason to have a metadata channel when there are already arbitrary separate channels in QUIC itself.
KaiserPro
You're right of course, it has virtual channels. I just think it would have been good to break with the old HTTP semantics and change to something reflecting modern usage.
commandlinefan
It seems like the author is agreeing that HTTP/2 is great (or at least good) for browser -> web server communication, but not useful for the REST-style APIs that pervade modern app design. He makes a good case, but HTTP was never really a good choice for API transport _either_, it just took hold because it was ubiquitous.
yawaramin
What's the difference? Aren't they both request-response protocols?
commandlinefan
It's not terrible, it just has a lot of fluff that you don't really need or want in an API call like chunked transfer encoding, request pipelining, redirect responses, mime encoding, content type negotiation, etc. Of course, you can just ignore all that stuff and either a) implement a stripped down, incomplete HTTP-ish protocol that has just the parts you need or b) use a full-blown HTTP implementation like nginx like most people do. The problem with (b) is when nginx suddenly starts behaving like an actual web server and you have to troubleshoot why it's now barfing on some UTF-16 encoding.
withinboredom
There are many other transport protocols for APIs, http basically took the lead in the early days because it made it through firewalls/proxies; not because it is better.
KingMob
Yeah, it's a shame you can't take advantage of natural HTTP/2 streaming from the browser. There's the upcoming WebTransport API (https://developer.mozilla.org/en-US/docs/Web/API/WebTranspor...), but it could have been added earlier.
Matthias247
If you want to stream data inside a HTTP body (of any protocol), then the ReadableStream/WritableStream APIs would be the appropriate APIs (https://developer.mozilla.org/en-US/docs/Web/API/Streams_API) - however at least in the past they have not been fully standardized and implemented by browsers. Not sure what the latest state is.
WebTransport is a bit different - it offers raw QUIC streams that are running concurrently with the requests/streams that carry the HTTP/3 requests on shared underlying HTTP/3 connections and it also offers a datagram API.
bawolff
I think the problem is that duplex communication on the web is rarely useful except in some special cases, and it's usually harder to scale since you have to keep state around and can't as easily rotate servers.
For some applications it is important, but for most websites the benefits just don't outweigh the costs.
jonwinstanley
I thought http/2 was great for reducing latency for JS libraries like Turbo Links and Hotwire.
Which is why the Rails crowd want it.
Is that not the case?
the_duke
H2 still suffers from head of line blocking on unstable connections (like mobile).
H3 is supposed to solve that.
wtarreau
Yep. Actually H1/H2/H3 do have the same problem (remember the good old days when everyone was trying to pipeline over H1?), except that H1 generally comes with multiple connections and H3 currently goes over QUIC and it's QUIC that addresses HoL by letting streams progress independently.
spintin
[dead]
withinboredom
[flagged]
theflyinghorse
I have an nginx running on my VPS supporting my startup. Last time I had to touch it was about 4 years ago. Quality software
Aurornis
I really like Caddy, but these nginx performance comparisons are never really supported in benchmarks.
There have been numerous attempts to benchmark both (One example: https://blog.tjll.net/reverse-proxy-hot-dog-eating-contest-c... ) but the conclusion is almost always that they're fairly similar.
The big difference for simple applications is that Caddy is easier to set up, and nginx has a smaller memory footprint. Performance is similar between the two.
otterley
AFAIK both proxies are capable of serving at line rate for 10Gbps or more at millions of concurrent connections. I can't possibly see how performance would significantly differ if they're properly configured.
jasonjayr
nginx's memory footprint is tiny for what it delivers. A common pattern I see for homelab and self-hosted stuff is a lightweight bastion VPS in a cloud somewhere proxying requests to more capable on-premise hardware over a VPN link. Using a cheap < $5/mo instance means 1GB or less of RAM, so you have to tightly watch what is running on that host.
neoromantique
To be fair 1GB is a lot, both caddy and nginx would feel pretty good with it I'd imagine.
p_ing
Why would you use either when there is OpenBSD w/ carp + HAProxy?
There's lots of options out there. I mean, even IIS can do RP work.
Ultimately, I would prefer a PaaS solution over having to run a couple of servers.
otterley
You're going to need to show your homework for this to be a credible claim.
evalijnyi
Is it just me or did anyone else immediately dismiss Caddy because of its opening sentence?
>Caddy is a powerful, extensible platform to serve your sites, services, and apps, written in Go.
To me it reads as: if your application is not written in Go, don't bother.
jeroenhd
The Go crowd, like the Rust crowd, likes to advertise the language their software is written in. I agree that that specific sentence is a bit ambiguous, though, as if it's describing some kind of middleware that hooks into Go applications.
It's not, it's just another standalone reverse proxy.
unification_fan
Why should a reverse proxy give a single shit about what your lang application is written in
treve
First 80% of the article was great, but it ends a bit handwavey when it gets to its conclusion.
One thing the article gets wrong is that non-encrypted HTTP/2 does exist. Not between browsers and servers, but it's great between a load balancer and your application.
byroot
> One thing the article gets wrong is that non-encrypted HTTP/2 exists
Indeed, I misread the spec, and added a small clarification to the article.
tuukkah
Do you want to risk the complexity and potential performance impact from the handshake that the HTTP/2 standard requires for non-encrypted connections? Worst case, your client and server tooling clash in a way that every request becomes two requests: before the actual h2c request, a second one for the required HTTP/1.1 upgrade, which the server closes as suggested in the HTTP/2 FAQ.
arccy
most places where you'd use it use h2c prior knowledge, that is, you just configure both ends to only speak h2c, no upgrades or downgrades.
fragmede
Not according to Edward Snowden, if you're Yahoo and Google.
ChocolateGod
You can just add encryption to your backend private network (e.g. Wireguard)
Which has the benefit of encrypting everything and avoids the overhead of starting a TLS socket for every http connection.
jeroenhd
If you're going that route, you may as well just do HTTPS again. If you configure your TLS cookies and session resumption right, you'll get all of the advantages of fancy post-quantum crypto without having to go back to the days of manually setting up encrypted tunnels like when IPSec did the rounds.
soraminazuki
Wait, are some people actively downvoting advice encouraging the use of encryption in internal networks? I sure hope those people don't go anywhere near the software industry because that's utterly reckless in the post-Snowden world.
fragmede
People are all over the place. I had to talk someone into accepting that SSH over a VPN being double-encrypted isn't a waste.
fulafel
There's a security angle: load balancers have big problems with request smuggling. HTTP/2 changes the picture somehow; maybe someone more up to date knows if it's currently better or worse?
mwcampbell
This is why I configured my company's AWS application load balancer to disable HTTP2 when I first saw the linked post, and haven't changed that configuration since then. Unless we have definitive confirmation that all major load balancers have fixed these vulnerabilities, I'll keep HTTP2 disabled, unless I can figure out how to do HTTP2 between the LB and the backend.
wtarreau
If you transfer large objects, H2 on the backend will increase transfer costs (due to framing). If you deal with many moderate or small objects, however, H2 can improve the CPU usage for both the LB and the backend server because they will have less expensive parsing and will be able to merge multiple messages in a single packet over the wire, thus reducing the number of syscalls. Normally it's just a matter of enabling H2 on both and you can run some tests. Be careful not to mix too many clients over a backend connection if you don't want slow clients to limit the other ones' transfer speed or even cause head-of-line blocking, though! But typically supporting ~10 streams per backend connection does improve things quite a bit over H1 for regular sites.
albinowax_
Yes HTTP/2 is much less prone to exploitable request smuggling vulnerabilities. Downgrading to H/1 at the load balancer is risky.
nitely
In theory request smuggling is not possible with end-to-end HTTP/2. It's only possible if there is a downgrade to HTTP/1 at some point.
SahAssar
An h2 proxy usually wouldn't proxy through the http2 connection; it would instead accept h2 and load-balance each request to a backend over an h2 (or h1) connection.
The difference is that you have a h2 connection to the proxy, but everything past that point is up to the proxies routing. End-to-end h2 would be more like a websocket (which runs over HTTP CONNECT) where the proxy is just proxying a socket (often with TLS unwrapping).
nitely
> An h2 proxy usually wouldn't proxy through the http2 connection; it would instead accept h2 and load-balance each request to a backend over an h2 (or h1) connection.
Each connection needs to keep state for all processed requests (the HPACK dynamic headers table), so all requests for a given connection need to be proxied through the same connection. Not sure I got what you meant, though.
Apart from that, I think the second sentence of my comment makes clear there is no smuggling as long as the connection before/past proxy is http2, and it's not downgraded to http1. That's all that I meant.
jiggawatts
Google measured their bandwidth usage and discovered that something like half was just HTTP headers! Most RPC calls have small payloads for both requests and responses.
HTTP/2 compresses headers, and that alone can make it worthwhile to use throughout a service fabric.
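Rough arithmetic with made-up but plausible sizes (none of these numbers come from Google's measurement; they just illustrate the ratio):

    req_headers, resp_headers = 600, 300   # plausible header sizes, not measured
    req_body, resp_body = 200, 200         # small RPC-style JSON payloads

    total_h1 = req_headers + resp_headers + req_body + resp_body
    print((req_headers + resp_headers) / total_h1)   # ~0.69: headers dominate

    # On a warm HTTP/2 connection, HPACK reduces repeated headers to a few
    # bytes each, so subsequent requests look more like:
    compressed = 60                        # rough guess, varies with header mix
    total_h2 = 2 * compressed + req_body + resp_body
    print(total_h2 / total_h1)             # roughly 40% of the HTTP/1.1 bytes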
LAC-Tech
> Personally, this lack of support doesn’t bother me much, because the only use case I can see for it, is wanting to expose your Ruby HTTP directly to the internet without any sort of load balancer or reverse proxy, which I understand may seem tempting, as it’s “one less moving piece”, but not really worth the trouble in my opinion.
That seems like a massive benefit to me.
Animats
The amusing thing is that HTTP/2 is mostly useful for sites that download vast numbers of tiny Javascript files for no really good reason. Like Google's sites.
theandrewbailey
I've seen noticeable, meaningful speed improvements with HTTP/2 on pages with only 1 Javascript file.
But I'd like to introduce you/them to tight mode:
https://docs.google.com/document/d/1bCDuq9H1ih9iNjgzyAL0gpwN...
https://www.smashingmagazine.com/2025/01/tight-mode-why-brow...
paulddraper
Or small icon/image files.
Anyone remember those sprite files?
cyberpunk
You ever had to host map tiles? Those are the worst!
SahAssar
Indeed, there is a reason most mapping libraries still support specifying multiple domains for tiles. It used to be common practice to setup a.tileserver.test, b.tileserver.test, c.tileserver.test even if they all pointed to the same IP/server just to get around the concurrent request limit in browsers.
youngtaff
That’s not quite true… lots of small files still have the overhead of IPC in the browser
littlecranky67
One overlooked point is ephemeral source port exhaustion. If a load balancer forwards a HTTP connection to a backend system, it needs a TCP source port for the duration of that connection (not destination port, which is probably 80 or 443). That limits the number of outgoing connections to less than 65535. A common workaround is to use more outgoing IP addresses to the backends as source IPs, thus multiplying the available number of source ports to 65535 times number_of_ips.
HTTP/2 solves this, as you can multiplex requests to backend servers over a single TCP socket. So there is actually a point of using HTTP/2 for load_balancer <-> backend_system connections.
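A back-of-the-envelope version, assuming all connections go to the same backend IP:port (all numbers are illustrative):

    concurrent_requests = 200_000      # in flight between the LB and one backend
    ephemeral_ports = 28_000           # roughly the default Linux range per source IP

    # HTTP/1.1: one in-flight request per backend connection
    source_ips_needed = -(-concurrent_requests // ephemeral_ports)   # ceil -> 8

    # HTTP/2: multiplex streams over far fewer connections
    streams_per_conn = 100             # a common cap on concurrent streams
    h2_connections = -(-concurrent_requests // streams_per_conn)     # 2000
    print(source_ips_needed, h2_connections)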
mike_d
This has been solved for 10+ years. Properly configure your load balancer with HTTP keep-alive and pipelining.
littlecranky67
HTTP keep-alive still limits the number of outgoing connections to 65535. Pipelining suffers from the known same issues addressed in the article.
But I agree, it is a solved problem unless you really have a lot of incoming connections. When you use multiple outgoing ip addresses that fixes that even for very busy load balancers, and since IPv6 is common today you will likely have a /64 to draw addresses from.
mike_d
On modern systems you have about 28k ephemeral ports available. 65,535 is the total number of ports (good luck trying to use them all). Either way, if you have more than 20k connections open to a single backend (remember linux does connection tracking using the 4 tuple, so you can reuse a source port to different destinations) you are doing something seriously wrong and should hire competent network engineering folks.
feyman_r
CDNs like Akamai still don’t support H2 back to origins.
That’s likely not because of the wisdom in the article per se, but because of rising complexity in managing streams and connections downstream.
monus
> bringing HTTP/2 all the way to the Ruby app server is significantly complexifying your infrastructure for little benefit.
I think the author wrote it with encryption-is-a-must in the mind and after he corrected those parts, the article just ended up with these weird statements. What complexity is introduced apart from changing the serving library in your main file?
hinkley
In a language that uses forking to achieve parallelism, terminating multiple tasks at the same endpoint will cause those tasks to compete. For some workflows that may be a feature, but for most it is not.
So that's Python, Ruby, Node. Elixir won't care and C# and Java... well hopefully the HTTP/2 library takes care of the multiplexing of the replies, then you're good.
dgoldstein0
A good Python web server should be a single process with asyncio, or maybe have a few worker threads or processes. Definitely not fork for every request.
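For example, the usual gunicorn-style shape, assuming gunicorn with uvicorn workers (just a sketch of a gunicorn.conf.py; adjust to your stack):

    # gunicorn.conf.py -- a handful of async worker processes, not fork-per-request
    import multiprocessing

    workers = multiprocessing.cpu_count()            # one process per core
    worker_class = "uvicorn.workers.UvicornWorker"   # asyncio event loop per worker
    keepalive = 75                                   # keep LB -> app connections warm
    bind = "0.0.0.0:8000"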
hinkley
Your response explains the other one, which I found just baffling.
I didn't say forking per request, good god. I meant running a process per core, or some ratio to the cores to achieve full server utilization. Limiting all of HTTP/2 requests per user to one core is unlikely to result in good feelings for anybody. If you let nginx fan them out to a couple cores it's going to work better.
These are not problems Java and C# have.
neonsunset
I don't think any serious implementation would do forking when using HTTP/2 or QUIC. Fork is a relic of the past.
byroot
You are correct about the first assumption, but even without encryption, dealing with multiplexing significantly complexifies things, so I still stand by that statement.
If you assume no multiplexing, you can write a much simpler server.
nitely
In reality you would build your application server on top of the HTTP/2 server, so you'd not have to deal with multiplexing; the server hides that from you, so it's the same as an HTTP/1 server (ex: you pass some callback that gets called to handle the request). If you implement HTTP/2 from scratch, multiplexing is not even the most complex part... It's rather the sum of all the parts: HPACK, flow control, stream state, frames, settings, the large amount of validation, and so on.
byroot
This may be true with some stacks, but my answer has to be understood in the context of Ruby, where the only real source of parallelism is `fork(2)`, hence the natural way to write a server is an `accept` loop, which fits HTTP/1 very well but not HTTP/2.
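For anyone who hasn't written one, the shape being described looks roughly like this (sketched in Python rather than Ruby for brevity; a toy, not a real server):

    import os
    import socket

    def handle_http1(conn):
        # Toy handler: read one request, send one canned response.
        conn.recv(65536)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("0.0.0.0", 8080))
    listener.listen(128)

    for _ in range(os.cpu_count() or 1):      # pre-fork one worker per core
        if os.fork() == 0:                    # child: a plain accept loop
            while True:
                conn, _addr = listener.accept()
                # One connection, one request at a time maps cleanly onto
                # HTTP/1 semantics. With HTTP/2, each accepted connection
                # carries interleaved streams the worker would have to
                # multiplex itself.
                handle_http1(conn)
                conn.close()

    while True:                               # parent just supervises the children
        os.wait()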
immibis
If your load balancer is converting between HTTP/2 and HTTP/1.1, it's a reverse proxy.
Past the reverse proxy, is there a point to HTTP at all? We could also use SCGI or FastCGI past the reverse proxy. It does a better job of passing through information that's gathered at the first point of entry, such as the client IP address.
SJC_Hacker
Keeping everything HTTP makes testing a bit easier.
The maximum number of connections thing in HTTP/1 always makes me think of queuing theory, which gives surprising conclusions like how adding a single extra teller at a one-teller bank can cut wait times by 50 times, not just by 2.
However, I think the problem is the Poisson process isn't really the right process to assume. Most websites which would run afoul of the 2/6/8/etc. per-host connection limits are probably trying to open up a lot of connections at the same time. That's very different from situations where only 1 new person arrives every 6 minutes on average, and 2 new people arriving within 1 second of each other is a considerably rarer event.
[1]: https://www.johndcook.com/blog/2008/10/21/what-happens-when-...