Reverse proxy deep dive: Why HTTP parsing at the edge is harder than it looks
12 comments
July 22, 2025
pixl97
freeone3000
It’s a good thing we have RFCs! For duplicate Host, you MUST respond with a 400. If the Host is different from the authority, Host must be ignored. If Host is not specified, it must be provided to upstream. See “Host” in RFC 7230.
ranger_danger
it's a good thing all RFCs are 100% specified with no ambiguities.
EDIT: Sorry I dropped my /s. I was only trying to say that unfortunately not all RFCs are sufficiently specified... and that I think saying "good thing we have RFCs" should not imply they will all be sufficiently specified, which is how I interpreted their comment... and didn't feel like typing all this out, but I guess it was necessary anyway.
necovek
That's a very weird take as a reply to a part of the spec that is sufficiently specified.
TechDebtDevin
I've been building out a very large network of reverse proxies over the last year. Very fun, and your article is very relatable. Go has been my friend. I've spent the last couple of months testing, trying to figure out all the weird things that can happen, and it's quite a bit.
bithavoc
me too, what are you building?
TechDebtDevin
A sort of boutique mobile-first proxy, with emphasis on geographic spread/accuracy. I've been running my own proxies for a long time via friends' and families' networks, but in those instances security/safety wasn't as big of a deal. Yourself?
bithavoc
that’s cool, I’m working on branded artifact delivery. Docker, Go, NPM, PyPI repos delivered on free custom subdomains. Vultr BGP services doing the trick so far.
Oh, and it can get messy and lead to exploits really quick.
Incorrect parsing and parsing differences between libraries can lead to exciting exploits.
Like, what do you do when there are multiple copies of the same header with odd line breaks?
    GET /example HTTP/1.1
    Host: bad-stuff-here
    Host: vulnerable-website.com