Scaling Up Reinforcement Learning for Traffic Smoothing

taberiand

There's a certain satisfaction in anticipating these stop and go waves while driving, and timing it so that you catch the tail just as it starts moving again - the goal being to use the brake as little as possible and ideally only need to adjust acceleration. I don't really get why people feel the need to repeatedly accelerate up and then slam on the brakes, when leaving reasonable gaps makes everything go smoother.

TeMPOraL

> I don't really get why people feel the need to repeatedly accelerate up and then slam on the brakes, when leaving reasonable gaps makes everything go smoother.

My feeling observing some drivers: because they feel like if they leave a gap of more than 110% of the width of an average car (which is way below reasonable, not to mention safe), some idiot will immediately slot themselves into that gap. Which shouldn't even matter to them, but somehow they would rather not leave a gap than risk someone else taking it.

gorpy7

depends on the distance between stoplights. but, in general, this will reduce throughput- if you’re the lead car and you accelerate slowly when it turns green then the 10th car may not get through the light. and if you slow down early and gently, that’ll ripple backward. slowing down gently also doesn’t let the cars pack densely quickly and here again you can’t get enough cars in the space between lights, especially over freeways/bridges. sometimes you’ll see zipper merges just to combat this low density as cars accelerate.

mitthrowaway2

Yeah it's not a technique intended for gridlock city traffic where you need cars to squeeze through a light and then pack together. It's very good for some other scenarios though. I think the sorts of people who put enough thought into driving to delete traffic waves are also aware enough to know when it's not appropriate to use that technique.

taberiand

I'm talking about highway driving. At traffic lights the behaviour I see is people stacking up in one lane when there's a zipper merge across the intersection - I don't get this either, but it's good for me because there's plenty of room to skip past that line and merge in while the stack dawdles across the road, the inchworm-style traffic movement leaving plenty of space between each car

pornel

It's nice that this worked without need for communication between cars.

This should be a built-in feature of adaptive cruise control in regular cars.

evinitsky

We're trying to convince folks that this should be the case!

schobi

This is an interesting idea, but I'm sceptical of the advantages.

"there is more CO2" is valid for cars burning fuel. But as soon as you recuperate energy (even in a hybrid), you might only have a fraction of the losses. Adjusting the speed more aggressively is possible without braking, with little loss. I totally agree that stop-and-go is annoying, but looking into the future, CO2 should not be a reason for the vehicles of 5-10 years from now, when the research can be rolled out.

Is "slamming the brakes" still happening? Around here you have dynamic speed limit signs on the highway. In high traffic everybody then goes a little slower, but smoothly.

I suspect that if a road is loaded beyond max throughput, this method will also fail, even harder. Let me explain: I remember a graph from communications theory. With improving error-correcting codes in transmission, you can get a clean signal in even worse channel conditions. But once it fails, you will not have a signal any more. The better the code, the steeper the cutoff. Whereas without one, as in FM radio, the degradation in user experience is gradual.

So the analogy goes like this: I would expect that you could possibly load the road with another 10% more vehicles. But if one day you have 15% more, the blockage will be even worse than before. Could be worth simulating throughput for various loading situations.
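
A minimal sketch of that kind of loading sweep, assuming a single-lane ring road where every car follows the Intelligent Driver Model (IDM). This is not the simulator or the controller from the article; the ring length, car length, and all IDM parameter values are illustrative assumptions, and the hard clamp on small gaps makes near-contact braking unrealistically abrupt.

```python
import math

def idm_accel(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=1.5, s0=2.0):
    """IDM acceleration in m/s^2.

    v: own speed, gap: bumper-to-bumper distance to the leader,
    dv: own speed minus leader speed. Parameter values are illustrative.
    """
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    gap = max(gap, 0.1)  # guard against division by zero near contact
    return a_max * (1.0 - (v / v0) ** 4 - (s_star / gap) ** 2)

def ring_flow(n_cars, ring_len=1000.0, car_len=5.0, dt=0.25, steps=2400):
    """Simulate n_cars on a ring road and return mean flow in vehicles/hour."""
    spacing = ring_len / n_cars
    pos = [i * spacing for i in range(n_cars)]
    vel = [5.0] * n_cars
    vel[0] = 0.0  # one stopped car seeds a disturbance
    speed_sum, samples = 0.0, 0
    for step in range(steps):
        acc = []
        for i in range(n_cars):
            lead = (i + 1) % n_cars
            gap = (pos[lead] - pos[i]) % ring_len - car_len
            acc.append(idm_accel(vel[i], gap, vel[i] - vel[lead]))
        for i in range(n_cars):
            vel[i] = max(0.0, vel[i] + acc[i] * dt)  # cars never reverse
            pos[i] = (pos[i] + vel[i] * dt) % ring_len
        if step >= steps // 2:  # average only over the second half
            speed_sum += sum(vel)
            samples += n_cars
    mean_speed = speed_sum / samples
    density = n_cars / ring_len  # vehicles per metre
    return density * mean_speed * 3600.0

if __name__ == "__main__":
    for n in (10, 30, 60, 100):
        print(n, "cars:", round(ring_flow(n)), "veh/h")
```

Sweeping `n_cars` more finely, and swapping a smoothing policy in for some of the cars, would be one way to check whether the cutoff past peak load really gets steeper.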

jwlit

Apologies for people who've already seen this (it's pretty old and comes round fairly frequently on HN), but for those previously unaware, http://trafficwaves.org/ is one of the best sites digging into the phenomenon from an "educated layman's" perspective.

jgord

Great exposition on that page .. with the animations clearly explaining the phenomenon [ yet not preventing reading ]

I expect to see many small startups using RL to solve real-world B2B problems of this flavor that were previously too hard to tackle.

gsf_emergency_2

Describing human intelligences as self-drivers on the highway of progress, one may build the following dictionary with the help of OP's graph

  Traffic density -> urban pop. density

  Traffic flow -> rate of new ideas productively implemented 

  Partial observability -> democracy*

  Reinforcement learning -> augmenting individual expression

  Oxygen/lithium -> money

*Incorporating further insights from OP: https://news.ycombinator.com/item?id=43589398

gsf_emergency_2

Today my mind (system 2) almost feigned surprise*:

The intersection of RL & policy (/politics) is a blind spot of HN*

*(Having factored out emotion+humour)

peepeepoopoo117

It's so refreshing to see real solutions to transportation problems instead of pie in the sky "burn it all down and start from scratch" thinking.

evinitsky

(co-author here). US transit systems have state dependence. I suspect most of us are transit advocates but it seems clear that minimizing present harms is good

nn3

Does this really need reinforcement learning? It seems like something that a classical controller should be able to do.

evinitsky

One of the authors here. It's a somewhat nuanced answer. In principle, I think a classical controller would have been fine here, and if you read the paper (might be in one of the other papers) we do benchmark a bunch of them. But what's really nice about RL is what it does to the workflow. We can add a sensor, drop a sensor, change the dynamics of the system, and have a functional controller the next day. It trades compute for control engineer time. On a secondary, smaller point: the dynamics of the cruise control cars are an unpleasant switched system, and there's a lot of partial observability. We never fully sense the traffic state, we didn't even have direct measurements of the distance to the car in front, and the individual car control decisions are coupled to macroscopic effects on the system, i.e. since all the cars have the same policy, their decisions actually affect the traffic flow. So, it's not a trivial control design problem at all.
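
For anyone wondering what a "classical controller" could look like in this setting, here is a minimal sketch of a linear constant time-gap following law. It is a generic textbook-style law, not one of the controllers benchmarked in the paper, and it assumes a direct gap measurement, which the comment above says the test cars did not have. The function name, gains, and limits are hypothetical values chosen only for illustration.

```python
def time_gap_accel(v, v_lead, gap, T=1.5, s0=5.0,
                   k_gap=0.2, k_dv=0.6, a_min=-3.0, a_max=1.5):
    """Commanded acceleration (m/s^2) from a constant time-gap law.

    v, v_lead: own and leader speed (m/s); gap: distance to leader (m).
    T is the desired time gap, s0 the standstill distance.
    All gains and limits are hypothetical, for illustration only.
    """
    desired_gap = s0 + T * v
    accel = k_gap * (gap - desired_gap) + k_dv * (v_lead - v)
    # Saturate to comfortable acceleration and braking limits
    return max(a_min, min(a_max, accel))
```

A law like this has to be re-derived and re-tuned whenever the sensor set or vehicle dynamics change, which is the engineering time the comment describes trading for compute.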

tonetegeatinst

Are the cars used ICE? I would think an electric vehicle would be better for the environment, and less susceptible to fluctuations in gas prices.

If you used EVs, you would also have a fleet of high-density energy storage when not in use.

evinitsky

These are mostly Nissan Rogues, since that's the thing we could get 100 of.

efavdb

Pretty interesting. I’m surprised that the throughput drops with traffic, especially all the way to zero in the first plot.

Definitely frustrating to drive through these and great point that it’s bad for efficiency.
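
On why throughput can fall all the way to zero: flow is density times speed, and speed drops as cars pack closer together, so once the road is bumper to bumper the speed, and with it the flow, is zero. A minimal sketch using the textbook Greenshields linear speed-density model, which is an assumption here and not necessarily the model behind the article's plot; the free-flow speed and jam density are illustrative values.

```python
def greenshields_flow(density, v_free=30.0, density_jam=0.14):
    """Flow in vehicles/second for a density in vehicles/metre.

    Speed falls linearly from v_free on an empty road to zero at
    density_jam. Both default values are illustrative assumptions.
    """
    density = min(max(density, 0.0), density_jam)
    speed = v_free * (1.0 - density / density_jam)
    return density * speed

if __name__ == "__main__":
    for k in (0.0, 0.035, 0.07, 0.105, 0.14):
        flow_per_hour = greenshields_flow(k) * 3600
        print(f"{k * 1000:.0f} veh/km -> {flow_per_hour:.0f} veh/h")
```

In this model the flow peaks at half the jam density and declines on either side, which gives the rise-then-fall shape of a flow-versus-density plot.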

Coneylake

How do lane changes affect this work?

mitthrowaway2

I'm not the author, but I do a similar driving pattern when I encounter traffic waves. Lane changes actually end up making little difference. Even when I leave a lot of space ahead of me as a buffer to close a gap, only the very most aggressive drivers tend to enter it for the purpose of gaining ground, because it doesn't help them much at all -- the cars at the leading end of that open gap are usually stopped. However, leaving these gaps does help increase fluidity for those drivers who need to change lanes in order to eg. exit the highway, which in turn reduces traffic.