
Scaling Up Reinforcement Learning for Traffic Smoothing

jgord

Great exposition on that page, with the animations clearly explaining the phenomenon (yet not getting in the way of reading).

I expect to see many small startups using RL to solve real-world B2B problems of this flavor that were previously too hard to tackle.

taberiand

There's a certain satisfaction in anticipating these stop and go waves while driving, and timing it so that you catch the tail just as it starts moving again - the goal being to use the brake as little as possible and ideally only need to adjust acceleration. I don't really get why people feel the need to repeatedly accelerate up and then slam on the brakes, when leaving reasonable gaps makes everything go smoother.

gorpy7

Depends on the distance between stoplights. But in general this will reduce throughput: if you’re the lead car and you accelerate slowly when it turns green, then the 10th car may not get through the light. And if you slow down early and gently, that’ll ripple backward. Slowing down gently also doesn’t let the cars pack densely quickly, and here again you can’t fit enough cars in the space between lights, especially over freeways/bridges. Sometimes you’ll see zipper merges just to combat this low density as cars accelerate.
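A rough sketch of the stoplight arithmetic, not a traffic model anyone in the thread proposed. It assumes every queued car accelerates from rest at the same constant rate, starts a fixed reaction delay after the car ahead, and never hits a speed limit. The function name and all default values (7 m spacing, 1 s reaction delay) are made up for illustration.

```python
import math


def cars_through_green(green_s, accel, spacing_m=7.0, reaction_s=1.0, max_cars=200):
    """Count queued cars that cross the stop line before the green ends.

    Hypothetical toy model: car i waits i * reaction_s seconds, then
    accelerates from rest at `accel` (m/s^2) over i * spacing_m metres.
    """
    count = 0
    for i in range(max_cars):
        start_delay = i * reaction_s           # wait for the cars ahead to move
        distance = i * spacing_m               # car 0 sits on the stop line
        travel = math.sqrt(2.0 * distance / accel)  # from d = a * t^2 / 2
        if start_delay + travel <= green_s:
            count += 1
        else:
            break  # crossing times only grow with i, so later cars miss too
    return count


brisk = cars_through_green(green_s=20.0, accel=2.5)
gentle = cars_through_green(green_s=20.0, accel=1.0)
print(brisk, gentle)
```

Under these assumptions the gentler acceleration lets fewer cars through the same 20-second green, which is the effect described above. Whether that matters on a freeway with no lights is a separate question.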

gsf_emergency_2

Describing human intelligences as self-drivers on the highway of progress, one may build the following dictionary with the help of OP's graph:

  Traffic density -> urban pop. density

  Traffic flow -> rate of new ideas productively implemented 

  Partial observability -> democracy*

  Reinforcement learning -> augmenting individual expression

  Oxygen/lithium -> money

*Incorporating further insights from OP: https://news.ycombinator.com/item?id=43589398

pornel

It's nice that this worked without need for communication between cars.

This should be a built-in feature of adaptive cruise control in regular cars.

evinitsky

We're trying to convince folks that this should be the case!

Coneylake

How do lane changes affect this work?

tonetegeatinst

Are the cars used ICE? I would think an electric vehicle would be better for the environment, and less susceptible to fluctuations in gas prices.

If you used EVs you would also have a fleet of high-density energy storage when not in use.

evinitsky

These are mostly Nissan Rogues, since that's the thing we could get 100 of.

peepeepoopoo117

It's so refreshing to see real solutions to transportation problems instead of pie in the sky "burn it all down and start from scratch" thinking.

evinitsky

(Co-author here.) US transit systems have state dependence. I suspect most of us are transit advocates, but it seems clear that minimizing present harms is good.

efavdb

Pretty interesting. I’m surprised that the throughput drops with traffic, especially all the way to zero in the first plot.

Definitely frustrating to drive through these and great point that it’s bad for efficiency.
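One textbook way to get a throughput curve that falls to zero is the Greenshields model, where speed drops linearly with density. This is only an illustration: it is an assumption that the plot has this shape for the same reason, and the parameter values below are invented.

```python
def greenshields_flow(density, free_speed=30.0, jam_density=0.15):
    """Flow (vehicles/s) at a given density (vehicles/m).

    Greenshields assumption: speed falls linearly from `free_speed` (m/s)
    on an empty road to zero at `jam_density` (bumper to bumper).
    """
    speed = free_speed * (1.0 - density / jam_density)
    return density * max(speed, 0.0)  # flow = density * speed


# Sweep density from an empty road to a full jam.
for k in (0.0, 0.0375, 0.075, 0.1125, 0.15):
    print(f"density={k:.4f} veh/m  flow={greenshields_flow(k):.3f} veh/s")
```

Flow is zero on an empty road (no cars) and zero again at jam density (no movement), with the peak at half the jam density. Past that peak, adding cars lowers throughput.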

nn3

Does this really need reinforcement learning? It seems like something that a classical controller should be able to do.

evinitsky

One of the authors here. It's a somewhat nuanced answer. In principle, I think a classical controller would have been fine here, and if you read the paper (it might be in one of the other papers) we do benchmark a bunch of them. But what's really nice about RL is what it does to the workflow. We can add a sensor, drop a sensor, change the dynamics of the system, and have a functional controller the next day. It trades compute for control engineer time. On a secondary, smaller point, the dynamics of the cruise control cars are an unpleasant switched system and there's a lot of partial observability: we never fully sense the traffic state, and we didn't even have direct measurements of the distance to the car in front. The individual car control decisions are also coupled to macroscopic effects on the system, i.e. since all the cars have the same policy, their decisions actually affect the traffic flow. So it's not a trivial control design problem at all.
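For readers wondering what a "classical controller" might look like here, below is a minimal sketch of a constant time-headway car-following law. It is a generic textbook form, not one of the controllers benchmarked in the paper, and every name, gain, and limit is a hypothetical choice. It also assumes a direct gap measurement, which is exactly the signal noted above as unavailable, so it illustrates how a hand-designed law gets tied to a particular sensor set.

```python
def headway_controller_accel(
    gap_m,
    ego_speed,
    lead_speed,
    time_headway_s=1.5,
    min_gap_m=5.0,
    k_gap=0.2,
    k_speed=0.5,
    accel_limits=(-3.0, 1.5),
):
    """Commanded acceleration (m/s^2) from a linear time-headway law.

    Assumes gap_m (metres to the lead car) and lead_speed (m/s) are both
    measured directly; gains and limits are illustrative, not tuned.
    """
    # Keep a standstill gap plus a speed-dependent time headway.
    desired_gap = min_gap_m + time_headway_s * ego_speed
    # Proportional terms on gap error and relative speed.
    accel = k_gap * (gap_m - desired_gap) + k_speed * (lead_speed - ego_speed)
    # Clamp to comfortable braking and acceleration limits.
    lower, upper = accel_limits
    return max(lower, min(upper, accel))


# At 20 m/s the desired gap is 35 m: holding it gives zero command,
# while a 20 m gap saturates at the braking limit.
print(headway_controller_accel(35.0, 20.0, 20.0))
print(headway_controller_accel(20.0, 20.0, 20.0))
```

Dropping the gap sensor means redesigning the law around whatever signals remain, which is the engineering time the RL workflow is said to trade for compute.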