Scaling Up Reinforcement Learning for Traffic Smoothing
14 comments
April 2, 2025
taberiand
There's a certain satisfaction in anticipating these stop-and-go waves while driving, and timing it so that you catch the tail just as it starts moving again. The goal is to use the brake as little as possible and ideally only need to adjust acceleration. I don't really get why people feel the need to repeatedly accelerate and then slam on the brakes, when leaving reasonable gaps makes everything go smoother.
gorpy7
Depends on the distance between stoplights. But in general this will reduce throughput: if you're the lead car and you accelerate slowly when the light turns green, the 10th car may not get through. And if you slow down early and gently, that'll ripple backward. Slowing down gently also doesn't let the cars pack densely quickly, so here again you can't fit enough cars in the space between lights, especially over freeways and bridges. Sometimes you'll see zipper merges just to combat this low density as cars accelerate.
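A rough way to put numbers on the "10th car" point above is the usual back-of-the-envelope queue-discharge estimate: subtract a start-up delay from the green time and divide by the time gap between successive cars crossing the stop line. The function name and the default values below are illustrative assumptions, not figures from the article or from this thread.

```python
import math


def cars_through_green(green_s, startup_lost_s=2.0, headway_s=2.0):
    """Estimate how many queued cars clear one green phase.

    green_s        -- length of the green phase, in seconds
    startup_lost_s -- time lost before the queue starts moving (assumed value)
    headway_s      -- time between successive cars crossing the stop line
                      once the queue is moving (assumed value)
    """
    usable = max(0.0, green_s - startup_lost_s)
    return math.floor(usable / headway_s)


# Same 22 s green, brisk start versus a slow, gappy start.
brisk = cars_through_green(22.0, startup_lost_s=2.0, headway_s=2.0)
gentle = cars_through_green(22.0, startup_lost_s=4.0, headway_s=2.5)
print(brisk, gentle)
```

Under these assumed numbers the brisk queue clears 10 cars and the gentle one clears 7, which is the effect described above. It only covers a queue at a signal and says nothing about stop-and-go waves on a freeway.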
gsf_emergency_2
Describing human intelligences as self-drivers on the highway of progress, one may build the following dictionary with the help of OP's graph
Traffic density -> urban pop. density
Traffic flow -> rate of new ideas productively implemented
Partial observability -> democracy*
Reinforcement learning -> augmenting individual expression
Oxygen/lithium -> money
*Incorporating further insights from OP: https://news.ycombinator.com/item?id=43589398
pornel
It's nice that this worked without need for communication between cars.
This should be a built-in feature of adaptive cruise control in regular cars.
evinitsky
We're trying to convince folks that this should be the case!
Coneylake
How do lane changes affect this work?
tonetegeatinst
Are the cars used ICE? I would think an electric vehicle would be better for the environment, and less susceptible to fluctuations in gas prices.
If you used EVs you would also have a fleet of high-density energy storage when they're not in use.
evinitsky
These are mostly Nissan Rogues, since that's the thing we could get 100 of.
peepeepoopoo117
It's so refreshing to see real solutions to transportation problems instead of pie in the sky "burn it all down and start from scratch" thinking.
evinitsky
(co-author here). US transit systems have state dependence. I suspect most of us are transit advocates but it seems clear that minimizing present harms is good
efavdb
Pretty interesting. I’m surprised that the throughput drops with traffic, especially all the way to zero in the first plot.
Definitely frustrating to drive through these and great point that it’s bad for efficiency.
nn3
Does this really need reinforcement learning? It seems like something that a classical controller should be able to do.
evinitsky
One of the authors here. It's a somewhat nuanced answer. In principle, I think a classical controller would have been fine here, and if you read the paper (might be in one of the other papers) we do benchmark a bunch of them. But what's really nice about RL is what it does to the workflow. We can add a sensor, drop a sensor, change the dynamics of the system, and have a functional controller the next day. It trades compute for control engineer time. On a secondary, smaller point: the dynamics of the cruise-control cars are an unpleasant switched system, and there's a lot of partial observability. We never fully sense the traffic state, and we didn't even have direct measurements of the distance to the car in front. The individual car control decisions are also coupled to macroscopic effects on the system, i.e. since all the cars have the same policy, their decisions actually affect the traffic flow. So it's not a trivial control design problem at all.
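For readers wondering what a "classical controller" might look like here, below is a minimal sketch of a constant time-headway car-following law. It is not one of the baselines from the paper: the class name, gains, and limits are all made up for illustration. It also assumes a direct measurement of the gap and the lead car's speed, which the comment above says the real cars did not have.

```python
from dataclasses import dataclass


@dataclass
class HeadwayController:
    """Hypothetical constant time-headway controller (illustrative gains only)."""
    time_headway: float = 1.5    # desired time gap to the lead car, seconds
    standstill_gap: float = 5.0  # desired gap at zero speed, metres
    k_gap: float = 0.2           # gain on gap error, 1/s^2
    k_speed: float = 0.6         # gain on relative speed, 1/s
    max_accel: float = 1.5       # acceleration limit, m/s^2
    max_brake: float = 3.0       # braking limit, m/s^2

    def accel(self, gap: float, speed: float, lead_speed: float) -> float:
        """Return a commanded acceleration, clipped to the limits above."""
        desired_gap = self.standstill_gap + self.time_headway * speed
        command = (self.k_gap * (gap - desired_gap)
                   + self.k_speed * (lead_speed - speed))
        return max(-self.max_brake, min(self.max_accel, command))


def simulate(controller, lead_speeds, dt=0.1, gap=50.0, speed=25.0):
    """Follow a lead car with the given speed profile; return (gaps, speeds)."""
    gaps, speeds = [], []
    for lead_speed in lead_speeds:
        a = controller.accel(gap, speed, lead_speed)
        speed = max(0.0, speed + a * dt)   # simple Euler step, no reversing
        gap += (lead_speed - speed) * dt   # gap shrinks if we are faster
        gaps.append(gap)
        speeds.append(speed)
    return gaps, speeds


# Lead car cruising at a steady 20 m/s for 60 simulated seconds.
gaps, speeds = simulate(HeadwayController(), [20.0] * 600)
```

With a steady lead car, the follower settles at the lead speed with a gap of `standstill_gap + time_headway * speed`. Every term in this law relies on sensing the gap and the lead car's speed directly, so the partial observability described above is exactly where a hand-tuned design like this one would need rework.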
Great exposition on that page, with the animations clearly explaining the phenomenon (yet not getting in the way of reading).
I expect to see many small startups using RL to solve real-world B2B problems of this flavor that were previously too hard to tackle.