A Deep Reinforcement Learning Approach to Eco-driving of Autonomous Vehicles Crossing a Signalized Intersection
by Joshua Ogbebor 1, Xiangyu Meng 1,*, Xihai Zhang 2
1 Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, 70803, United States of America
2 College of Electronic and Information, Northeast Agricultural University, Harbin, 150000, China
* Author to whom correspondence should be addressed.
Journal of Engineering Research and Sciences, Volume 1, Issue 5, pp. 25–33, 2022; DOI: 10.55708/js0105003
Keywords: reinforcement learning, eco-driving, connected vehicles, autonomous vehicles
Received: 26 February 2022, Revised: 10 April 2022, Accepted: 18 April 2022, Published Online: 12 May 2022
APA Style
Ogbebor, J., Meng, X., & Zhang, X. (2022). A Deep Reinforcement Learning Approach to Eco-driving of Autonomous Vehicles Crossing a Signalized Intersection. Journal of Engineering Research and Sciences, 1(5), 25–33. https://doi.org/10.55708/js0105003
Chicago/Turabian Style
Ogbebor, Joshua, Xiangyu Meng, and Xihai Zhang. “A Deep Reinforcement Learning Approach to Eco-driving of Autonomous Vehicles Crossing a Signalized Intersection.” Journal of Engineering Research and Sciences 1, no. 5 (May 2022): 25–33. https://doi.org/10.55708/js0105003.
IEEE Style
J. Ogbebor, X. Meng, and X. Zhang, “A Deep Reinforcement Learning Approach to Eco-driving of Autonomous Vehicles Crossing a Signalized Intersection,” Journal of Engineering Research and Sciences, vol. 1, no. 5, pp. 25–33, May 2022, doi: 10.55708/js0105003.
This paper outlines a method for obtaining the optimal control policy for an autonomous vehicle approaching a signalized intersection. It is assumed that traffic signal phase and timing information can be made available to the autonomous vehicle as it approaches the traffic signal. Constraints on the vehicle’s speed and acceleration are imposed, and a microscopic fuel consumption model is adopted. The objective is to minimize a weighted sum of the travel time and the fuel consumption. The problem is solved using the Deep Deterministic Policy Gradient (DDPG) algorithm under the reinforcement learning framework. First, the vehicle model, system constraints, and fuel consumption model are translated into the reinforcement learning framework, and the reward function is designed to guide the agent away from the system constraints and toward the optimum defined by the objective function. The agent is then trained for different relative weights on the travel time and the fuel consumption, and the results are presented. Several considerations for deploying such reinforcement learning-based agents are also discussed.
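To make the reward design described above concrete, the following Python sketch combines a weighted travel-time/fuel cost with penalties that push the agent away from the speed and acceleration constraints. Every numeric value, the `fuel_rate` polynomial, and the function names are illustrative assumptions, not the paper's actual model or code; in the paper the fuel term comes from a microscopic fuel consumption model and the policy itself is trained with DDPG.

```python
# Hypothetical weights, limits, and fuel model -- illustrative values
# only, not taken from the paper.
W_TIME = 1.0                # relative weight on travel time
W_FUEL = 0.5                # relative weight on fuel consumption
V_MIN, V_MAX = 0.0, 20.0    # speed bounds (m/s)
A_MIN, A_MAX = -3.0, 3.0    # acceleration bounds (m/s^2)
DT = 0.1                    # control time step (s)
PENALTY = 10.0              # penalty applied on constraint violation

def fuel_rate(v: float, a: float) -> float:
    """Placeholder for a microscopic fuel consumption model
    (fuel per second as a function of speed and acceleration)."""
    return max(0.0, 1e-3 + 2e-4 * v + 5e-4 * max(a, 0.0) * v)

def reward(v: float, a: float) -> float:
    """Per-step reward: negative weighted cost of elapsed time and
    fuel burned, plus penalties that steer the agent away from the
    speed and acceleration constraints."""
    cost = W_TIME * DT + W_FUEL * fuel_rate(v, a) * DT
    r = -cost
    if not (V_MIN <= v <= V_MAX):
        r -= PENALTY
    if not (A_MIN <= a <= A_MAX):
        r -= PENALTY
    return r

# Example: an in-bounds step incurs only the weighted time/fuel cost,
# while exceeding the speed bound adds the violation penalty.
print(reward(v=12.0, a=0.5))
print(reward(v=25.0, a=0.5))
```

Varying the ratio of `W_TIME` to `W_FUEL` corresponds to the different relative weights on travel time and fuel consumption for which the agent is trained.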