Deep reinforcement learning for a self-driving vehicle operating solely on visual information

Armantas Ostreika; Robertas Audinys; Žygimantas Šlikas; Justas Radkevičius; Mantas Šutas

doi:10.3390/electronics14050825

Title	Deep reinforcement learning for a self-driving vehicle operating solely on visual information
Authors	Ostreika, Armantas ; Audinys, Robertas ; Šlikas, Žygimantas ; Radkevičius, Justas ; Šutas, Mantas
DOI	10.3390/electronics14050825
Is Part of	Electronics. Basel : MDPI. 2025, vol. 14, iss. 5, art. no. 825, p. 1-30. ISSN 2079-9292
Keywords [eng]	autonomous driving ; vision transformers (ViT) ; deep reinforcement learning (DRL) ; policy learning ; MetaDrive ; AirSim ; safe exploration
Abstract [eng]	This study investigates the application of Vision Transformers (ViTs) in deep reinforcement learning (DRL) for autonomous driving systems that rely solely on visual input. While convolutional neural networks (CNNs) are widely used for visual processing, they have limitations in capturing global patterns and handling complex driving scenarios. To address these challenges, we developed a ViT-based DRL model and evaluated its performance through extensive training in the MetaDrive simulator and testing in the high-fidelity AirSim simulator. Results show that the ViT-based model significantly outperformed CNN baselines in MetaDrive, achieving nearly seven times the average distance traveled and an 87% increase in average speed. In AirSim, the model exhibited superior adaptability to realistic conditions, maintaining stability and safety in visually complex environments. These findings highlight the potential of ViTs to enhance the robustness and reliability of vision-based autonomous systems, offering a transformative approach to safe exploration in diverse driving scenarios.
Published	Basel : MDPI
Type	Journal article
Language	English
Publication date	2025
