Conference paper

TransFuseGrid: Transformer-based Lidar-RGB fusion for semantic grid prediction

Abstract: Semantic grids are a succinct and convenient way to represent the environment for mobile robotics and autonomous driving applications. While Lidar sensors are now widespread in robotics, most semantic grid prediction approaches in the literature focus only on RGB data. In this paper, we present an approach for semantic grid prediction that uses a transformer architecture to fuse Lidar sensor data with RGB images from multiple cameras. Our proposed method, TransFuseGrid, first transforms both input streams into top-view embeddings, and then fuses these embeddings at multiple scales with Transformers. Finally, a decoder transforms the fused top-view feature map into a semantic grid of the vehicle's environment. We evaluate the performance of our approach on the nuScenes dataset for the vehicle, drivable area, lane divider and walkway segmentation tasks. The results show that TransFuseGrid outperforms competing RGB-only and Lidar-only methods. Additionally, the Transformer feature fusion leads to a significant improvement over naive RGB-Lidar concatenation. In particular, for the segmentation of vehicles, our model outperforms state-of-the-art RGB-only and Lidar-only methods by 24% and 53%, respectively.
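The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single scale, a single attention head, identity query/key/value projections, and a residual add-back to each stream, none of which the abstract specifies. All function names and the toy values are hypothetical. It only shows the general idea of letting top-view tokens from the RGB and Lidar streams attend to each other.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        attended.append([
            sum(w * value[c] for w, value in zip(weights, tokens))
            for c in range(dim)
        ])
    return attended

def flatten(grid):
    """Turn an H x W x C top-view map into a list of H*W tokens."""
    return [cell for row in grid for cell in row]

def unflatten(tokens, width):
    """Inverse of flatten for a grid with the given width."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]

def fuse_topview(rgb_grid, lidar_grid):
    """Fuse two top-view maps of identical shape with joint self-attention.

    Tokens from both modalities attend to each other; the attended tokens
    are added back to each stream as a residual (an assumed design).
    """
    width = len(rgb_grid[0])
    rgb_tokens, lidar_tokens = flatten(rgb_grid), flatten(lidar_grid)
    tokens = rgb_tokens + lidar_tokens
    attended = self_attention(tokens)
    fused = [[t + a for t, a in zip(tok, att)] for tok, att in zip(tokens, attended)]
    n = len(rgb_tokens)
    return unflatten(fused[:n], width), unflatten(fused[n:], width)

# Toy 2 x 2 top-view maps with 2 channels each (made-up values).
rgb = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]
lidar = [[[0.5, 0.5], [1.0, 0.0]], [[0.0, 1.0], [0.5, 0.0]]]
fused_rgb, fused_lidar = fuse_topview(rgb, lidar)
```

In a multi-scale setting, a step like `fuse_topview` would presumably be repeated on feature maps of different resolutions before decoding; how the paper does this is not stated in the abstract.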

https://hal.inria.fr/hal-03768008
Contributor: David Sierra-Gonzalez
Submitted on: Friday, September 2, 2022 - 14:39:07
Last modified on: Thursday, December 8, 2022 - 09:04:42
Long-term archiving on: Saturday, December 3, 2022 - 19:27:50

File

TTGrid.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-03768008, version 1

Citation

Gustavo Salazar-Gomez, David Sierra González, Manuel Alejandro Diaz-Zapata, Anshul Paigwar, Wenqian Liu, et al. TransFuseGrid: Transformer-based Lidar-RGB fusion for semantic grid prediction. ICARCV 2022 - 17th International Conference on Control, Automation, Robotics and Vision, Dec 2022, Singapore, Singapore. pp.1-6. ⟨hal-03768008⟩

Metrics

Record views

117

File downloads

63