Introduction

Reliable operation of Autonomous Vehicles relies heavily on multi-modal sensor fusion (combining Camera, LiDAR, and Radar) to compensate for individual sensor weaknesses. However, standard deep learning fusion architectures typically operate under the assumption of nominal sensor health. Consequently, they lack a fail-safe mechanism to handle corrupted data streams caused by environmental degradation (e.g., severe weather) or hardware faults (e.g., sensor occlusion, calibration drift). When such failures occur, standard models inadvertently fuse noise with signal, leading to errors in downstream tasks, whether in understanding the environment (Perception) or estimating the vehicle’s position (Localization).
This thesis proposes a generalized “Severity-Aware Fusion Framework”. Instead of treating sensor inputs as equally reliable, this architecture integrates a real-time Severity Score, derived from an independent Anomaly Detection module, to explicitly govern the fusion process. Through a novel dynamic gating mechanism, the system learns to suppress feature maps from compromised sensors before they affect the shared representation. This approach ensures Graceful Degradation: the system maintains operational capability by relying on the remaining healthy modalities, regardless of the specific task (Perception or Localization) being performed.
Goals
Architecture:
- Fusion Core: The student will adopt a state-of-the-art sensor fusion backbone (e.g., based on Bird’s Eye View or token-based representations).
- Severity Branch: Integration of a parallel control branch that projects the diagnostic Severity Score or Severity Map into attention weights. These weights will modulate the primary sensor features via techniques such as FiLM (Feature-wise Linear Modulation) or Gated Attention prior to the fusion stage.
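As an illustration of the severity branch described above, the following PyTorch sketch shows one possible FiLM-style gate: a scalar severity score is projected to per-channel scale and shift parameters that modulate a sensor's feature map before fusion. The module and variable names are assumptions for illustration, not part of the thesis specification.

```python
import torch
import torch.nn as nn

class SeverityFiLM(nn.Module):
    """Illustrative FiLM-style gate (hypothetical design): a per-sensor
    severity score in [0, 1] is projected to per-channel scale (gamma)
    and shift (beta) parameters that modulate that sensor's feature map
    prior to the fusion stage."""

    def __init__(self, feat_channels: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * feat_channels),
        )

    def forward(self, feats: torch.Tensor, severity: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) features from one modality
        # severity: (B, 1) diagnostic score, 0 = nominal, 1 = fully corrupted
        gamma, beta = self.mlp(severity).chunk(2, dim=-1)  # (B, C) each
        gamma = torch.sigmoid(gamma)[..., None, None]      # gate bounded in (0, 1)
        beta = beta[..., None, None]
        return gamma * feats + beta
```

A Gated Attention variant would replace the affine transform with attention weights over modality tokens, but the control flow (severity in, modulated features out) is the same.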
Training Strategy: Implementation of a “Fault Injection” protocol. The model will be trained on nominal data augmented with synthetic corruptions (simulating both weather and hardware faults). The training objective will force the network to minimize reliance on sensors tagged with high severity scores.
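A minimal sketch of what a fault-injection augmentation could look like is given below. The corruption modes and severity semantics here are assumptions chosen for illustration; the actual protocol and fault library are to be designed during the project.

```python
import torch

def inject_fault(feats: torch.Tensor, severity: float, mode: str = "dropout") -> torch.Tensor:
    """Hypothetical synthetic corruption for fault-injection training.
    severity in [0, 1]: 0 leaves the input untouched, 1 fully corrupts it."""
    if mode == "dropout":
        # sensor occlusion / dead regions: zero out a random fraction of values
        mask = (torch.rand_like(feats) > severity).float()
        return feats * mask
    if mode == "noise":
        # weather-like degradation: additive Gaussian noise scaled by severity
        return feats + severity * torch.randn_like(feats)
    if mode == "blackout":
        # total hardware failure: the stream is replaced with zeros
        return torch.zeros_like(feats) if severity > 0.5 else feats
    raise ValueError(f"unknown fault mode: {mode}")
```

During training, a (mode, severity) pair could be sampled per sensor per batch, with the sampled severity also serving as the supervision signal for the gating branch.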
Datasets: Utilization of public multimodal datasets (e.g., nuScenes, VoD) enriched with a custom library of synthetic sensor faults to simulate edge cases rarely found in standard training sets.
Framework: The implementation will leverage standard Deep Learning frameworks (PyTorch) and high-performance computing resources (NVIDIA GPU) to ensure real-time inference capabilities.
Requirements
- Knowledge of Python;
- Aptitude for machine learning/deep learning and software development (can be learned during the project).
Contact
Davide Possenti: davide.possenti@polimi.it
Stefano Arrigoni: stefano.arrigoni@polimi.it
For inquiries and further information, please email the first contact, putting the other in CC.
