Dual-Camera LiDAR Fusion for Occlusion-Robust 3D Detection in Urban Driving Simulation

Xingnan Zhou and Ciprian Alecsandru
Concordia University, Montreal, QC, Canada
In Preparation, 2026

Abstract

Three-dimensional object detection from LiDAR point clouds is a cornerstone of autonomous driving perception, yet single-sensor systems remain vulnerable to occlusion in complex urban environments. This paper proposes a symmetric dual-camera LiDAR fusion framework that combines PointPillar and CenterPoint 3D LiDAR detectors with YOLOv8-based 2D detections from two complementary camera viewpoints: a drone (top-down, 40m altitude) and a subject-vehicle forward camera. The fusion operates at the decision level (late fusion), where camera-confirmed LiDAR detections receive confidence boosts while unconfirmed low-confidence detections are suppressed. The full dual-camera fusion achieves +0.92 pp mAP@0.5 (+4.4% relative; sign test p = 0.001, t-test p < 0.0001), with all ten seeds showing positive improvement.
Key figures: +4.4% relative mAP@0.5 improvement · 10/10 seeds positive · 35,837 object annotations · 2,600 CARLA frames
System overview: symmetric dual-camera LiDAR fusion combining PointPillar and CenterPoint 3D detection with YOLOv8 drone and forward-camera 2D detections for occlusion-robust perception.

Method

Overview of the proposed dual-camera LiDAR fusion pipeline. The PointPillar and CenterPoint 3D detectors and two independent YOLOv8 2D detectors produce detections from their respective sensors. The symmetric late-fusion module refines confidence scores based on uniform boost and suppress rules across both cameras.
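One plausible way to implement the camera-confirmation step (the page does not spell out the matching details) is to project each LiDAR detection into the image plane and match it against YOLOv8 2D boxes by IoU. The sketch below assumes a pinhole camera model with known intrinsics K and a LiDAR-to-camera extrinsic transform; these names and the matching criterion are illustrative, not the paper's exact implementation.

```python
import numpy as np

def project_to_image(points_3d, K, T_cam_from_lidar):
    """Project LiDAR-frame 3D points into pixel coordinates (pinhole model).

    points_3d: (N, 3) array of LiDAR-frame points (e.g. box centers).
    K: 3x3 camera intrinsics; T_cam_from_lidar: 4x4 extrinsic transform.
    Returns (N, 2) pixel coordinates and (N,) depths in the camera frame.
    """
    pts_h = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    cam = (T_cam_from_lidar @ pts_h.T)[:3]   # camera-frame coordinates
    uv = (K @ cam) / cam[2]                  # perspective divide
    return uv[:2].T, cam[2]

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0
```

A LiDAR detection would count as "confirmed" by a camera when its projected box overlaps some YOLOv8 box of the same class above an IoU threshold; the same routine applies unchanged to the drone and forward cameras, which is what makes the fusion symmetric.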
Sensor configuration in CARLA Town10HD: ego vehicle with 64-channel LiDAR + forward camera, supplemented by a drone camera at 40m altitude.
Dual YOLOv8 detector architecture: independent models process drone (top-down) and forward camera (SDC) views.

Three-Stage Pipeline

(1) LiDAR 3D detection with PointPillar or CenterPoint; (2) independent YOLOv8 2D detection on the drone and forward-camera views; (3) symmetric decision-level fusion that boosts camera-confirmed detections and suppresses unconfirmed low-confidence ones.

Key Finding: Symmetric > Asymmetric

Counter-intuitively, applying both boost and suppress operations to both cameras outperforms asymmetric designs where the forward camera is restricted to boost-only. The forward camera’s value lies entirely in its ability to suppress false positives, not in boosting true positives.
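The symmetric boost/suppress rule can be sketched as follows. The hyperparameter values (boost factor, suppress factor, confidence threshold) and the shape of the detection records are illustrative assumptions; the paper's tuned values come from its sensitivity analysis.

```python
# Minimal sketch of the symmetric decision-level fusion rule.
# BOOST, SUPPRESS, and CONF_MIN are illustrative values, not the
# paper's tuned hyperparameters.
BOOST = 1.2       # multiplicative boost for camera-confirmed detections
SUPPRESS = 0.5    # multiplicative penalty for unconfirmed detections
CONF_MIN = 0.3    # only low-confidence detections are eligible for suppression

def fuse_scores(lidar_dets, drone_confirms, sdc_confirms):
    """Apply identical boost/suppress rules for both cameras (symmetric fusion).

    lidar_dets: list of dicts with 'id' and 'score' fields.
    drone_confirms / sdc_confirms: sets of detection ids confirmed by the
    corresponding YOLOv8 2D detector (e.g. via projected-box overlap).
    """
    fused = []
    for det in lidar_dets:
        score = det["score"]
        for confirms in (drone_confirms, sdc_confirms):
            if det["id"] in confirms:
                score = min(1.0, score * BOOST)   # camera-confirmed: boost
            elif score < CONF_MIN:
                score *= SUPPRESS                 # unconfirmed + low conf: suppress
        fused.append({**det, "score": score})
    return fused

# Example: detection 0 is confirmed by both cameras, detection 1 by neither.
dets = [{"id": 0, "score": 0.8}, {"id": 1, "score": 0.2}]
fused = fuse_scores(dets, drone_confirms={0}, sdc_confirms={0})
# fused[0]["score"] -> 1.0 (double boost, capped); fused[1]["score"] -> 0.05
```

Note that in an asymmetric (SDC boost-only) variant, the second camera would skip the `elif` branch; the ablation tables below show that removing that suppression path is exactly what makes the boost-only configuration underperform.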

Results

Evaluated on a CARLA Town10HD dataset with 2,600 frames, 35,837 annotations (Car + Pedestrian), and ten-seed repeated random sub-sampling validation.
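The reported sign test p = 0.001 is consistent with the one-sided binomial probability that all ten seeds improve by chance alone, which can be checked with the standard library:

```python
from math import comb

def sign_test_p(n_positive, n_total):
    """One-sided binomial sign test: P(X >= n_positive) under p = 0.5."""
    return sum(comb(n_total, k)
               for k in range(n_positive, n_total + 1)) / 2 ** n_total

# 10/10 seeds improved over the LiDAR-only baseline:
p = sign_test_p(10, 10)   # 1/1024 ~ 0.00098, reported as p = 0.001
```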

PointPillar Results (10-seed average)

Configuration         Car AP   Ped AP   mAP@0.5         Δ (pp)   Significance
LiDAR-only            38.50    2.69     20.76 ± 0.38
+ SDC (boost-only)    38.50    2.69     20.75 ± 0.38    -0.01    n.s.
+ Drone               40.21    2.57     21.39 ± 0.35    +0.63    p = 0.001*
+ Symmetric           40.74    2.61     21.68 ± 0.31    +0.92    p = 0.001*

CenterPoint Results (10-seed average)

Configuration         Car AP   Ped AP   mAP@0.5         Δ (pp)   Significance
LiDAR-only            39.04    5.56     22.30 ± 0.28
+ SDC (boost-only)    38.94    5.55     22.25 ± 0.25    -0.05    n.s.
+ Drone               40.39    5.55     22.97 ± 0.32    +0.67    p = 0.001*
+ Symmetric           40.54    5.54     23.04 ± 0.31    +0.74    p = 0.001*
Ten-seed averaged mAP@0.5 with standard deviation error bars across PointPillar and CenterPoint detectors
Improvement over LiDAR-only baseline across all metrics and configurations. Bold borders indicate statistical significance.

Per-Seed Consistency

Per-seed mAP@0.5 consistency. All fusion configurations consistently outperform the LiDAR-only baseline across all ten random seeds (10/10 positive, sign test p = 0.001).

Camera Contribution Ablation (PointPillar)

Configuration                 mAP@0.5   Δ (pp)   Relative
LiDAR-only                    20.76
+ SDC camera (boost-only)     20.75     -0.01    n.s.
+ Drone camera (symmetric)    21.39     +0.63    +3.0%
+ Both cameras (symmetric)    21.68     +0.92    +4.4%

The drone camera drives the majority of improvement, while the forward camera’s value lies in its ability to suppress false positives rather than boost true positives. Symmetric fusion outperforms asymmetric across both PointPillar and CenterPoint detectors, with all gains statistically significant (sign test p = 0.001, t-test p < 0.0001).

Sensitivity analysis
Sensitivity analysis of fusion hyperparameters (boost factor, suppress factor, confidence threshold). The optimal symmetric configuration achieves +0.92 pp mAP@0.5 improvement.
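A sensitivity sweep like the one above can be organized as a simple grid search over the three fusion hyperparameters. The grid values below and the stand-in scoring function are purely illustrative; in the real pipeline `evaluate_map` would run the full mAP@0.5 evaluation on the validation split.

```python
from itertools import product

def evaluate_map(boost, suppress, thresh):
    # Stand-in for the real mAP@0.5 evaluation; this dummy score peaks at
    # (1.2, 0.5, 0.3) purely so the sweep is runnable end to end.
    return -((boost - 1.2) ** 2 + (suppress - 0.5) ** 2 + (thresh - 0.3) ** 2)

# Illustrative grid over boost factor, suppress factor, confidence threshold.
grid = product([1.1, 1.2, 1.5], [0.3, 0.5, 0.7], [0.2, 0.3, 0.4])
best = max(grid, key=lambda cfg: evaluate_map(*cfg))
# best is the (boost, suppress, threshold) triple with the highest score
```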

Qualitative Results

Representative scenarios showing LiDAR-only (left) vs dual-camera fused (right) detections. Fusion recovers occluded vehicles and suppresses false positives in cluttered scenes.
False positive suppression: the drone camera's overhead view confirms or denies LiDAR detections, reducing spurious boxes in occluded regions.

Multi-View Sample Pairs

Synchronized samples from all three sensors show how complementary viewpoints resolve ambiguities:

Drone camera view (40m altitude) — complete overhead coverage
Forward camera (SDC) view — frontal sector with depth perception
Side-by-side drone and SDC views for the same frame, showing complementary coverage areas.

Video Demonstrations

These videos show the fusion system operating in real time on CARLA Town10HD sequences.

Driving sequence: ego vehicle navigating through urban traffic with multi-sensor detections overlaid.
Drone camera detection: YOLOv8 detections from the overhead perspective at 40m altitude.
Fusion demonstration: combining LiDAR 3D detections with dual-camera 2D confirmations in real time.

Citation

@article{zhou2026dualcamera,
  title={Dual-Camera LiDAR Fusion for Occlusion-Robust 3D Detection in Urban Driving Simulation},
  author={Zhou, Xingnan and Alecsandru, Ciprian},
  year={2026},
  note={In Preparation}
}