EuroSciPy 2025

Python Framework for Large-Scale Radar Data Generation and Visualization
2025-08-20 , Small room

The application of machine learning in automotive radar systems presents severe challenges, particularly due to the limited availability of raw radar data tailored to specific radar configurations and annotated datasets. In this presentation, we introduce a novel Python-based framework designed to address these challenges by enabling large-scale radar data generation and visualization.

Our framework leverages existing radar detections from production systems, accumulating radar detections over multiple cycles to enhance resolution and minimize feature fluctuation. These accumulated features, referred to as pseudo scatter points, are treated as scatter centers to generate raw spectra for virtual radar systems with arbitrary antenna arrangements. This approach incorporates clutter in the simulation to achieve more representative results.

Key features of our framework include:

  • GPU Acceleration: Utilizes GPU acceleration to handle the computational demands of large-scale radar data generation efficiently.
  • Inbuilt Visualizer: Provides an inbuilt visualizer for radar data, facilitating real-time analysis and debugging.
  • Specialized Data class: Implements a specialized data class to streamline the process of radar data generation and processing.

The advancement of machine learning in automotive radar systems is significantly impeded by the limited availability of annotated raw radar data, specifically tailored to distinct radar sensor configurations. To overcome these limitations, we present a robust Python-based computational framework specifically designed to facilitate efficient large-scale radar data generation and comprehensive visualization.

Our framework leverages radar point cloud data, which is stored in the HDF5 format and collected directly from automotive radar sensors deployed in real-world environments. The point cloud data undergoes sophisticated preprocessing where radar detections are accumulated over multiple measurement cycles. This accumulation process is carefully corrected for both ego-vehicle velocity and the velocities of detected objects, ensuring spatial and temporal consistency. Through this preprocessing, our method reliably identifies dominant scatter centers within the radar's operational environment, creating high-quality datasets, termed "pseudo scatter points," that effectively represent persistent environmental features.

These pseudo scatter points serve as a foundation for generating synthetic raw radar data suitable for virtual radar systems with arbitrary antenna arrangements and configurations within the widely used 77 GHz automotive radar band. The framework implements advanced radar signal processing algorithms including multi-dimensional Fast Fourier Transforms (FFTs), digital beamforming, Constant False Alarm Rate (CFAR) detection algorithms in range-Doppler domains, specialized peak-finding algorithms in azimuth and elevation dimensions, and interpolation methods to enhance angular resolution. Additionally, synthetic clutter generation methods are integrated to further improve the realism of the generated radar datasets.

A key strength of our framework is its adaptability and flexibility, achieved through the utilization of a specialized and extensible Python data class structure. This structure simplifies the step-by-step management of radar data processing pipelines, facilitating customization, scalability, and straightforward integration of additional processing steps or alternative radar parameter configurations.

To address the computational demands of generating radar data at scale, the framework employs GPU acceleration via CuPy, a CUDA-enabled implementation of NumPy, enabling significant performance improvements. Currently, it achieves processing rates of approximately 10,000 radar frames per day, depending on radar resolution, thus offering practical scalability for both research and commercial development scenarios.

Visualization capabilities are seamlessly integrated, leveraging PyQt5 combined with OpenGL to deliver a powerful inbuilt visualization tool. This visualizer provides detailed, interactive representations of radar cubes in real time, significantly aiding debugging, data exploration, and deeper insight into radar signal characteristics and potential anomalies.

Practical applications of our framework include the generation of extensive radar datasets to support the training and validation of advanced deep neural networks for automotive radar perception tasks, as well as early-stage validation and benchmarking against reference radar sensor systems. Validation experiments, conducted through direct comparison with evaluation radar sensor data collected under identical conditions, have demonstrated the framework's capability to reliably reproduce realistic radar spectra, highlighting its value as a cost-effective alternative or complementary solution to expensive field-testing campaigns or high-fidelity ray tracing methods.

From a scientific research perspective, our framework directly addresses a critical bottleneck in deep learning-based radar applications—the scalability and availability of data needed to train sophisticated machine learning models. Its flexibility, ease of use, and high computational efficiency make it an attractive tool for researchers and developers aiming to push the boundaries of radar-based perception in automotive safety systems and advanced driver assistance systems (ADAS).

We welcome feedback and collaboration from the scientific community at EuroSciPy 2025 to further refine and extend the capabilities of our framework, exploring its broader applicability in scientific computing and automotive research domains.


Expected audience expertise: Domain:

some

Expected audience expertise: Python:

some

Your relationship with the presented work/project:

Original author or co-author

Engineer and PhD student at BMW Group in cooperation with FAU Erlangen-Nuremberg, specializing in radar signal processing and perception with a focus on deep learning. Currently developing a Python framework for large-scale radar data generation.