Deep Learning for Efficient Onboard SAR Processing


📆 Project Period
October 2025 - January 2026
👤 CIN Visiting Researcher
Gabriele Daga

Project Summary

  • Context: Addressed the critical data bottleneck in spaceborne SAR missions, where the volume of high-resolution, wide-swath data far exceeds downlink capacity. The traditional "collect-downlink-process" pipeline is becoming unsustainable for modern constellations.
  • Methodology: Reformulated the SAR azimuth focusing task, traditionally a block-based 2D operation, into a one-dimensional sequence-to-sequence problem processed along the slow-time (azimuth) axis. This approach used State-Space Models (SSMs), specifically a stack of S4D (diagonal state-space) layers, to model the long-range dependencies of the synthetic aperture. This design enables line-by-line streaming processing, effectively eliminating the "corner-turning" memory bottleneck that prevents standard algorithms such as Range-Doppler from running on memory-constrained onboard hardware.
  • Architecture: Developed a "Teacher-Student" progressive knowledge distillation framework. A high-capacity, complex-valued Teacher model with a selective gating mechanism was designed to capture complex scattering behavior and preserve radiometric fidelity; it was then distilled into a more compact Student model designed for deployment.
  • Results: The distilled Student model achieved a 48x compression in parameter count compared to the Teacher. Crucially, it maintained competitive focusing quality, averaging 36.2 dB PSNR and an SSIM of 0.803 on the test set, successfully preserving the fine structures and point targets essential for downstream tasks.

Development Tools

  • Deep Learning Frameworks: PyTorch was the primary framework used for implementing and training the custom complex-valued neural networks and the State-Space Model (S4D) architectures.
  • Data Management & Streaming: Developed a robust custom data pipeline for Sentinel-1 Stripmap products. To support the unique requirement of streaming access during training (mimicking satellite acquisition), data was processed from Level-0 to Level-1 and stored as Zarr chunks hosted on the Hugging Face Hub. A custom data loader with an LRU cache was implemented to enable efficient, partial streaming of azimuth lines directly from cloud storage without downloading full multi-gigabyte scenes.
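The chunk-caching idea behind the streaming loader can be sketched with a small stdlib-only LRU cache. The chunk size, the fetch function, and the chunk contents below are hypothetical stand-ins for the actual Zarr/Hugging Face specifics, which are not detailed here.

```python
from collections import OrderedDict

class ChunkCache:
    """Minimal LRU cache over azimuth-line chunks (illustrative sketch;
    a real loader would fetch Zarr chunks from remote cloud storage)."""
    def __init__(self, fetch, max_chunks=4, chunk_len=128):
        self.fetch = fetch            # callable: chunk_index -> chunk data
        self.max_chunks = max_chunks
        self.chunk_len = chunk_len
        self.misses = 0
        self._cache = OrderedDict()

    def get_line(self, line_idx):
        chunk_idx = line_idx // self.chunk_len
        if chunk_idx in self._cache:
            self._cache.move_to_end(chunk_idx)   # mark as recently used
        else:
            self.misses += 1
            self._cache[chunk_idx] = self.fetch(chunk_idx)
            if len(self._cache) > self.max_chunks:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[chunk_idx][line_idx % self.chunk_len]

# Hypothetical fetch: chunk i holds "lines" [i*128, (i+1)*128).
cache = ChunkCache(lambda i: list(range(i * 128, (i + 1) * 128)))
first = cache.get_line(0)    # cold read: triggers one fetch
again = cache.get_line(1)    # same chunk: served from cache, no new fetch
```

Sequential access patterns like line-by-line azimuth streaming hit the same chunk repeatedly, so even a small cache keeps remote reads to roughly one per chunk.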

Development Outputs

  • Code Repository: A complete codebase implementing the sequence-to-sequence SAR focusing architecture. This includes the definitions of the Teacher and Student models, the progressive knowledge distillation training pipeline, and the custom streaming data loader. A scientific publication is currently in preparation, and the codebase is intended to be released as open source.
  • Dataset: A curated, streaming-optimized dataset of over 960 paired Sentinel-1 Raw and SLC products. The dataset covers a diverse range of global scenes including land, coast, and urban areas to ensure model robustness. It is formatted in Zarr to support efficient, chunked access.

Project Description

Motivation and Objectives

Synthetic Aperture Radar (SAR) missions are undergoing a paradigm shift, characterized by an explosion in data volume driven by higher spatial resolutions, wider acquisition swaths, and the launch of large satellite constellations. This exponential growth is exacerbating the well-known bottleneck of downlink capacity; the volume of raw data generated in orbit far exceeds what can be transmitted to ground stations.

Consequently, the traditional "collect-everything-and-downlink" pipeline is becoming operationally unsustainable. This project investigated the potential of Deep Learning to disrupt this paradigm by enabling onboard SAR focusing. By converting raw data into focused Single Look Complex (SLC) images directly on the satellite, the proposed approach paves the way for "Cognitive SAR" systems capable of extracting actionable insights (e.g., ship detection, flood mapping) in near real time and adapting the acquisition mode accordingly. The primary technical challenge addressed was the "corner-turning" problem inherent in classical algorithms like the Range-Doppler Algorithm (RDA). These methods require buffering the entire raw data matrix to perform transposes, demanding memory resources that are prohibitive for standard embedded space hardware. The objective of this work was to develop a model that can focus data in a streaming, line-by-line fashion, drastically reducing the memory footprint.

Conventional SAR focusing

In traditional SAR processing chains, image focusing is performed using frequency-domain algorithms such as the Range-Doppler Algorithm (RDA). These methods operate on a two-dimensional raw data matrix, with one dimension corresponding to range (fast time) and the other to azimuth (slow time). Range compression is naturally well suited to streaming: each received pulse can be independently matched-filtered to resolve targets in slant range.

Azimuth compression, however, is intrinsically more challenging. High azimuth resolution in SAR is achieved by coherently combining echoes collected over a long sequence of radar pulses as the satellite moves along its orbit, forming a synthetic aperture. For a single ground target, meaningful azimuth focusing is only possible after acquiring all pulses during the time the target remains within the antenna beam—typically requiring hundreds or thousands of consecutive range-compressed lines. Classical algorithms such as RDA therefore require access to the full azimuth history to perform Doppler-domain processing, leading to large memory buffers and explicit data reordering (corner-turning). While efficient on ground-based systems, these requirements make conventional azimuth compression poorly suited for real-time, onboard processing on memory-constrained spaceborne hardware.
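The contrast above can be made concrete on the streaming-friendly side: range compression is just a per-pulse matched filter against the transmitted chirp replica. The sketch below uses arbitrary chirp parameters (not Sentinel-1's) and a single synthetic point target.

```python
import numpy as np

# Illustrative linear FM chirp (arbitrary parameters, not Sentinel-1's).
n, k = 256, 0.001                       # samples per pulse, chirp rate
t = np.arange(n, dtype=float)
chirp = np.exp(1j * np.pi * k * t**2)   # transmitted pulse replica

# Synthetic echo: one point target delayed by 300 fast-time samples.
delay, line_len = 300, 1024
echo = np.zeros(line_len, dtype=complex)
echo[delay:delay + n] = chirp

# Matched filter = correlation with the replica, computed in the
# frequency domain; each pulse is processed independently, which is
# what makes range compression naturally streamable.
H = np.conj(np.fft.fft(chirp, line_len))
compressed = np.fft.ifft(np.fft.fft(echo) * H)

peak = int(np.argmax(np.abs(compressed)))  # energy collapses at the target delay
```

Because each range line is filtered on its own, this step needs no azimuth history at all; it is the azimuth compression described next that forces classical processors to buffer thousands of such lines.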

Methodology: Sequence-to-Sequence SAR Focusing with State-Space Models

The SAR azimuth focusing task was reformulated from a two-dimensional block-processing operation into a one-dimensional sequence-to-sequence learning problem. While range compression is naturally suited to line-by-line processing, azimuth focusing requires integrating a long history of radar pulses collected as the platform moves along its trajectory, forming the synthetic aperture.

To address this challenge, the use of State-Space Models (SSMs) was proposed: these models are particularly well suited to modeling the long-range dependencies of the azimuth signal—often spanning thousands of samples—while maintaining linear computational complexity. Unlike Transformer-based architectures, whose complexity scales quadratically with sequence length, SSMs enable efficient processing of long SAR sequences.

A key advantage of SSMs is their ability to operate in two complementary modes: a parallel formulation, used during training for stability and efficiency, and a sequential formulation, used during inference. In the latter case, the model processes incoming range-compressed lines one at a time while maintaining an internal hidden state that encodes information from past acquisitions. This allows azimuth-compressed lines to be produced incrementally in a streaming fashion, eliminating the need to buffer the full azimuth history.
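The equivalence of the two modes can be checked numerically on a toy diagonal (S4D-style) discrete SSM, x_k = A x_{k-1} + B u_k, y_k = C x_k: unrolling the recurrence line by line must match applying the equivalent convolution kernel K[j] = C A^j B to the whole sequence at once. All values below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 64, 8                    # sequence length, state dimension

# Toy diagonal discrete SSM: x_k = A*x_{k-1} + B*u_k, y_k = Re(C @ x_k)
A = 0.9 * np.exp(1j * rng.uniform(0, np.pi, N))   # stable diagonal state matrix
B = rng.standard_normal(N) + 0j
C = rng.standard_normal(N) + 0j
u = rng.standard_normal(L)                        # input sequence

# Sequential (inference) mode: one input line at a time, carrying a
# hidden state that summarizes all past acquisitions.
x = np.zeros(N, dtype=complex)
y_seq = np.empty(L)
for k in range(L):
    x = A * x + B * u[k]
    y_seq[k] = (C @ x).real

# Parallel (training) mode: the same map expressed as a convolution
# with the kernel K[j] = C A^j B, applied to the full sequence at once.
K = np.array([(C * A**j) @ B for j in range(L)])
y_par = np.convolve(u, K)[:L].real

max_err = np.max(np.abs(y_seq - y_par))  # the two modes agree numerically
```

This is exactly the property exploited in the project: train with the parallel form for efficiency, then deploy the recurrent form so that focused azimuth lines stream out as raw lines stream in.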

Figure 1: Equivalence of sequential and parallel mode in State Space Models
Figure 2: Usage of SSM in the focusing pipeline for azimuth compression.

To reconcile high focusing quality with the strict constraints of onboard hardware—such as limited memory, power, and computational resources—a knowledge distillation strategy was developed. A high-capacity SSM-based Teacher model was first trained to produce high-fidelity focused SAR images, capturing both radiometric accuracy and fine structural details.

This knowledge was then transferred to a more compact student model through progressive distillation. Despite its drastically reduced size and computational footprint, the Student model learned to closely approximate the focusing performance of the Teacher. This approach enabled a substantial reduction in model complexity while preserving the high-frequency content and point scatterer fidelity essential for downstream SAR analysis.
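A minimal sketch of such a distillation objective is shown below: the Student is pulled both toward the ground-truth SLC and toward the Teacher's output. The blend weight and the plain MSE terms are illustrative assumptions, not the project's exact recipe.

```python
import numpy as np

def distillation_loss(student_out, teacher_out, ground_truth, alpha=0.5):
    """Illustrative distillation objective: a blend of supervised error
    against the RDA ground truth and imitation error against the Teacher.
    alpha and plain MSE are assumptions, not the project's exact losses."""
    supervised = np.mean(np.abs(student_out - ground_truth) ** 2)
    imitation = np.mean(np.abs(student_out - teacher_out) ** 2)
    return alpha * supervised + (1 - alpha) * imitation

# Toy complex-valued "focused lines" (random stand-ins for SLC data).
rng = np.random.default_rng(1)
gt = rng.standard_normal(16) + 1j * rng.standard_normal(16)
teacher = gt + 0.01 * (rng.standard_normal(16) + 1j * rng.standard_normal(16))
student = gt + 0.10 * (rng.standard_normal(16) + 1j * rng.standard_normal(16))
loss = distillation_loss(student, teacher, gt)
```

In a progressive scheme, a loss of this shape would be applied in stages as the Student's capacity is reduced, rather than in a single Teacher-to-Student jump.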

The Dataset

A custom dataset was constructed from Level-0 Sentinel-1 Stripmap products to support this novel streaming approach. The dataset comprises 967 products processed from Raw to SLC using a standard RDA pipeline to generate ground truth labels. Data was organized to allow efficient, sequential access during training, faithfully emulating real-time satellite acquisition.

The dataset was curated to ensure a uniform geographical distribution across training, validation, and test sets. The dataset and the associated streaming data loader are made available through public repositories.

Figure 3: Geographical downsampling of the dataset across several zones, ensuring a uniform distribution.

Experimental Results

The collaboration yielded a highly efficient deep learning model capable of high-quality onboard focusing. The distillation process proved effective, compressing the model 48x in parameter count and 64x in state dimension.

On the held-out test set, the Student model achieved a PSNR of 36.2 dB and an SSIM of 0.803. This represents a minimal performance drop (~1.7 dB PSNR) compared to the larger Teacher model, confirming the efficacy of the distillation process.
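For reference, PSNR in its standard form is 10·log10(MAX²/MSE). The sketch below computes it on image magnitudes; treating the reference image's peak as MAX is a normalization assumption, not necessarily the convention used in the project.

```python
import numpy as np

def psnr(reference, prediction, max_val=None):
    """Peak signal-to-noise ratio in dB between two images
    (standard definition; the normalization choice is an assumption)."""
    reference = np.asarray(reference, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    if max_val is None:
        max_val = reference.max()           # assume reference peak as MAX
    mse = np.mean((reference - prediction) ** 2)
    return 10.0 * np.log10(max_val**2 / mse)

ref = np.ones((8, 8))
noisy = ref + 0.01        # uniform 0.01 error -> MSE = 1e-4
value = psnr(ref, noisy)  # 10*log10(1 / 1e-4) = 40.0 dB
```

A ~1.7 dB PSNR gap, as reported between Student and Teacher, corresponds to the Student's mean squared error being roughly 1.5x that of the Teacher.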

As shown in Figure 4, the Student model successfully reconstructs fine details. Point scatterers, roads, and urban structures are preserved with high fidelity, verifying that the model retains the high-frequency content essential for interpretation and downstream analytics.

Figure 4: Visual comparison of focusing results. From left to right: Range Compressed Input, Ground Truth (RDA), Student Prediction, and Teacher Prediction. The Student model preserves high-frequency content and point scatterers despite high compression.
Figure 7: Student vs Teacher comparison on a subset of the test set (land scenes).

Conclusion and Future Directions

This project successfully demonstrated that Deep Learning-based SAR focusing is a viable alternative to classical algorithms for onboard deployment. By reformulating the problem as a streaming sequence task and utilizing efficient State-Space Models, the memory-intensive corner-turning requirement was eliminated. The resulting Student model offers a practical path toward Cognitive SAR systems: satellites that can focus data in real time, analyze it onboard (e.g., for rapid disaster response or maritime surveillance), and downlink actionable information with minimal latency. Future work will focus on deploying the Student model onto space-grade FPGAs to validate power consumption in orbit, and on extending the methodology to support more complex acquisition modes such as TOPS. A scientific publication is currently in preparation, and the codebase is intended to be released as open source.

Copyright 2025 © European Space Agency. All rights reserved.