Project Summary
- My project revolves around the generation of terrains using remote sensing data (optical RGB + DEM) and generative machine learning. More precisely, we aim at training a diffusion model on this data.
- The publication of a global expansion to Major TOM with GLO-30 DEM data (Copernicus DEM, https://spacedata.copernicus.eu/collections/copernicus-digital-elevation-model).
- Some post-processing algorithms to get a better photographic look out of the Sentinel-2 optical images (histogram matching between L1C and L2A).
Development Tools
- I’ve worked intensively on an EO-HPC virtual machine, which gave me direct access to ESA data (especially the Copernicus DEM) and made processing easier when creating the dataset.
- I’ve worked on my Adobe laptop, using the Remote Desktop app to access my ESA emails.
- I’ve used a lot of the ESA documentation freely available online, such as https://spacedata.copernicus.eu/collections/copernicus-digital-elevation-model
Development Outputs
- The DEM expansion to Major TOM mentioned above is freely available on Hugging Face (https://huggingface.co/datasets/Major-TOM/Core-DEM), like the rest of the Major TOM datasets; see the loading sketch after this list.
- All my code remains on the EO-HPC VM and in a private GitHub repository.
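For readers who want to poke at the data, below is a minimal sketch of how the Core-DEM shards could be read with the Hugging Face `datasets` library. The split name and the streaming access pattern are assumptions; the exact column layout depends on the published parquet schema.

```python
# Hedged sketch: stream the Core-DEM dataset from Hugging Face without
# downloading the full archive. Split name and columns are assumptions.
from datasets import load_dataset

ds = load_dataset("Major-TOM/Core-DEM", split="train", streaming=True)

for sample in ds.take(1):
    # Each record should pair a Major TOM grid cell with its DEM payload
    # and metadata; inspect the keys to see the actual schema.
    print(sample.keys())
```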
Internship Subject
The internship project focuses on terrain generation using satellite imagery and diffusion models, with two main challenges: creating a training dataset and training a diffusion model.
Terrain generation is crucial in video games and VFX, especially for large-scale landscapes, where current methods such as procedural and simulation-based techniques are time-consuming and lack realism at scale. Example-based approaches, which leverage real-world data, have shown promise, particularly with advances in generative models such as GANs and diffusion models. The project aims to address the challenge of large-scale generation with diverse outputs while maintaining user control through conditioning. Diffusion models, and especially Latent Diffusion Models, are known to be able to tackle these three challenges.
To train this model, we need a terrain dataset. It combines 2D RGB satellite images and Digital Elevation Models (DEMs) to represent terrain surfaces. The dataset will also include conditioning information such as GPS data, text, or images to facilitate the training of the diffusion model.
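To make that structure concrete, here is a minimal sketch of what a single training sample could look like; the field names and types are illustrative assumptions, not the final schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TerrainSample:
    """One training example pairing optical imagery with elevation."""
    rgb: np.ndarray       # (H, W, 3) Sentinel-2 RGB tile
    dem: np.ndarray       # (H, W) Copernicus DEM heights in metres
    lat: float            # grid-cell centre latitude (GPS conditioning)
    lon: float            # grid-cell centre longitude (GPS conditioning)
    caption: str = ""     # optional text conditioning
```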
The collaboration between ESA and Adobe lasted for the whole length of my internship, but the focus of my stay at ESA’s Φ-lab was the creation of the RGB + DEM dataset.
An RGB + DEM dataset
What source?
For global RGB data, Sentinel-2 (S2), an ESA optical satellite mission, is the ideal choice due to its open access, frequent revisit time, global coverage, and extensive community usage. For Digital Elevation Models (DEMs), several 30 m resolution options exist. SRTM (Shuttle Radar Topography Mission) is the standard but is outdated, while the ASTER and ALOS DEMs, derived from stereo-photogrammetry, suffer from poor quality and artifacts. The Copernicus DEM, an InSAR product distributed by ESA, stands out as superior in quality and resolution, and since it is hosted on ESA servers, it is the most favorable choice for this project.
The Method
There are three reasons for reprocessing the raw Copernicus data:
- Raw Sentinel-2 and Copernicus DEM tiles are too big for machine learning use (100 km × 100 km = 10,000 × 10,000 pixel images for S2).
- We want a uniform distribution over the Earth’s surface and need to sample it accordingly.
- The Copernicus DEM and Sentinel-2 lie in two different projections: EPSG:4326 for the Copernicus DEM and UTM zones for Sentinel-2.
Our approach involved creating 10 × 10 km tiles from the Copernicus DEM to match those of Major TOM-Core, an already existing global dataset of Sentinel-1 and Sentinel-2 images developed by ESA’s Φ-lab. Major TOM mainly consists of a geographical indexing system based on a set of grid points and a metadata structure that allows multiple datasets with different sources to be merged. This grid approach fosters the uniformity of the dataset and lets us cut the raw data into smaller tiles (from 100 km × 100 km to 10 km × 10 km for Sentinel-2). Furthermore, sampling our DEM data on the same grid as Major TOM-Core gives us matching RGB and DEM data.
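As a rough illustration of how such a grid achieves uniformity, here is a simplified sketch in the spirit of the Major TOM indexing scheme: rows of points spaced about 10 km apart in latitude, with the number of points per row shrinking with cos(latitude) so that neighbours stay roughly 10 km apart along each parallel. The constants are approximations, not the exact Major TOM specification.

```python
# Simplified equal-ground-distance grid, in the spirit of Major TOM.
# Constants are approximations for illustration only.
import math

EARTH_CIRCUMFERENCE_KM = 40_075.0
CELL_KM = 10.0

def grid_points():
    """Yield (lat, lon) points roughly CELL_KM apart over the globe."""
    n_rows = int(EARTH_CIRCUMFERENCE_KM / 2 / CELL_KM)  # pole to pole
    for r in range(n_rows + 1):
        lat = -90.0 + 180.0 * r / n_rows
        # The parallel at this latitude shrinks with cos(lat), so fewer
        # columns are needed to keep ~10 km spacing between points.
        parallel_km = EARTH_CIRCUMFERENCE_KM * math.cos(math.radians(lat))
        n_cols = max(1, int(parallel_km / CELL_KM))
        for c in range(n_cols):
            yield lat, -180.0 + 360.0 * c / n_cols
```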
The resulting dataset
In this first version, all available DEM data was included except for the Major TOM cells below the 89th parallel and those within two degrees west of the date line. Azerbaijan and Armenia weren’t included either, as they are unavailable on the CREODIAS platform used to create this dataset.
In addition to the raw data, hillshade thumbnails and a compressed version of the data were created.
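For the hillshade thumbnails, the standard shaded-relief formula can be computed directly from the elevation gradients. The sketch below uses typical default illumination angles (315° azimuth, 45° altitude), which are an assumption rather than the exact parameters used for the dataset.

```python
# Hedged sketch of a classic hillshade from a DEM array; the illumination
# angles are common defaults, not necessarily the dataset's parameters.
import numpy as np

def hillshade(dem, cellsize=30.0, azimuth=315.0, altitude=45.0):
    az, alt = np.radians(azimuth), np.radians(altitude)
    gy, gx = np.gradient(dem, cellsize)       # elevation gradients (m/m)
    slope = np.arctan(np.hypot(gx, gy))       # terrain slope angle
    aspect = np.arctan2(-gx, gy)              # downslope direction
    shaded = (np.sin(alt) * np.cos(slope)
              + np.cos(alt) * np.sin(slope) * np.cos(az - aspect))
    return (255 * np.clip(shaded, 0.0, 1.0)).astype(np.uint8)
```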
Reprojection
Contrary to the S1 RTC and S2 (L1C & L2A) products, which are kept in their native projections to create their respective Major TOM Core datasets, the Copernicus DEM, natively in EPSG:4326, was reprojected to a carefully chosen projection. To guarantee uniformity across Major TOM sources, it was reprojected to the corresponding UTM zone of each cell. This leads to inconsistency between Sentinel-2 and COP-DEM cells in some cases. For the S2-L2A product, this is estimated to affect 2.5% of all the cells where both COP-DEM and S2-L2A are available (41,998 out of 1,679,898 cells).
Large DEM tiles were reprojected and resampled to 30 m using bilinear interpolation. Small Major TOM cells were then cropped from them, using nearest-neighbor interpolation if needed. Some tiles above water and around Armenia and Azerbaijan may exhibit missing pixels, whose values were set to -32767.
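A minimal sketch of that warping step with `rasterio` is given below; the file names and the example UTM zone are placeholders, and the real pipeline picks the UTM zone per Major TOM cell.

```python
# Hedged sketch: warp an EPSG:4326 Copernicus DEM tile to a UTM zone at
# 30 m with bilinear resampling; -32767 marks missing pixels.
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

NODATA = -32767

with rasterio.open("copdem_tile_epsg4326.tif") as src:  # placeholder path
    dst_crs = "EPSG:32633"  # example UTM zone; chosen per cell in practice
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds, resolution=30
    )
    profile = src.profile.copy()
    profile.update(crs=dst_crs, transform=transform,
                   width=width, height=height, nodata=NODATA)
    with rasterio.open("copdem_tile_utm.tif", "w", **profile) as dst:
        reproject(
            source=rasterio.band(src, 1),
            destination=rasterio.band(dst, 1),
            src_transform=src.transform, src_crs=src.crs,
            dst_transform=transform, dst_crs=dst_crs,
            resampling=Resampling.bilinear,
            dst_nodata=NODATA,
        )
```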
Postprocessing of the RGB data
In addition to the DEM expansion of Major TOM, I worked on enhancing the Sentinel-2 images, trading radiometric correctness for a better photographic look. The RGB images used were L2A (bottom-of-atmosphere) images; contrary to L1C (top-of-atmosphere) images, these have been corrected for the characteristic haze caused by aerosols and water vapor in the atmosphere. However, this correction comes with a shadow correction on rugged terrain that is imprecise and introduces a lot of artifacts into the L2A product. To get rid of both the haze and the L2A artifacts, we perform a histogram matching of the L1C pixel values to the L2A ones. This simple algorithm corrects most of the images containing artifacts and has the benefit of being quite lightweight, hence usable at scale, which is necessary for our approach.
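A minimal sketch of this step is shown below, using `skimage`'s histogram matching as a stand-in for the exact implementation; band order and data types are assumptions.

```python
# Hedged sketch: remap L1C pixel values so each band's histogram matches
# the haze-corrected L2A band, keeping L1C's artifact-free content.
import numpy as np
from skimage.exposure import match_histograms

def match_l1c_to_l2a(l1c_rgb: np.ndarray, l2a_rgb: np.ndarray) -> np.ndarray:
    """Both inputs are (H, W, 3) arrays of the same Major TOM cell."""
    return match_histograms(l1c_rgb, l2a_rgb, channel_axis=-1)
```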
Conclusion
During my stay at ESA’s Φ-lab, I managed to create a global DEM expansion to Major TOM, and I worked on enhancing the Sentinel-2 images for a more photographically pleasing rendering. Since then, I’ve worked on captioning the dataset using different metadata sources and Vision Language Models, and I’ve trained a few diffusion models on the resulting dataset.
Just below you can see an output of the latest model I’ve trained:
The resulting dataset is available on 🤗 Hugging Face, and the images can be browsed here:
https://huggingface.co/spaces/Major-TOM/MajorTOM-Core-Viewer
Eye Candy
Using the matching DEM and RGB images and Substance Designer, we created renders of some iconic places on Earth, demonstrating the diversity of terrain in our dataset.