📆 Project Period | July - September, 2025 |
👤 CIN Visiting Researcher |
Project Summary
Destination Earth aims to produce highly accurate Digital Twins of the Earth. Hosting a powerful integrated ecosystem within the DestinE Platform, many of its services provide a seamless accessibility experience for end-users. This project aims to explore the full potential of the DestinE platform as an end-user, integrating the latest advances of state-of-the-art AI models for the Earth Observation field, exploring novelties in data types such as Point Clouds, and contributing to the scope and demands of the Destination Earth project.
Development Tools
- Destination Earth Platform: A gateway to advanced digital technologies and data, leveraging AI and high-performance computing.
- DeltaTwin: A DestinE platform service that provides a collaborative environment to create, run, and share multi-scale models.
- Earth Data Hub: A DestinE service that acts as a gateway to Earth data.
- PyTorch: An open-source machine learning library based on the Torch library, used for applications such as computer vision and deep learning.
- Open3D: A modern open-source library for 3D Data Processing.
Development Outputs
- Earth Observation Point Cloud in DeltaTwin service: Creating on-the-fly EO-Point Cloud and exploring DeltaTwin processing capacities. GitHub: EO-PointCloud
- Implementation of AI Atmospheric Correction in DeltaTwin service: An AI MVP study focused on atmospheric correction of Sentinel-2 Level-1C data. GitHub: AI-DeltaTwin
- End-to-End AI Pipeline: Designing a training pipeline to produce a baseline model for BigEarthNet. GitHub: End-to-End AI Pipeline
Project Description
Introduction
The FAIR principles (Findability, Accessibility, Interoperability, and Reusability) are essential guidelines for data management and for a seamless end-user experience across the many services offered by the Destination Earth platform, whose goal is to produce highly accurate Digital Twins of the Earth.
Because the Destination Earth initiative aims to launch the full Earth Digital Twin by 2030, end-users already play an important role at this stage, both contributing to and leveraging the services currently available on the platform.
This project aims to explore Destination Earth services, particularly the DeltaTwin service. DeltaTwin is a collaborative toolbox to build and share digital twin components. It provides an environment for building and running multi-scale, composable workflows, leveraging the numerous available data sources, sharing results, and facilitating interoperability with other users and DestinE services.
The full scope of the project is divided into three main points:
- Exploring DeltaTwin geospatial data types, particularly the PointCloud data type, by generating EO-PointCloud data on-the-fly.
- Implementation of AI Atmospheric Correction in DeltaTwin.
- End-to-end AI pipeline: increasing accessibility for end-users to perform deep-learning multi-label classification tasks.

Earth Observation Point Cloud
A point cloud is a collection of three-dimensional data points used to represent the surface of an object. This discrete set of Cartesian points in space can represent a 3D shape with high accuracy. Accurate point clouds are usually created using technologies like LiDAR or photogrammetry, and the number of point cloud datasets in the Earth Observation domain keeps growing. However, there are also techniques for converting other geospatial data types, such as raster data, into a point cloud representation that provides a three-dimensional view of satellite imagery. With the services offered by Destination Earth, integrating many data formats is becoming a seamless experience, and EO-PointCloud data can be processed on-the-fly within a DeltaTwin component.

One technique for generating EO point clouds combines two types of data: an RGB Sentinel-2 image and a Digital Elevation Model (DEM). The Sentinel-2 image is retrieved using the CDSE service, and the DEM is accessed through the Earth Data Hub, a DestinE service.
The principle is to use each pixel of the DEM raster as a Cartesian point in the point cloud space, with the Z-axis given by the DEM elevation. To colorize the point cloud, the Sentinel-2 pixel that intersects each DEM pixel is used to transfer its RGB color. Each point of the point cloud is thus a 4-element record, (X, Y, Z, (RGB)): three coordinates plus a color.
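The construction described above can be sketched with NumPy. This is only an illustration under stated assumptions, not the code of the published component: the function name `raster_to_point_cloud`, its parameters, and the north-up grid defined by a hypothetical `origin_x`, `origin_y`, and `pixel_size` are all introduced here for the example, and the DEM and RGB image are assumed to be already resampled onto the same grid.

```python
import numpy as np


def raster_to_point_cloud(dem, rgb, origin_x=0.0, origin_y=0.0, pixel_size=1.0):
    """Turn a DEM (H, W) and a co-registered RGB image (H, W, 3) into an (N, 6) array.

    Columns are X, Y, Z, R, G, B. Pixels whose elevation is NaN (no-data) are dropped.
    All parameter names are hypothetical and chosen for this sketch only.
    """
    dem = np.asarray(dem, dtype=np.float64)
    rgb = np.asarray(rgb, dtype=np.float64)
    if rgb.shape != dem.shape + (3,):
        raise ValueError("rgb must have shape (H, W, 3) matching the DEM grid")

    rows, cols = np.indices(dem.shape)
    # Pixel centres in map coordinates; Y decreases as the row index grows
    # (assumption: north-up raster with square pixels).
    x = origin_x + (cols + 0.5) * pixel_size
    y = origin_y - (rows + 0.5) * pixel_size

    valid = ~np.isnan(dem)
    # Z comes from the DEM elevation of each valid pixel.
    xyz = np.column_stack([x[valid], y[valid], dem[valid]])
    # The colour of the pixel that shares the same grid cell is attached to the point.
    colors = rgb[valid]
    return np.hstack([xyz, colors])


if __name__ == "__main__":
    dem = np.array([[10.0, 20.0], [np.nan, 40.0]])
    rgb = np.zeros((2, 2, 3))
    cloud = raster_to_point_cloud(dem, rgb, origin_x=100.0, origin_y=200.0, pixel_size=10.0)
    print(cloud.shape)  # one row per valid DEM pixel, six columns
```

If Open3D is used for visualization, the first three columns can be wrapped with `open3d.utility.Vector3dVector` and assigned to the `points` attribute of an `open3d.geometry.PointCloud`, and the last three to `colors`, provided the colors are scaled to the range 0 to 1.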

To create a DeltaTwin component, three main files are mandatory: Manifest, Workflow, and Model. DeltaTwin uses these to create the component in its cloud-optimized environment and allows users to run it and schedule runs, leveraging modeling activities of the Digital Twin. A full step-by-step guide for building the component can be found on GitHub. Once the component is published, end-users can focus on running and analyzing, skipping the often time-consuming and challenging steps of software development.

An important note concerns DeltaTwin hardware constraints. By default, a DeltaTwin component runs with 2GB of RAM and 500m of CPU (500 millicores, i.e. half a CPU core); however, both parameters can be increased following Kubernetes resource unit guidelines. Because point cloud data can be computationally expensive, adding "sampled_fraction" as an input parameter is a way to limit RAM consumption and allow DeltaTwin to handle it.
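To make the memory argument concrete: a full Sentinel-2 tile at 10 m resolution has 10980 x 10980 pixels, so one point per pixel stored as six 64-bit floats would need roughly 5.8 GB, well above the 2GB default. The sketch below shows one plausible way a "sampled_fraction" input could be applied, by keeping a uniform random subset of the points. The function name `subsample_points` and the `seed` parameter are assumptions made for this example; the published component may sample differently.

```python
import numpy as np


def subsample_points(cloud, sampled_fraction, seed=0):
    """Keep a uniform random fraction of the rows of an (N, D) point array.

    `sampled_fraction` must be in (0, 1]; `seed` makes the selection repeatable.
    Hypothetical helper, written for illustration only.
    """
    if not 0.0 < sampled_fraction <= 1.0:
        raise ValueError("sampled_fraction must be in the interval (0, 1]")

    cloud = np.asarray(cloud)
    n_points = len(cloud)
    if n_points == 0:
        return cloud

    n_keep = max(1, int(round(n_points * sampled_fraction)))
    rng = np.random.default_rng(seed)
    # Sample indices without replacement, then sort to preserve the raster order.
    idx = np.sort(rng.choice(n_points, size=n_keep, replace=False))
    return cloud[idx]


if __name__ == "__main__":
    cloud = np.arange(600, dtype=float).reshape(100, 6)
    reduced = subsample_points(cloud, sampled_fraction=0.25)
    print(reduced.shape, reduced.nbytes, "bytes")
```

Memory use scales linearly with the fraction kept, so a value of 0.25 cuts the footprint of the point array to about a quarter.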
Implementation of AI Atmospheric Correction in DeltaTwin
As DeltaTwin serves as a collaborative toolbox, its capabilities for hosting AI components were investigated within the scope of this project. The main goal was to create a DeltaTwin component able to run the full pipeline: fetching data, preprocessing, inference, and post-processing. Atmospheric correction of Sentinel-2 L1C images is currently performed by the sen2cor toolbox; however, researchers have been investigating the use of AI for atmospheric correction because of its faster processing time. DeltaTwin can handle the complexity and scalability of its components and allows users to embed components within other components. This mechanism is called Dependencies (an already created component is used inside another component); it makes DeltaTwin highly scalable and demonstrates its strengths in enhancing the end-user experience.

Another important aspect and technical constraint of DeltaTwin is component creation. The manifest file allows adding geospatial libraries through two main parameters: pip and apt. A DeltaTwin component is essentially a Docker container, which means that all libraries must be specified within these two parameters when it is built. This poses some difficulties when users want to install packages from conda-forge or to use, for instance, a PyTorch Docker image to reduce container size.
End-to-End AI Pipeline: Designing a training pipeline to produce a baseline model for BigEarthNet


The main goal of this part of the project was to develop an end-to-end pipeline for a multi-label classification task, delivering a flexible and scalable ML pipeline while establishing robust data analysis to handle imbalanced datasets, which are common in the Earth Observation domain. The work focused on evaluating and optimizing model architectures by systematically assessing different neural network designs and delivering clean, maintainable code that aligns with DestinE's internal processes. During development, model architectures such as ResNet50, ResNet26, and ViT were tested to achieve the best F1-score for multi-label classification. In addition, data augmentation strategies were used to improve performance, and fine-tuning was extensively tested throughout the pipeline. The code is open source and freely available on GitHub, providing a robust end-to-end pipeline for all users.
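As an illustration of the two ideas mentioned above, class imbalance and F1-based evaluation for multi-label data, here is a minimal NumPy sketch. The text does not state which imbalance strategy or which F1 averaging the pipeline uses, so both functions are assumptions: `positive_class_weights` computes the common negatives-to-positives ratio per class, and `macro_f1` averages the per-class F1-score with equal weight for every class.

```python
import numpy as np


def positive_class_weights(labels):
    """Per-class weight = number of negatives / number of positives.

    `labels` is an (N, C) multi-hot array. Rare classes get larger weights.
    Classes with no positive sample are divided by 1 to avoid division by zero.
    """
    labels = np.asarray(labels, dtype=np.float64)
    pos = labels.sum(axis=0)
    neg = labels.shape[0] - pos
    return neg / np.maximum(pos, 1.0)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 for (N, C) multi-hot arrays.

    A class with no true and no predicted positives contributes an F1 of 0.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = (y_true & y_pred).sum(axis=0)
    fp = (~y_true & y_pred).sum(axis=0)
    fn = (y_true & ~y_pred).sum(axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())


if __name__ == "__main__":
    y_true = np.array([[1, 0, 1], [1, 0, 0], [1, 1, 0], [1, 0, 0]])
    y_pred = np.array([[1, 0, 1], [1, 0, 0], [1, 0, 0], [0, 0, 0]])
    print(positive_class_weights(y_true))
    print(macro_f1(y_true, y_pred))
```

In a PyTorch training loop, weights of this kind could be converted to a tensor and passed as the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`, which scales the loss of positive examples per class.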
