Super-resolution with UDF, wrong georeferencing of the output image

ek.kalinicheva · 17 May 2023 09:15

Hello.
I am currently working on super-resolution algorithm and its integration in Open EO.
I have written a UDF that passes a numpy array extracted from a datacube to an inference function based on a pretrained NN model. The input numpy array is always in shape (B, H, W).
The inference function returns a 2x super-resoluted numpy array that is then transformed to an xarray and georeferenced.
However, the final output image has the same pixel size as the input one. We can see that the data chunking has messed with the image output, as each chunk has still 10m pixel resolution instead of 5m. The output image has the same size in pixels as the input one, but the predicted_cube has the correct 2 x cubearray shape.

Here is the example of my code and the output image.

import openeo
connection = openeo.connect("openeo.vito.be").authenticate_oidc()
s2_cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 5.1, "east": 5.25, "south": 51.0, "north": 51.1},
    temporal_extent=["2020-07-05", "2020-07-30"],
    bands=["B02", "B03", "B04", "B08"],
    max_cloud_cover=30
)

udf = openeo.UDF.from_file('udf_fancy_resolution.py')

# UDF
result = s2_cube.max_time().apply(process=udf)
result.download("test_output.nc")

udf_fancy_resolution.py:

from openeo.udf import XarrayDataCube
from typing import Dict
import xarray
import numpy as np


def fancy_upsample_function(array: np.array, factor: int = 2) -> np.array:
    assert array.ndim == 3
    return array.repeat(factor, axis=-1).repeat(factor, axis=-2)


def apply_datacube(cube: XarrayDataCube, context: Dict) -> XarrayDataCube:
    # We get data from datacube
    cubearray: xarray.DataArray = cube.get_array().copy()

    # We make prediction and transform numpy array back to datacube

    # Pixel size of the original image
    init_pixel_size_x = cubearray.coords['x'][-1] - cubearray.coords['x'][-2]
    init_pixel_size_y = cubearray.coords['y'][-1] - cubearray.coords['y'][-2]

    # If we have a time series in the datacube, each image is processed separately
    if cubearray.data.ndim == 4 and cubearray.data.shape[0] != 1:
        cube_collection = []
        for c in range(len(cubearray.data)):
            predicted_array = fancy_upsample_function(cubearray.data[c], 2)
            cube_collection.append(predicted_array)
        cube_collection = np.stack(cube_collection, axis=0)
        coord_x = np.linspace(start=cube.get_array().coords['x'].min(), stop=cube.get_array().coords['x'].max() + init_pixel_size_x,
                              num=predicted_array.shape[-2], endpoint=False)
        coord_y = np.linspace(start=cube.get_array().coords['y'].min(), stop=cube.get_array().coords['y'].max() + init_pixel_size_y,
                              num=predicted_array.shape[-1], endpoint=False)
        predicted_cube = xarray.DataArray(cube_collection, dims=['t', 'bands', 'x', 'y'], coords=dict(x=coord_x, y=coord_y))
    # We have only one image
    else:
        if cubearray.data.ndim == 4 and cubearray.data.shape[0] == 1:
            cubearray = cubearray[0]
        predicted_array = fancy_upsample_function(cubearray.data, 2)
        coord_x = np.linspace(start=cube.get_array().coords['x'].min(), stop=cube.get_array().coords['x'].max() + init_pixel_size_x,
                              num=predicted_array.shape[-2], endpoint=False)
        coord_y = np.linspace(start=cube.get_array().coords['y'].min(), stop=cube.get_array().coords['y'].max() + init_pixel_size_y,
                              num=predicted_array.shape[-1], endpoint=False)
        predicted_cube = xarray.DataArray(predicted_array, dims=['bands', 'x', 'y'], coords=dict(x=coord_x, y=coord_y))

    return XarrayDataCube(predicted_cube)

Don’t know if there is a bug somewhere or should I change something in my code?
Thank you in advance.

stefaan.lippens · 17 May 2023 10:29

Using a UDF in apply does not really allow to change the resolution of your cube.

I haven’t tried it myself, but I think, conceptually, that you should first explicitly resample the cube to the desired resolution (with a simple method like near) and after that apply an UDF where you downsample first and then do the fancy superresolution

jeroen.dries · 25 May 2023 19:08

Hi,
just wanted to add here that we are doing some work with allowing a 20m to 10m ‘super-resolution’ case specifically for Sentinel-2. This is however also assuming that the datacube gets created at 10m resolution.

Your case is a bit different, is the suggestion made by Stefaan an option?
The key problem revolves around changing the labels of the spatial axis in the datacube. A new openEO process would probably be the best option but perhaps it’s better to go over your case a bit more in depth to be sure?

julien.michel · 27 November 2023 15:35

Hi all,

I made a lot of progress toward integrating the super-resolution code in openeo, and I am now coming back to this issue on mixed 10m - 20m resolution single datacube.

My model would require to have access to 10m as well as original 20m bands without resampling.

Somehow the following issue looks like it is providing exactly that, but I am unsure how and how to use it : support bands with 10m and 20m resolution in single datacube (Sentinel-2) · Issue #152 · Open-EO/openeo-geotrellis-extensions · GitHub

jeroen.dries · 1 December 2023 11:28

Hi Julien,
indeed, that’s the issue we are working on, but it’s not fully there yet, work got interrupted by some operational issues, but @jeroen.verstraelen will continue working on it, so we hopefully can get something soon.