Creating a UDF from a Python Module That Operates on Multiple Datacubes

mlla · 5 November 2024 12:13

Hello
I’m working on creating some OpenEO wrappers around a Python module, so that its function can be run on the OpenEO Platform backend with data loaded from the OpenEO Platform collections. Ultimately, I’d like to create a User-Defined Process so other users can run the function as well.

The Python module takes two file paths—one for a high-resolution image and one for a low-resolution image of the same area—and returns a high-resolution representation of the low-resolution image.

My Question:
What’s the best way to create a User-Defined Function (UDF) around this Python module?
From the OpenEO documentation, my understanding is that UDFs are typically applied to a single datacube and that the function operates at the pixel level. However, in my case, I need to load two datacubes (from different collections), save them to disk, and then pass their file paths to a backend job that runs the Python function.
Is there any way to implement this approach within OpenEO, or would it require a different strategy?

I’m quite new to OpenEO, so I might be missing something basic. Any guidance would be greatly appreciated!

jan.van.den.bosch · 13 November 2024 09:31

Hi Marie,

Is my understanding correct that you have an existing function that upsamples a single low-resolution image to the resolution of a high-resolution image and you want to apply this to two OpenEO data cubes?

mlla · 13 November 2024 10:10

Yes, that is correct.
Of course, I can download the two OpenEO data cubes as images and then run the function locally, but I would like to run the function on the OpenEO backend instead.

jan.van.den.bosch · 13 November 2024 10:19

In that case, OpenEO offers this out of the box with the resample_cube_spatial process, so there is no need for a UDF.

Let us know if you have additional questions or encounter any problems with this approach.

Cheers,
Jan

mlla · 13 November 2024 10:39

Okay, then maybe I explained it badly. It is not just resampling, but rather a specific algorithm for “sharpening”.
So the question is rather whether there is any way to define a UDF that takes two datacubes and returns one.

jeroen.dries · 14 November 2024 13:16

Hi,
Currently, no such process exists.
The workaround that others have used before is:
1. Bring the low-res datacube to the same grid/resolution as the high-res cube.
2. In the UDF, downsample the low-res bands again. At that point, you have the desired input.

It’s not ideal, but currently the only way.
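
The two steps above could be wired together with the openEO Python client roughly as follows. This is a minimal sketch, not a recipe confirmed in this thread. It assumes the two cubes are merged into one with merge_cubes so that the UDF sees both sets of bands, that the two collections have distinct band names, and that the UDF is run per tile with apply_neighborhood. The helper name build_sharpening_cube, the tile size, and the names in the commented usage are hypothetical.

```python
def build_sharpening_cube(high_res, low_res, udf, tile_px=256):
    """Combine two openEO data cubes and run a UDF on the result.

    `high_res` and `low_res` are expected to be openeo DataCube objects,
    `udf` an openeo.UDF object. `tile_px` is an assumed tile size.
    """
    # Step 1: bring the low-res cube onto the grid of the high-res cube.
    # "near" repeats pixels, so the original low-res values stay recoverable.
    low_on_high_grid = low_res.resample_cube_spatial(target=high_res, method="near")

    # Stack the bands of both cubes into one cube (assumes distinct band names).
    stacked = high_res.merge_cubes(low_on_high_grid)

    # Step 2 happens inside the UDF, which receives one tile of the stacked cube.
    return stacked.apply_neighborhood(
        process=udf,
        size=[
            {"dimension": "x", "value": tile_px, "unit": "px"},
            {"dimension": "y", "value": tile_px, "unit": "px"},
        ],
    )


# Hypothetical usage (placeholders, needs a real backend connection):
#   import openeo
#   connection = openeo.connect("<backend-url>").authenticate_oidc()
#   high = connection.load_collection("<HIGH_RES_COLLECTION>", ...)
#   low = connection.load_collection("<LOW_RES_COLLECTION>", ...)
#   udf = openeo.UDF.from_file("sharpen_udf.py")
#   result = build_sharpening_cube(high, low, udf)
```

The function only chains client methods, so no processing happens until the resulting cube is downloaded or submitted as a batch job.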
Specifically for Sentinel-2, there is also a special ‘resolution_merge’ process that does something like pansharpening, but I guess you are looking to implement your own algorithm.
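
For a custom algorithm, the second step of the workaround (getting the low-res bands back to their native resolution inside the UDF) could look like the following NumPy sketch. It assumes the UDF receives a (bands, y, x) array in which the high-res bands come first, that the low-res bands were upsampled by an integer factor, and that the tile is aligned with the low-res grid. The function name split_and_downsample and its parameters are hypothetical.

```python
import numpy as np


def split_and_downsample(stacked, n_high_bands, factor):
    """Split a stacked (bands, y, x) array and undo the upsampling.

    Returns the high-res bands unchanged and the low-res bands
    block-averaged back down by `factor` in both spatial directions.
    """
    high = stacked[:n_high_bands]
    low_upsampled = stacked[n_high_bands:]

    bands, height, width = low_upsampled.shape
    # Crop so that both spatial sizes are exact multiples of the factor.
    h = (height // factor) * factor
    w = (width // factor) * factor

    # Group pixels into factor x factor blocks and average each block.
    blocks = low_upsampled[:, :h, :w].reshape(
        bands, h // factor, factor, w // factor, factor
    )
    low = blocks.mean(axis=(2, 4))
    return high, low


# Small demonstration: a 2x3 low-res band upsampled by 4 is recovered.
low_original = np.arange(6, dtype=float).reshape(1, 2, 3)
low_up = np.repeat(np.repeat(low_original, 4, axis=1), 4, axis=2)
high_demo = np.ones((2, 8, 12))
high_out, low_out = split_and_downsample(
    np.concatenate([high_demo, low_up], axis=0), n_high_bands=2, factor=4
)
```

In an actual UDF, a function like this would be called from the apply_datacube entry point, and since the module in question expects file paths, the two arrays would presumably still have to be written to temporary files before calling it.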