I noticed a difference in the order of the dimensions and in the shape of the xarray DataArray of the datacube that goes into a UDF (as an XarrayDataCube) when using execute_local_udf versus applying the udf with reduce_dimension.
For completeness, minimal code for the datacube:
cube = connection.load_collection(
    spatial_extent={"west": 4.40, "south": 50.75, "east": 4.43, "north": 50.78},
    temporal_extent=[datetime.datetime(2018, 1, 1), datetime.datetime(2018, 12, 31)],
    bands=["B04", "B08", "SCL"],
)
cube_vi = compute_index(cube, "NDVI")
cube_vi_agg = cube_vi.aggregate_temporal_period(period="month", reducer="median")
- In the case of execute_local_udf:
# The first lines of the udf
def apply_datacube(cube: XarrayDataCube, context: dict) -> XarrayDataCube:
    xr_dataarray: xr.DataArray = cube.get_array()
The 1st print statement returns ('t', 'bands', 'x', 'y')
The 2nd print statement returns (12, 1, 339, 219)
- In the case of applying the udf with reduce_dimension via batch job:
# The first lines of the udf
def apply_datacube(cube: XarrayDataCube, context: dict) -> XarrayDataCube:
    xr_dataarray: xr.DataArray = cube.get_array()
The 1st inspect statement returns ('t', 'bands', 'y', 'x')
The 2nd inspect statement returns (12, 1, 256, 256)
- The ‘x’ and ‘y’ dimensions seem to be swapped
- The displayed number of pixels is different (339 and 219 should be the correct numbers for this bbox)
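To illustrate the two observations, here is a minimal sketch of one way a udf could be made independent of the incoming dimension order, by addressing dimensions by name instead of by position. It only uses plain xarray and numpy. The arrays `local_arr` and `backend_arr` and the helper `normalize_dims` are hypothetical stand-ins filled with dummy values, built to match the dims and shapes reported above; this is not taken from the openEO client or backend.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for the arrays seen in the two cases above (dummy values).
local_arr = xr.DataArray(
    np.zeros((12, 1, 339, 219)), dims=("t", "bands", "x", "y")
)
backend_arr = xr.DataArray(
    np.zeros((12, 1, 256, 256)), dims=("t", "bands", "y", "x")
)


def normalize_dims(arr: xr.DataArray) -> xr.DataArray:
    """Return the array in a fixed dimension order, whatever order it arrived in."""
    return arr.transpose("t", "bands", "y", "x")


local_norm = normalize_dims(local_arr)
backend_norm = normalize_dims(backend_arr)

# Reduce by dimension name rather than by axis index, so the result
# does not depend on where 'x' and 'y' sit in the incoming array.
local_mean = local_norm.mean(dim="t")
```

With this approach the axis order is the same in both cases, although the spatial sizes still differ (219 x 339 versus 256 x 256), so the udf should not hard-code the number of pixels either.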
Even though my udfs are working properly now, these differences confused me a lot during the development of the udf. So I’m still looking to understand this better:
- Is this behaviour expected? Or am I doing something wrong?
- If expected, is there a way to avoid these differences? It is my understanding that execute_local_udf can be used for the development/debugging of a udf. However, in my case, I needed to adapt the udf anyway when moving from local execution to execution on the openeo backend. Is that normal?
- Is there a specific reason for the shape returning 256, 256 instead of 339, 219?
Kind regards,