Denser time series of LAI

I found something!

It’s the use of the filter_bands() process to select the layer that leads to the bug. This snippet works as expected when you select the layer directly:

import openeo

# connect to the openEO back-end
con = openeo.connect("https://openeo.vito.be").authenticate_oidc(provider_id="egi")

# Load data cube from TERRASCOPE
LAI = con.load_collection("TERRASCOPE_S2_LAI_V2",
                               spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                               temporal_extent=["2015-07-01", "2022-08-15"],
                               bands="LAI_10M")

# temporal aggregation
LAI_month = LAI.aggregate_temporal_period(period="month", reducer="mean")

# linear interpolation
LAI_month_interpolate = LAI_month.apply_dimension(process="array_interpolate_linear", dimension="t")
res = LAI_month_interpolate.save_result(format="netCDF")
job = res.create_job(title="LAI_maps_Vesdre_interpolatelight")
job.start_job()

But if you select the same layer, for the same time period, using filter_bands() on the data cube, you get the error (job id if needed: j-6bc83ac5189f4673834e1fedf39ca090):

import openeo

# connect to the openEO back-end
con = openeo.connect("https://openeo.vito.be").authenticate_oidc(provider_id="egi")

# Load data cube from TERRASCOPE
LAI = con.load_collection("TERRASCOPE_S2_LAI_V2",
                              spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                              temporal_extent=["2015-07-01", "2022-08-15"],
                              bands=["LAI_10M","SCENECLASSIFICATION_20M"])

LAI = LAI.filter_bands(["LAI_10M"])
# temporal aggregation
LAI_month = LAI.aggregate_temporal_period(period="month", reducer="mean")

# linear interpolation
LAI_month_interpolate = LAI_month.apply_dimension(process="array_interpolate_linear", dimension="t")
res = LAI_month_interpolate.save_result(format="netCDF")
job = res.create_job(title="LAI_maps_Vesdre_interpolatelight")
job.start_job()

I hope this helps

Hmm, that’s interesting.

Another difference is that you also include the “SCENECLASSIFICATION_20M” band in the second snippet.

Does it also fail if you exclude that band from the start? So something like this:

LAI = con.load_collection(...
                              bands=["LAI_10M"])

LAI = LAI.filter_bands(["LAI_10M"])

Just tested this snippet… Successfully!

LAI = con.load_collection("TERRASCOPE_S2_LAI_V2",
                              spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                              temporal_extent=["2015-07-01", "2022-08-15"],
                              bands=["LAI_10M"])

LAI = LAI.filter_bands(["LAI_10M"])

But this failed:

# Load data cube from TERRASCOPE
LAI = con.load_collection("TERRASCOPE_S2_LAI_V2",
                              spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                              temporal_extent=["2015-07-01", "2022-08-15"],
                              bands=["LAI_10M","SCENECLASSIFICATION_20M"])

LAI = LAI.filter_bands(["LAI_10M"])

Hi there,

Can I do something to help move things forward? Should I try to approach this task in another way? At first glance, generating interpolated and masked VI time series seems like a fairly typical openEO task.

Hi Adrien,

You’re right, we were a bit slow to look into this, as I was not available for support in the past weeks. Apologies for that.

In any case, your job is running into memory problems due to the long time series. The job with one band does work, probably simply because one band requires less memory.

The best solution for now is to increase memory, which is also described here:

Can you try that? I’m also looking a bit further at the job itself; sometimes there are other ways to reduce memory usage.
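
Concretely, that means passing job_options when you create the batch job. A minimal sketch based on your snippet above (the exact values are just a starting point and may need tuning):

# request more executor memory for the batch job
job_options = {
    "executor-memory": "6G",
    "executor-memoryOverhead": "2G",
    "executor-cores": "2"
}
res = LAI_month_interpolate.save_result(format="netCDF")
job = res.create_job(title="LAI_maps_Vesdre_interpolatelight", job_options=job_options)
job.start_job()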

best regards,
Jeroen

Hi Adrien,
just want to confirm that it works for me with these settings:


job_options = {
    "executor-memory": "6G",
    "executor-memoryOverhead": "2G",
    "executor-cores": "2"
}
LAI_month_interpolate.execute_batch("lai_adrien.nc", job_options=job_options)

You may also want to try filtering out clouds, which can also reduce memory usage.
For instance, this cloud filter process is quite aggressive, but reduces memory as well:
LAI = LAI.process("mask_scl_dilation", data=datacube, scl_band_name="SCENECLASSIFICATION_20M")

For these larger jobs (in space or time), it is however always possible that you need to increase memory a bit.

Jeroen,

No need to apologize! You guys are doing a great job with openEO :slight_smile:
My message was just in case there was something I could do on my own.

I’m not sure I follow you on this one; we shared a lot of snippets in this thread :slight_smile:

Could you please share the last one you tested?

This snippet failed (job id: j-d12273406d07409c97256289e29c85ad).

It also failed when I launched it on the dev server:

import openeo

# connect to the openEO back-end
con = openeo.connect("https://openeo.vito.be/").authenticate_oidc(provider_id="egi")

# Load data cube from TERRASCOPE_S2_LAI_V2 collection.
datacube = con.load_collection("TERRASCOPE_S2_LAI_V2",
                              spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                              temporal_extent=["2015-07-01", "2022-08-15"],
                              bands=["LAI_10M","SCENECLASSIFICATION_20M"])

LAI = datacube.filter_bands(["LAI_10M"])

LAI_masked = LAI.process("mask_scl_dilation", data=datacube, scl_band_name="SCENECLASSIFICATION_20M")

job_options = {
    "executor-memory": "6G",
    "executor-memoryOverhead": "2G",
    "executor-cores": "2"
}
 
# temporal aggregation
LAI_masked_month = LAI_masked.aggregate_temporal_period(period="month", reducer="mean")

# interpolation attempt
LAI_masked_month_interpolate = LAI_masked_month.apply_dimension(process="array_interpolate_linear", dimension="t")

# saving results
res_month = LAI_masked_month_interpolate.save_result(format="netCDF")
job_month = res_month.create_job(title="LAI_masked_month", job_options=job_options)
job_month.start_job()

Hi Adrien,

The version I’m running is slightly different, because mask_scl_dilation needs to happen before the filter_bands call. However, when I do that, I run into another error that I now have to investigate. (I attached the stack trace below for my own reference.)

connection = openeo.connect("openeo.cloud").authenticate_oidc()
LAI = connection.load_collection("TERRASCOPE_S2_LAI_V2",
                                 spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                                 temporal_extent=["2015-07-01", "2022-08-15"],
                                 bands=["LAI_10M", "SCENECLASSIFICATION_20M"])

LAI_masked = LAI.process("mask_scl_dilation", data=LAI, scl_band_name="SCENECLASSIFICATION_20M")

LAI_masked = LAI_masked.filter_bands(["LAI_10M"])
LAI_month = LAI_masked.aggregate_temporal_period(period="month", reducer="mean")

# linear interpolation
LAI_month_interpolate = LAI_month.apply_dimension(process="array_interpolate_linear", dimension="t")

job_options = {
    "executor-memory": "6G",
    "executor-memoryOverhead": "2G",
    "executor-cores": "2"
}
LAI_month_interpolate.execute_batch("lai_adrien.nc", job_options=job_options)
java.lang.AssertionError: assertion failed: Band 3 cell type does not match, uint8ud0 != uint8ud127
	at scala.Predef$.assert(Predef.scala:223)
	at geotrellis.raster.ArrayMultibandTile.<init>(ArrayMultibandTile.scala:100)
	at geotrellis.raster.ArrayMultibandTile$.apply(ArrayMultibandTile.scala:46)
	at geotrellis.raster.MultibandTile$.apply(MultibandTile.scala:37)
	at org.openeo.geotrellis.OpenEOProcesses$.org$openeo$geotrellis$OpenEOProcesses$$timeseriesForBand(OpenEOProcesses.scala:45)
	at org.openeo.geotrellis.OpenEOProcesses.$anonfun$applyTimeDimension$7(OpenEOProcesses.scala:130)

I found this last issue as well, but still need to implement a fix.

It’s logged in our issue tracker here, and scheduled for a fix over the next two weeks. Is that fast enough?

Of course it is fast enough :slight_smile: Thanks for the follow up

Is the fix already implemented on the dev server?

Hi,

I ran this snippet, which is directly based on yours, and it is still throwing errors.

Is it possible that the solution is not yet implemented? Or maybe I’m doing something wrong?

import openeo

connection = openeo.connect("https://openeo.vito.be").authenticate_oidc(provider_id="egi")
LAI = connection.load_collection("TERRASCOPE_S2_LAI_V2",
                              spatial_extent={"west": 5.60, "south": 50.42, "east": 6.3, "north": 50.7},
                              temporal_extent=["2015-07-01", "2022-08-15"],
                              bands=["LAI_10M", "SCENECLASSIFICATION_20M"])

LAI_masked = LAI.process("mask_scl_dilation", data=LAI, scl_band_name="SCENECLASSIFICATION_20M")

LAI_masked = LAI_masked.filter_bands(["LAI_10M"])
LAI_month = LAI_masked.aggregate_temporal_period(period="month", reducer="mean")

# linear interpolation
LAI_month_interpolate = LAI_month.apply_dimension(process="array_interpolate_linear", dimension="t")

job_options = {
    "executor-memory": "6G",
    "executor-memoryOverhead": "2G",
    "executor-cores": "2"
}

LAI_month_interpolate.execute_batch("lai_adrien.nc", job_options=job_options)

Apologies, I haven’t been able to get to a final solution yet. I was looking into it, but got interrupted by some critical issues. I will try to put it back on the agenda again!

Hi Adrien,
this now works on our development instance (openeo-dev.vito.be).
It took 1h20min, resulting in a netCDF file of about 1GB.
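
To try it yourself, the only change needed should be connecting to the dev instance instead of the regular back-end, something like:

import openeo

# connect to the development instance instead of openeo.vito.be
connection = openeo.connect("https://openeo-dev.vito.be").authenticate_oidc(provider_id="egi")
# ... then run the same workflow as in your last snippet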

Please try it out and let me know if you still see issues!

best regards,
Jeroen

It’s working just fine :partying_face: :partying_face:

Thank you