Filter multiple dates with filter_labels

I am trying to extract only specific dates from a datacube.
The filter_temporal process accepts only one time range, so it’s not suitable on its own. A workaround would be to chain multiple filter_temporal calls and combine the results with merge_cubes, but I guess that is not very efficient.
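
For reference, the chained workaround mentioned above could look roughly like the sketch below. It assumes `cube` is an openeo `DataCube` (only its `filter_temporal` and `merge_cubes` methods are used), that the end of a temporal extent is exclusive, and that merging cubes with disjoint time labels needs no overlap resolver. The helper names `one_day_extent` and `keep_dates` are hypothetical, not part of the client.

```python
from datetime import date, timedelta
from functools import reduce


def one_day_extent(day: str) -> list:
    """Return [day, next_day] for a 'YYYY-MM-DD...' string.

    The end of the extent is the following day, on the assumption
    that filter_temporal excludes the upper bound.
    """
    start = date.fromisoformat(day[:10])
    return [start.isoformat(), (start + timedelta(days=1)).isoformat()]


def keep_dates(cube, days):
    """Chain one filter_temporal per day and merge the resulting slices.

    `cube` is assumed to behave like an openeo DataCube.
    """
    if not days:
        raise ValueError("at least one date is required")
    slices = [cube.filter_temporal(extent=one_day_extent(day)) for day in days]
    # Merge the one-day slices pairwise into a single cube.
    return reduce(lambda merged, nxt: merged.merge_cubes(nxt), slices)


print(one_day_extent("2022-06-05T00:00:00Z"))  # ['2022-06-05', '2022-06-06']
```

Usage would be something like `cube_reduced = keep_dates(cube, dates_to_keep)`. Each date adds a filter and a merge node to the process graph, which is why this is likely less efficient than a single filter_labels call.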

So, from the Parcel Delineation notebook I found out that it’s possible to do that by combining filter_labels, date_shift and date_between.

Still, after several trials, I’m not able to get only the dates I want to keep. What am I doing wrong?

import openeo
import xarray as xr
from numpy import datetime_as_string
from openeo import processes as eop

bounding_box_32632_10x10 = dict(
    west=680000, east=680100, south=5151500, north=5151600, crs="EPSG:32632"
)

temporal_interval = ["2022-06-01", "2022-07-01"]

conn = openeo.connect("openeo.cloud").authenticate_oidc()

cube = conn.load_collection(
    "SENTINEL2_L2A",
    spatial_extent=bounding_box_32632_10x10,
    temporal_extent=temporal_interval,
    bands=["B04"],
)
cube.download("sample.nc")

ds = xr.open_dataset("sample.nc")
timesteps = [datetime_as_string(t, unit="s", timezone="UTC") for t in ds.t.values]
print(timesteps)

# Keep the second and fourth dates
dates_to_keep = [timesteps[1], timesteps[3]]

## Create a condition that checks if a date is one of the best timesteps: from https://github.com/Open-EO/openeo-community-examples/blob/815ab0cf4662a1b2be0881f55a9d4896467ed224/python/ParcelDelineation/Parcel%20delineation.ipynb
condition = lambda x : eop.any(
    [
        eop.date_between(
            x = x,
            min = timestep,
            max = eop.date_shift(date=timestep, value=1, unit='day')) 
        for timestep in dates_to_keep
    ]
)

## Filter the dates using the condition
cube_reduced = cube.filter_labels(
    condition = condition,
    dimension = "t"
)
cube_reduced.download("sample_dates5.nc")
print(xr.open_dataset("sample_dates5.nc").t.values)
Authenticated using refresh token.
['2022-06-02T00:00:00Z', '2022-06-05T00:00:00Z', '2022-06-07T00:00:00Z', '2022-06-10T00:00:00Z', '2022-06-12T00:00:00Z', '2022-06-15T00:00:00Z', '2022-06-17T00:00:00Z', '2022-06-20T00:00:00Z', '2022-06-22T00:00:00Z', '2022-06-25T00:00:00Z', '2022-06-27T00:00:00Z', '2022-06-30T00:00:00Z']
['2022-06-02T00:00:00.000000000' '2022-06-05T00:00:00.000000000'
 '2022-06-07T00:00:00.000000000' '2022-06-10T00:00:00.000000000'
 '2022-06-12T00:00:00.000000000' '2022-06-15T00:00:00.000000000'
 '2022-06-17T00:00:00.000000000' '2022-06-20T00:00:00.000000000'
 '2022-06-22T00:00:00.000000000' '2022-06-25T00:00:00.000000000'
 '2022-06-27T00:00:00.000000000' '2022-06-30T00:00:00.000000000']
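
A quick way to confirm that the filter had no effect is to compare the returned time labels with the requested ones. The sketch below uses only the standard library and compares the calendar-date part of each label; the helper name `labels_not_requested` is hypothetical, and the sample labels are a subset of the output above.

```python
def labels_not_requested(returned, requested):
    """Return the returned time labels whose calendar date was not requested.

    Labels are compared on their first 10 characters ('YYYY-MM-DD'), so
    'Z'-suffixed strings and numpy-style nanosecond strings both work.
    """
    wanted_days = {str(label)[:10] for label in requested}
    return [str(label) for label in returned if str(label)[:10] not in wanted_days]


requested = ["2022-06-05T00:00:00Z", "2022-06-10T00:00:00Z"]
returned = [
    "2022-06-02T00:00:00.000000000",
    "2022-06-05T00:00:00.000000000",
    "2022-06-10T00:00:00.000000000",
]
print(labels_not_requested(returned, requested))
# ['2022-06-02T00:00:00.000000000']
```

An empty list would mean that only requested dates came back; here the first label slipped through, as in the output above.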

Hi,

This is an unfortunate case where Terrascope has to fall back to Sentinel Hub to fetch SENTINEL2_L2A data. In such a case, we don’t support filter_labels yet.

This case will give a clearer error once the fix for this ticket is deployed: https://github.com/Open-EO/openeo-geopyspark-driver/issues/749

A workaround could be to switch to the CDSE backend for Sentinel-2: openeo.connect("https://openeo.dataspace.copernicus.eu").authenticate_oidc()

You can also use max_cloud_cover:

cube = conn.load_collection(
    "SENTINEL2_L2A",
    spatial_extent=bounding_box_32632_10x10,
    temporal_extent=temporal_interval,
    bands=["B04"],
    max_cloud_cover=80,  # Avoid Sentinel Hub fallback
)

@emile.sonneveld thanks for the feedback. We are not filtering out cloud-covered dates, so the last suggestion is not relevant in our case.

Additionally, we also can’t use CDSE, since @valentina.premier needs a specific collection which is not available there :sweat_smile:

Ah, maybe another workaround:

cube = conn.load_collection(
    "TERRASCOPE_S2_TOC_V2",  # Force use Terrascope
    spatial_extent=bounding_box_32632_10x10,
    temporal_extent=temporal_interval,
    bands=["B04"],
)

If that does not do the trick, we can check it next week. I understood that @valentina.premier will pass by the VITO offices.

Dear @emile.sonneveld, the previous example used to work on CDSE, but it no longer does. On openEO Platform it never worked, because of the Sentinel Hub collections.

Hi,

I have also tried this example recently but it does not filter the dates as required. The output datacube has the same time dimensions as the input one.

import openeo
from openeo import processes as eop

eoconn = openeo.connect('https://openeo.dataspace.copernicus.eu/', auto_validate=False)
eoconn.authenticate_oidc()

eoconn.describe_account()


startdate = '2022-12-01'
enddate = '2022-12-30'

s1 = eoconn.load_collection(
    "SENTINEL1_GRD",
    spatial_extent={'west':11.23,
                    'east':11.45,
                    'south':46.9,
                    'north':47,
                    'crs':4326},
    bands=['VV','VH'],
    temporal_extent=[startdate,enddate],
)


dates_to_keep = ['2022-12-05', '2022-12-17', '2022-12-29']

s1 = s1.sar_backscatter(
    coefficient='beta0',
    elevation_model='COPERNICUS_30',
)

## Create a condition that checks if a date is one of the best timesteps: from https://github.com/Open-EO/openeo-community-examples/blob/815ab0cf4662a1b2be0881f55a9d4896467ed224/python/ParcelDelineation/Parcel%20delineation.ipynb
condition = lambda x : eop.any(
    [
        eop.date_between(
            x = x,
            min = timestep,
            max = eop.date_shift(date=timestep, value=1, unit='day')) 
        for timestep in dates_to_keep
    ]
)

## Filter the dates using the condition
cube_reduced = s1.filter_labels(
    condition = condition,
    dimension = "t"
)

In my case, I would like to filter based on the track number. I was wondering if there is a direct way to do it…
Thanks,
Valentina

Hey, I’ll check what is happening

I get the same problem, and it is indeed not a Sentinel Hub layer, so it should work. I’ll check for a workaround tomorrow.

A fix is on the way. It might take some days before it is available on staging.
The cause: filter_labels was not applied when sar_backscatter was used.

Edit:
The fix is now available on staging: https://openeo-staging.dataspace.copernicus.eu
For date_between you have to use full timestamps (date and time), not just dates:
dates_to_keep = ['2022-12-05T00:00:00Z', '2022-12-17T00:00:00Z', '2022-12-29T00:00:00Z']
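
If the dates are kept as plain `YYYY-MM-DD` strings elsewhere in a script, a small helper can expand them before building the condition. This is only a sketch: the function name `to_full_timestamp` is hypothetical, and it assumes midnight UTC is the intended time of day.

```python
from datetime import date


def to_full_timestamp(value: str) -> str:
    """Expand 'YYYY-MM-DD' to 'YYYY-MM-DDT00:00:00Z'.

    Strings that already contain a time part are returned unchanged;
    malformed dates raise ValueError.
    """
    if "T" in value:
        return value
    day = date.fromisoformat(value)  # validates the date
    return f"{day.isoformat()}T00:00:00Z"


dates_to_keep = [to_full_timestamp(d) for d in ["2022-12-05", "2022-12-17", "2022-12-29"]]
print(dates_to_keep)
# ['2022-12-05T00:00:00Z', '2022-12-17T00:00:00Z', '2022-12-29T00:00:00Z']
```

The resulting list can be used in the `condition` lambda from the earlier posts without further changes.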

Hi Emile,

Thanks. Sorry, I have only just seen your message; strangely, I did not get a notification.

BTW, it seems that I cannot log in to this staging instance. Should it work with the same CDSE account?

Will this fix also be available on CDSE?

Thanks
Valentina

To log in on staging you need to use the EGI DEMO button, instead of typing your username and password directly. I use the same account to log in there.

This is now deployed on https://openeo.dataspace.copernicus.eu/