Filter multiple dates with filter_labels

I am trying to extract from a datacube only specific dates.
The filter_temporal process accepts only a single time range, so it’s not suitable on its own (a workaround would be to chain multiple filter_temporal calls and combine the results with merge_cubes, but I guess that is not very efficient).
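For illustration, the chained filter_temporal + merge_cubes workaround could look roughly like the sketch below. It assumes a cube object exposing filter_temporal(extent=...) and merge_cubes(other), as the openEO Python client's DataCube does. The helper names day_intervals and keep_dates are hypothetical, and each date is treated as a half-open one-day interval.

```python
from datetime import date, timedelta
from functools import reduce


def day_intervals(dates):
    """Turn ISO date strings into half-open one-day intervals [day, next day)."""
    intervals = []
    for d in dates:
        start = date.fromisoformat(d[:10])  # keep only the YYYY-MM-DD part
        end = start + timedelta(days=1)
        intervals.append([start.isoformat(), end.isoformat()])
    return intervals


def keep_dates(cube, dates):
    """Hypothetical helper: filter `cube` once per date, then merge the parts."""
    parts = [cube.filter_temporal(extent=extent) for extent in day_intervals(dates)]
    return reduce(lambda left, right: left.merge_cubes(right), parts)
```

This builds one filter_temporal node per date, so the process graph grows with the number of dates, which is the efficiency concern raised above.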

So, from the Parcel Delineation notebook I found out that it’s possible to do that by combining filter_labels, date_shift and date_between.

Still, after several trials, I’m not able to get only the dates I want to keep. What am I doing wrong?

import openeo
import xarray as xr
from numpy import datetime_as_string
from openeo import processes as eop

bounding_box_32632_10x10 = dict(
    west=680000, east=680100, south=5151500, north=5151600, crs="EPSG:32632"
)

temporal_interval = ["2022-06-01", "2022-07-01"]

conn = openeo.connect("").authenticate_oidc()

cube = conn.load_collection(
    "SENTINEL2_L2A",
    spatial_extent=bounding_box_32632_10x10,
    temporal_extent=temporal_interval,
    bands=["B04"],
)

ds = xr.open_dataset("")
timesteps = [datetime_as_string(t, unit="s", timezone="UTC") for t in ds.t.values]

# Keep the second and fourth dates
dates_to_keep = [timesteps[1], timesteps[3]]

## Create a condition that checks if a date is one of the timesteps to keep
condition = lambda x: eop.any(
    [
        eop.date_between(
            x=x,
            min=timestep,
            max=eop.date_shift(date=timestep, value=1, unit="day"),
        )
        for timestep in dates_to_keep
    ]
)

## Filter the dates using the condition
cube_reduced = cube.filter_labels(
    condition=condition,
    dimension="t",
)

Authenticated using refresh token.
['2022-06-02T00:00:00Z', '2022-06-05T00:00:00Z', '2022-06-07T00:00:00Z', '2022-06-10T00:00:00Z', '2022-06-12T00:00:00Z', '2022-06-15T00:00:00Z', '2022-06-17T00:00:00Z', '2022-06-20T00:00:00Z', '2022-06-22T00:00:00Z', '2022-06-25T00:00:00Z', '2022-06-27T00:00:00Z', '2022-06-30T00:00:00Z']
['2022-06-02T00:00:00.000000000' '2022-06-05T00:00:00.000000000'
 '2022-06-07T00:00:00.000000000' '2022-06-10T00:00:00.000000000'
 '2022-06-12T00:00:00.000000000' '2022-06-15T00:00:00.000000000'
 '2022-06-17T00:00:00.000000000' '2022-06-20T00:00:00.000000000'
 '2022-06-22T00:00:00.000000000' '2022-06-25T00:00:00.000000000'
 '2022-06-27T00:00:00.000000000' '2022-06-30T00:00:00.000000000']
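For reference, the selection the condition above is meant to perform can be checked locally with plain Python, independent of any backend. This is only a sketch: parse_utc and is_kept are hypothetical helpers, and the upper bound is treated as inclusive on the assumption that date_between defaults to exclude_max=False.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(label):
    """Parse a label such as '2022-06-05T00:00:00Z' as a UTC datetime."""
    return datetime.strptime(label, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_kept(label, dates_to_keep):
    """Local stand-in for any(date_between(x, min, min + 1 day))."""
    x = parse_utc(label)
    # Upper bound inclusive: assumes date_between's default exclude_max=False.
    return any(
        parse_utc(d) <= x <= parse_utc(d) + timedelta(days=1)
        for d in dates_to_keep
    )


# First four labels from the printed list above.
timesteps = [
    "2022-06-02T00:00:00Z",
    "2022-06-05T00:00:00Z",
    "2022-06-07T00:00:00Z",
    "2022-06-10T00:00:00Z",
]
dates_to_keep = [timesteps[1], timesteps[3]]
kept = [t for t in timesteps if is_kept(t, dates_to_keep)]
print(kept)
```

If the backend applied the filter as intended, only the labels in kept would remain along the t dimension.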


This is an unfortunate case where Terrascope has to fall back to Sentinel Hub to fetch SENTINEL2_L2A data. In that case, we don’t support filter_labels yet.

This case will give a clearer error when this ticket is deployed:

A workaround could be to switch to the CDSE backend for Sentinel-2: openeo.connect("").authenticate_oidc()

You can also use max_cloud_cover:

cube = conn.load_collection(
    # ... other arguments as before
    max_cloud_cover=80,  # Avoid Sentinel Hub fallback
)

@emile.sonneveld thanks for the feedback. We are not filtering cloud covered dates, so the last suggestion is not relevant in our case.

Additionally, we also can’t use CDSE, since @valentina.premier needs a specific collection which is not available there :sweat_smile:

Ah, maybe another workaround:

cube = conn.load_collection(
    "TERRASCOPE_S2_TOC_V2",  # Force use of Terrascope
    # ... remaining arguments as before
)

If that does not do the trick, we can check it next week. I understood that @valentina.premier will pass by the VITO offices.