Hi,
I have started getting an error when aggregate_temporal_period is followed by aggregate_spatial, but only in one specific case. Below is a simplified script with four test cases, each applying a different sequence of preprocessing steps. Only the fourth one (cloud masking, then temporal aggregation, then spatial aggregation) fails, and it does so consistently.
import openeo
import datetime
connection = openeo.connect("openeo-dev.vito.be").authenticate_oidc()
URL = "https://raw.githubusercontent.com/MargotVerhulst/raw/main/Plotbuffer10_8point_N5_random.json"
# Load datacube
cube = connection.load_collection(
    collection_id="TERRASCOPE_S2_TOC_V2",
    temporal_extent=[datetime.datetime(2018, 1, 1), datetime.datetime(2019, 1, 1)],
    bands=["B02", "B03", "B04", "SCL"],
)
# Test 1: no cloud masking, spatial aggregation
cube_raw = cube.filter_bands(["B02", "B03", "B04"]) # Remove SCL band
cube_raw_agg = cube.aggregate_spatial(geometries=URL, reducer="mean")
# Test 2: no cloud masking, temporal resampling, spatial aggregation
cube_dek = cube.aggregate_temporal_period(period="dekad", reducer="median")
cube_dek_agg = cube_dek.aggregate_spatial(geometries=URL, reducer="mean")
# Test 3: cloud masking, spatial aggregation
SCL = cube.band("SCL")
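# True wherever the pixel is NOT vegetation (4) or not-vegetated/bare soil (5); those pixels get masked out
mask_scl = ~((SCL == 4) | (SCL == 5))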
cube_mask1 = cube.mask(mask_scl)
cube_mask1 = cube_mask1.filter_bands(["B02", "B03", "B04"]) # Remove SCL band
cube_mask1_agg = cube_mask1.aggregate_spatial(geometries=URL, reducer="mean")
# Test 4: cloud masking, temporal resampling, spatial aggregation
cube_mask1_dek = cube_mask1.aggregate_temporal_period(period="dekad", reducer="median")
cube_mask1_dek_agg = cube_mask1_dek.aggregate_spatial(geometries=URL, reducer="mean")
# Batch jobs
res1 = cube_raw_agg.save_result(format="JSON")
job1 = res1.send_job(title="testscript_test1")
job1.start_job()
res2 = cube_dek_agg.save_result(format="JSON")
job2 = res2.send_job(title="testscript_test2")
job2.start_job()
res3 = cube_mask1_agg.save_result(format="JSON")
job3 = res3.send_job(title="testscript_test3")
job3.start_job()
res4 = cube_mask1_dek_agg.save_result(format="JSON")
job4 = res4.send_job(title="testscript_test4")
job4.start_job()
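To make explicit what the cloud mask in tests 3 and 4 is meant to do, here is a minimal plain-Python sketch, independent of openEO. The helper names (build_mask, apply_mask) and the pixel values are made up for illustration, and None stands in for no-data. It assumes mask() replaces pixels where the mask is true, which is how I understand the process.

```python
# Plain-Python illustration of the SCL mask logic (no openEO involved).
# SCL class codes: 4 = vegetation, 5 = not vegetated (bare soil).
KEEP_CLASSES = {4, 5}


def build_mask(scl_values):
    """Return True where a pixel should be masked, mirroring ~((SCL == 4) | (SCL == 5))."""
    return [value not in KEEP_CLASSES for value in scl_values]


def apply_mask(band_values, mask):
    """Replace masked pixels with None, which stands in for no-data here."""
    return [None if masked else value for value, masked in zip(band_values, mask)]


# Hypothetical pixel values, for illustration only.
scl = [4, 9, 5, 3, 8, 4]
b04 = [310, 2900, 820, 150, 2500, 295]

mask = build_mask(scl)
masked_b04 = apply_mask(b04, mask)
print(masked_b04)  # [310, None, 820, None, None, 295]
```

So only vegetation and bare-soil pixels should survive the mask, and everything else becomes no-data before the aggregation steps.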
id job1: j-1cda64f0beaa4653b012696b26f4e753 → finishes
id job2: j-7d9ab30aabfe4752bd620204bdb8a7c0 → finishes
id job3: j-70e0c772b34c4a49b2019e12bde3c875 → finishes
id job4: j-77bf374100fc4094b369e06ab56fef2f → error
The error log of job4 reports “ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2”. I get the same error when running the same process graph as a synchronous request.
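For reference, this is roughly what I expect the dekad step to compute on masked data, sketched in plain Python without openEO. It assumes dekads run from day 1 to 10, 11 to 20, and 21 to the end of the month, and that the median ignores no-data values. The function names and observations are hypothetical. Note that this sketch simply drops a dekad in which every observation is masked, which may not be how the back-end handles that case.

```python
import datetime
from collections import defaultdict
from statistics import median


def dekad_start(day):
    """Return the first day of the dekad containing `day` (1-10, 11-20, 21-end of month)."""
    if day.day <= 10:
        first = 1
    elif day.day <= 20:
        first = 11
    else:
        first = 21
    return day.replace(day=first)


def median_per_dekad(observations):
    """Group (date, value) pairs by dekad, skip None (no-data), and take the median."""
    groups = defaultdict(list)
    for day, value in observations:
        if value is not None:
            groups[dekad_start(day)].append(value)
    return {start: median(values) for start, values in sorted(groups.items())}


# Hypothetical observations for one pixel; None marks a masked (cloudy) acquisition.
observations = [
    (datetime.date(2018, 1, 3), 300),
    (datetime.date(2018, 1, 8), None),
    (datetime.date(2018, 1, 13), 320),
    (datetime.date(2018, 1, 18), 340),
    (datetime.date(2018, 1, 31), 310),
]

result = median_per_dekad(observations)
for start, value in result.items():
    print(start, value)
```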
Any ideas on why this error arises in that particular case?
Thanks in advance.