Missing images (temporal extent = month)

Hey there,

We are trying to process the SENTINEL2_L2A_SENTINELHUB collection with a temporal extent of a full month and a spatial extent that is almost the size of the 18NXL tile. When we downloaded the data, we obtained only 6 scenes, although we are sure there are more. Could you explain why we only got a few scenes?

This is an example of the script:

start_date      = '2022-01-01'
end_date        = '2022-01-31'
bands           = ['B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B11', 'B12', 'CLP', 'SCL', 'sunAzimuthAngles', 'sunZenithAngles']
collection      = 'SENTINEL2_L2A_SENTINELHUB'

# 18NXL
spatial_extent = {'west': -74.09868206699997, 'east': -73.106493944999954, 'south': 4.432483427000022, 'north': 5.425259437000022}

S2_cube = connection.load_collection(collection,
                                     spatial_extent = spatial_extent,
                                     temporal_extent = [start_date, end_date],
                                     bands = bands)

# SCL (Sen2Cor scene classification) mask:
# 3 = cloud shadow, 8 = cloud medium probability, 9 = cloud high probability,
# 10 = thin cirrus, 11 = snow/ice
scl = S2_cube.band("SCL")
mask = (scl == 3) | (scl == 8) | (scl == 9) | (scl == 10) | (scl == 11)
S2_cube_scl = S2_cube.mask(mask)

# CLP (cloud probability, 0-255) based on s2cloudless
clp = S2_cube.band("CLP")
# resample the (natively 160 m) s2cloudless CLP band to the 20 m grid
clp = clp.resample_spatial(resolution=20, method="bicubic")
mask = (clp / 255) > 0.3  # scale to a 0-1 probability and threshold at 0.3
S2_cube = S2_cube_scl.mask(mask)

# Serialize the cube as GeoTIFF and run the graph as a batch job on the back end
S2_cube = S2_cube.save_result(format='GTiff')  # or format='netCDF'
my_job  = S2_cube.send_job(title="S2_L2A_220115")
results = my_job.start_and_wait().get_results()

We can see from the log (1:33:04 Job ‘vito-81464076-ddc2-405f-80a2-35ebe35a9ad4’: finished (progress N/A)) that processing finished. However, when we downloaded the data using results.download_files("S2_L2A"), we only obtained S2 scenes from the first half of the month:

[WindowsPath('S2_L2A/openEO_2022-01-03Z.tif'),
 WindowsPath('S2_L2A/openEO_2022-01-05Z.tif'),
 WindowsPath('S2_L2A/openEO_2022-01-08Z.tif'),
 WindowsPath('S2_L2A/openEO_2022-01-10Z.tif'),
 WindowsPath('S2_L2A/openEO_2022-01-13Z.tif'),
 WindowsPath('S2_L2A/openEO_2022-01-15Z.tif'),
 WindowsPath('S2_L2A/job-results.json')]
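
For completeness, a minimal sketch of how the returned dates can be listed from the finished job without re-downloading anything (assuming the openeo Python client's JobResults.get_assets()):

# Sketch: list which observation dates the batch job produced,
# based on the asset (file) names, without re-downloading the GTiffs.
# `results` is the JobResults object from my_job.start_and_wait().get_results() above.
for asset in results.get_assets():
    print(asset.name)  # e.g. "openEO_2022-01-03Z.tif"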

Maybe @daniel.thiex could help here!

FYI: I just tried a smaller area of interest without the masking, downloaded synchronously as netCDF, and also found 6 observations, but covering the whole month:

...
bands           = ['B02', 'B03', 'B04']
spatial_extent = {'west': -74.1  , 'east': -74.0, 'south': 4.4, 'north': 4.5}
...
S2_cube.download("forum398-out.nc")
...

>>> xarray.load_dataset("forum398-out.nc")["t"]
array(['2022-01-03T00:00:00.000000000', '2022-01-08T00:00:00.000000000',
       '2022-01-13T00:00:00.000000000', '2022-01-18T00:00:00.000000000',
       '2022-01-23T00:00:00.000000000', '2022-01-28T00:00:00.000000000'],
      dtype='datetime64[ns]')

With a larger AOI, I get more (partial) observations, again spread over the full month:

bands           = ['B02', 'B03', 'B04']
spatial_extent = {'west': -73.75258800599997, 'east': -73.45258800599996, 'south': 4.778871432000022, 'north': 5.078871432000023}
...
S2_cube.download("forum398-out.nc")
...
>>> xarray.load_dataset("forum398-out.nc")["t"]
array(['2022-01-03T00:00:00.000000000', '2022-01-05T00:00:00.000000000',
       '2022-01-08T00:00:00.000000000', '2022-01-10T00:00:00.000000000',
       '2022-01-13T00:00:00.000000000', '2022-01-15T00:00:00.000000000',
       '2022-01-18T00:00:00.000000000', '2022-01-20T00:00:00.000000000',
       '2022-01-23T00:00:00.000000000', '2022-01-25T00:00:00.000000000',
       '2022-01-28T00:00:00.000000000', '2022-01-30T00:00:00.000000000'],
      dtype='datetime64[ns]')

I suspect the masking (or the resampling) might be the culprit.
Have you already tried disabling one or both of the masks?
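
For example, a minimal sketch that reuses the cube and thresholds from your script and applies each mask on its own:

# Sketch: apply each mask separately to see which one drops observations.
scl = S2_cube.band("SCL")
scl_mask = (scl == 3) | (scl == 8) | (scl == 9) | (scl == 10) | (scl == 11)

clp = S2_cube.band("CLP")
clp_mask = (clp / 255) > 0.3

# Variant 1: SCL mask only
S2_scl_only = S2_cube.mask(scl_mask)

# Variant 2: CLP mask only (without the resampling, to rule that out as well)
S2_clp_only = S2_cube.mask(clp_mask)

# Then save_result/send_job each variant as in your original script
# and compare the timestamps that come back.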

I’ve tried running the same script without cloud masking and I get the full month. However, I still only get a few scenes once cloud masking is applied. I’ve tried to test the SCL and CLP masking separately, but I haven’t been able to get a batch job running since last Friday due to this error:

Printing logs:
[{'id': 'error', 'level': 'error', 'message': 'Traceback (most recent call last):\n File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/backend.py", line 1690, in get_log_entries\n with (self.get_job_output_dir(job_id) / "log").open(\'r\') as f:\n File "/usr/lib64/python3.8/pathlib.py", line 1221, in open\n return io.open(self, mode, buffering, encoding, errors, newline,\n File "/usr/lib64/python3.8/pathlib.py", line 1077, in _opener\n return self._accessor.open(self, flags, mode)\nFileNotFoundError: [Errno 2] No such file or directory: \'/data/projects/OpenEO/j-60652c67945549e3a93f8b7cac191b9e/log\'\n'}]

Also, I have tried reconnecting multiple times, but that does not help. I am using the “openeo-dev.vito.be” back end, and my area is bigger than yours, so I have to use a batch job.
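
For reference, this is roughly how I reconnect, as a minimal sketch (assuming OIDC login on that back end):

import openeo

# Sketch: (re)connect to the dev back end and re-authenticate.
connection = openeo.connect("openeo-dev.vito.be").authenticate_oidc()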

Hi Andrea,
there was indeed a problem over the weekend; could you try again?
My sincerest apologies. I will also check why it took us so long to discover this, as it should not happen.

best regards,
Jeroen

Thanks for the update Jeroen.
I have been running the code since this morning, but the back end seems busy now. My batch job has currently been in the queue for 2.5 h. I have tried running another script, but it shows the same behaviour.

2:37:03 Job 'j-4adb987b41454e21af583c482e8ed7a6': queued (progress N/A)


Hi Andrea,
it is waiting to fetch data from sentinelhub. While it’s fetching data, it will also show up as ‘queued’.
I guess this is a somewhat larger job? (Smaller jobs using sentinelhub do not get queued normally.)

thanks,
Jeroen

Thanks Jeroen.
Both jobs eventually failed while they were still in the queue.

1st script:

3:52:40 Job 'j-4adb987b41454e21af583c482e8ed7a6': queued (progress N/A)
Failed to parse API error response: 404 '404 page not found\n'

2nd script:

3:53:15 Job 'j-e95d1ddd55ca4c9a9a7c32158a322359': queued (progress N/A)
Failed to parse API error response: 404 '404 page not found\n'

Hi Andrea,
this seems more like a temporary network error. Could you perhaps use the web editor to check if these jobs are still running or finished?
(I tried looking them up myself, and at least e95d1ddd55ca4c9a9a7c32158a322359 still seems to be running.)
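
For example, a minimal sketch of checking a job's status and logs from the Python client instead of the web editor (assuming your existing connection object):

# Sketch: look up an existing batch job by id and inspect its status and logs.
job = connection.job("j-e95d1ddd55ca4c9a9a7c32158a322359")
print(job.status())  # e.g. "queued", "running", "finished", "error"
for entry in job.logs():
    print(entry.get("level"), entry.get("message"))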