Sentinel 1 ARD processing batch job failure for long timeseries

olivier.bonte · 22 February 2023 16:58

Hey,

I would like to use the function .ard_normalized_radar_backscatter() to process Sentinel-1 data over a relatively small catchment (100 km^2) over several years. To avoid overly large batch jobs, I split the job up per half year. Although my job does run for the year 2015, it fails in the first half of 2016. The job that failed has the following job-id: vito-j-d7c5e0dbd1324640acfdcdb01ee43989
The code I used is shown below:

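import geopandas as gpd
import numpy as np
import openeo

job_exec = True  # set to False to skip creating and running the batch jobs

connection = openeo.connect("openeo.cloud").authenticate_oidc()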
shape_zwalm = gpd.read_file('data/Zwalm_shape/zwalm_shapefile_emma.shp')
shape_zwalm.plot()
extent = shape_zwalm.total_bounds
print(extent)  # the output is included below for clarity
# [ 3.66751526 50.76325563  3.83821038 50.90341411]
temporal_extent = ["2015-06-07", "2022-11-05"]
list_temp_extent = []
job_title_list = []
job_title = "s1_a_gamma0_2015" 
job_title_list.append(job_title)
list_temp_extent.append([temporal_extent[0],"2015-12-31"])
years = np.arange(2016,2023)
for year in np.arange(2016,2023):
    if year == 2022:
        #print([str(year)+"-01-01",temporal_extent[1]])
        list_temp_extent.append([str(year)+"-01-01",str(year)+ "-06-30"])
        job_title = "s1_a_gamma0_2022_I"
        job_title_list.append(job_title)
        list_temp_extent.append([str(year)+"-07-01",temporal_extent[1]])
        job_title = "s1_a_gamma0_2022_II" 
        job_title_list.append(job_title)
    else:
        #print([str(year)+"-01-01",str(year)+ "-12-31"])
        list_temp_extent.append([str(year)+"-01-01",str(year)+ "-06-30"])
        job_title = "s1_a_gamma0_" +  str(year) + "_I"
        job_title_list.append(job_title)
        list_temp_extent.append([str(year)+"-07-01",str(year) + "-12-31"])
        job_title = "s1_a_gamma0_" +  str(year) + "_II"
        job_title_list.append(job_title)
print(list_temp_extent)
print(job_title_list)
collection = 'SENTINEL1_GRD'  # Ground Range Detected
spatial_extent = {'west':extent[0],'east':extent[2],'south':extent[1],'north':extent[3]}
bands = ["VV"]  # only interested in this band
properties = {
    "sat:orbit_state": lambda od: od == "ASCENDING",  # orbit direction filtering (ascending vs descending)
    "sar:instrument_mode": lambda mode: mode == "IW"  # instrument mode filtering
}
job_id_list = []
if job_exec:
    for i, temporal_extent in enumerate(list_temp_extent):
        s1a = connection.load_collection(
            collection_id=collection,
            spatial_extent=spatial_extent,
            temporal_extent=temporal_extent,
            bands=bands,
            properties=properties,
        )
        s1a = s1a.ard_normalized_radar_backscatter(elevation_model="COPERNICUS_30")
        s1a = s1a.mask_polygon(shape_zwalm['geometry'].values[0])
        # job_title = "s1_a_gamm0" +  str(years[i])
        # job_title_list.append(job_title)
        job_s1a = s1a.create_job(title=job_title_list[i], out_format='NetCDF')
        job_s1a_id = job_s1a.job_id
        if job_s1a_id:
            print("Batch job created with id: ", job_s1a_id)
            job_s1a.start_and_wait()
            job_id_list.append(job_s1a_id)
        else:
            print("Error! Job ID is None")
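The half-year splitting above can also be generated rather than spelled out per year. This is a minimal sketch, assuming the same overall period and a uniform title pattern; the helper name half_year_windows is hypothetical and not part of openEO. Unlike the code above, it also splits 2015 at 30 June, so it yields 16 windows instead of 15.

```python
from datetime import date


def half_year_windows(start: date, end: date):
    """Split [start, end] into (title, [from, to]) chunks at 30 June / 31 December."""
    windows = []
    for year in range(start.year, end.year + 1):
        halves = [
            ("I", date(year, 1, 1), date(year, 6, 30)),
            ("II", date(year, 7, 1), date(year, 12, 31)),
        ]
        for suffix, lo, hi in halves:
            # Clip each half year to the requested overall period
            lo, hi = max(lo, start), min(hi, end)
            if lo <= hi:  # skip halves that fall entirely outside the period
                title = f"s1_a_gamma0_{year}_{suffix}"  # assumed title pattern
                windows.append((title, [lo.isoformat(), hi.isoformat()]))
    return windows


windows = half_year_windows(date(2015, 6, 7), date(2022, 11, 5))
for title, extent in windows:
    print(title, extent)
```

Each (title, extent) pair can then be passed to load_collection and create_job in a single loop, which keeps the titles and temporal extents from drifting apart.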

Is there a mistake in my code, or do I simply exceed my allowed processing time with the Early adopter package (90 days) I have?
Update: I used .start_and_wait() instead of .start_job() because with the latter method I exceed my request allowance on the Sentinel Hub backend.

Kind regards
Olivier Bonte

jeroen.dries · 27 February 2023 08:21

Hi Olivier,
unfortunately, the logs also don’t have a clear indication of what went wrong.
One important hint however is that you now use:

ard_normalized_radar_backscatter

This process implements a number of strict requirements from CEOS CARD4L that make it more costly to run. I see that you only retain the VV band, so it looks like you may not be interested in this extra CARD4L metadata at all?
If so, please try using the ‘sar_backscatter’ process, which is somewhat lighter, making it less prone to errors.
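As a rough illustration of that swap, the sketch below builds the same kind of data cube with sar_backscatter in place of ard_normalized_radar_backscatter. The function name build_backscatter_cube is hypothetical, and the coefficient and elevation_model values are assumptions (the elevation model is simply carried over from the original code), so check them against what the backend supports.

```python
def build_backscatter_cube(connection, spatial_extent, temporal_extent):
    """Load Sentinel-1 GRD and apply sar_backscatter instead of the ARD process.

    `connection` is expected to be an authenticated openEO connection,
    e.g. the result of openeo.connect(...).authenticate_oidc().
    """
    cube = connection.load_collection(
        "SENTINEL1_GRD",
        spatial_extent=spatial_extent,
        temporal_extent=temporal_extent,
        bands=["VV"],
    )
    # Assumed settings: terrain-corrected gamma0 with the DEM from the original code
    return cube.sar_backscatter(
        coefficient="gamma0-terrain",
        elevation_model="COPERNICUS_30",
    )
```

The rest of the workflow (mask_polygon, create_job, start_and_wait) would stay as it is; only the backscatter step changes.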

thanks,
Jeroen

olivier.bonte · 27 February 2023 10:15

Hi Jeroen,

Thanks for the suggestion! I ran the same code with the gamma0 processing replaced by .sar_backscatter(). Besides adding ‘VH’ as a band, I did not change anything else in the code above. Unfortunately, the processing once again fails, this time in the second half of 2016 (job-id: vito-j-ebd56d972f5b49c2b205f9103b3069b4), while it works for the first half of 2016 (job-id: vito-j-bec539dc8362466693ad061467c8bb84). Maybe this time the logs do reveal what went wrong?

Kind regards
Olivier

jeroen.dries · 27 February 2023 12:28

Hi Olivier,
yes, now the error is clearer; you can also see it in the logs:

Requested band 'VV' is not present in Sentinel 1 tile 'S1A_IW_GRDH_1SDH_20160713T173234_20160713T173303_012133_012CB5_35C2' returned by criteria specified in dataFilter parameter.","code":"RENDERER_S1_MISSING_POLARIZATION

So apparently, filtering on IW alone is not sufficient, as also within IW there are some rare products with different polarization. For now, the only solution I have is to also add a filter on polarization:

properties={"polarization":lambda p: p == "DV"}

(Which is not so great, as the meaning/values of the polarization property can differ.)
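The product name in the error message already points at the cause: in the Sentinel-1 naming convention, the last two characters of the fourth field encode the polarization (SH, SV, DH or DV), and this tile is DH, i.e. HH+HV, so it has no VV band. A minimal sketch of that check follows; the helper name product_polarizations is hypothetical. The properties dict at the end combines the filters from the original post with the polarization filter above, on the assumption that they are meant to be used together.

```python
# Polarization codes in the Sentinel-1 product name (4th field, last two characters)
POLARIZATION_CODES = {
    "SH": ("HH",),
    "SV": ("VV",),
    "DH": ("HH", "HV"),
    "DV": ("VV", "VH"),
}


def product_polarizations(product_name: str):
    """Return the polarization channels encoded in a Sentinel-1 product name."""
    fields = product_name.split("_")
    code = fields[3][-2:]  # e.g. "1SDH" -> "DH"
    return POLARIZATION_CODES[code]


tile = "S1A_IW_GRDH_1SDH_20160713T173234_20160713T173303_012133_012CB5_35C2"
print(product_polarizations(tile))          # ('HH', 'HV')
print("VV" in product_polarizations(tile))  # False

# Assumed combination of the earlier filters with the polarization filter
properties = {
    "sat:orbit_state": lambda od: od == "ASCENDING",
    "sar:instrument_mode": lambda mode: mode == "IW",
    "polarization": lambda p: p == "DV",
}
```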

olivier.bonte · 28 February 2023 09:25

Hi Jeroen,

Thanks for the response! With this extra filter my processing chain seems to work fine, so thank you!
I just have 2 questions left:

1. How do you know which key to use for filtering? For polarization, based on the Sentinel Hub page (https://docs.sentinel-hub.com/api/latest/data/sentinel-1-grd/#filter-extension), I would expect s1:polarization. Based on the metadata of the SENTINEL1_GRD collection in openEO, I would expect sar:polarizations (as this is also the only identifier that does not give a user warning). But in this case it is polarization, so how should I have known this (if it weren’t for your great help)?
2. Is there a technical document somewhere describing all the steps included in the sar_backscatter() method? For the moment, I can only find a very brief explanation on the Sentinel Hub backend page: https://docs.sentinel-hub.com/api/latest/data/sentinel-1-grd/#processing-chain

Thanks in advance
Kind regards
Olivier

jeroen.dries · 20 March 2023 07:50

Hi Olivier,

my apologies for taking a while.
The properties that a user can query on are part of the STAC collection metadata, for instance in:
https://openeocloud.vito.be/openeo/1.0.0/collections/SENTINEL2_L2A
you can find a property called ‘summaries’ that is supposed to list this.

However, we have two issues:

- the summaries are not always correct/complete
- the information in there is not easily findable on the website/documentation
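To look at what the summaries list for a given collection, something along these lines could be used. It is only a sketch: the helper name queryable_properties is hypothetical, and the example metadata is a made-up, heavily trimmed stand-in for what connection.describe_collection("SENTINEL1_GRD") would return from the backend.

```python
def queryable_properties(collection_metadata: dict):
    """Return the property names listed under 'summaries' in STAC collection metadata."""
    return sorted(collection_metadata.get("summaries", {}))


# Hypothetical, trimmed metadata for illustration only; real metadata would
# come from e.g. connection.describe_collection("SENTINEL1_GRD")
example_metadata = {
    "id": "SENTINEL1_GRD",
    "summaries": {
        "sar:instrument_mode": ["IW"],
        "sat:orbit_state": ["ASCENDING", "DESCENDING"],
    },
}
print(queryable_properties(example_metadata))
```

Given the two issues above, a key missing from this list (such as polarization) does not necessarily mean the backend cannot filter on it.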

Regarding sar_backscatter, the official process documentation contains a link to the scientific paper that the process should implement:
https://processes.openeo.org/#sar_backscatter

Backscatter is pretty well-defined, so implementations should normally not deviate too far from that. We also tried to adhere to CEOS guidelines for SAR ARD:
https://ceos.org/ard/

However, if you are referring to the actual software behind it, it differs per backend. You already found the documentation for Sentinel Hub, but there are also implementations based on SNAP and Orfeo Toolbox.

Hope that helps!