Generating a cloud mask using the R-Client

Hi,
I am trying to create a cloud mask from the Sentinel-2 schene classification band (SCL). Particularly I want to flag pixels with a value of 3 (cloud shadow), 8 (clouds medium probability) and 9 (clouds high probability). I tried to adjust the example shown here.

Here’s the code (not strictly reproducible due to authentication, but if you have a connection you should be able to process it):

library(openeo)

collections = "SENTINEL2_L2A_SENTINELHUB"
bands = c("B02", "B03", "B04", "B08", "SCL")
period = c("2021-10-01", "2021-10-31")
ext = list(11.00204, 49.66901, 11.28197, 49.85029)
names(ext) = c("west", "south", "east", "north")

# change to login with your credentials
# con = openeo::connect("https://openeo.cloud")
# login(
#     login_type = "oidc"
#     , provider = "egi"
#     , config = list(
#         client_id = "<client_id>"
#         , secret = "<secret>"
#     )
# )

procs = processes()

cube = procs$load_collection(
    id = collections
    , spatial_extent = ext
    , temporal_extent = period
    , bands = bands
)

filterSCL = function(data, context) {
    cl_shadow = procs$eq(data[5], y = 3)
    cl_medium = procs$eq(data[5], y = 8)
    cl_high = procs$eq(data[5], y = 9)
    cl = procs$or(cl_medium, cl_high)
    cl = procs$or(cl, cl$shadow)
    return(cl)
}

mask = procs$reduce_dimension(
    data = cube
    , reducer = filterSCL
    , dimension = "bands"
)

## apply mask
cube_masked = procs$mask(cube, mask)

## create process graph
graph = procs$save_result(
    data = cube_masked
    , format = "GTiff"
)

## create and start job
job = openeo::create_job(graph)
openeo::start_job(job)
id = as.character(job$id)
jobs = openeo::list_jobs()

dirout = path.expand("~/Downloads/openEO/mask")
if (!dir.exists(dirout)) dir.create(dirout)

openeo::download_results(
    job = jobs[[id]]$id
    , folder = dirout
)

## create stars object
fls = list.files(dirout, pattern = "^openEO_", full.names = TRUE)
stars = stars::read_stars(fls, along = "time")
stars = stats::setNames(stars, nm = "value")

Now my questions:

  • Obiously “or” only compares two values, not many. Chaining several “or” statements is not very flexible (and not elegant). Is there a better way how to do this?
  • Subsetting to the SCL band by name (as in the linked example) throws a warning and returns an “error” from the backend. Subsetting by index works. Any idea why?

Hi Hendrik,
I can only answer the second question, my R client colleagues should be able to reply to the first.

In the VITO backend, we currently indeed only support band subsetting by index. The python client hides this, by translating the label into an index, the R client probably not, hence the error.

By the way, you could also do a separate load_collection call for creating the mask, that would save the extra reducing operation and also avoid requiring the band filter entirely.
The VITO backend also even has an optimization in place that will avoid reading chunks that are entirely masked to begin with, but this wont work if you construct the mask itself from a fully loaded datacube.

Hi Hendrik,
thanks for providing the well prepared issue.
I’ll give a quick reply for now. We’ll follow up later, after testing, with a more detailed answer:

  • Labels vs Index: As @driesj guessed, since the VITO backend natively doesn’t support labels, they also don’t work with the R-Client. Currently, the R-Client doesn’t have the possibility to translate a label into an index.
  • Chaining OR: There might be a solution within the R-Client using the “|” operator. But we need to test this first.
    We’ll get back to you later.
  1. the indexing: R supports both (by index and name), but does not automatically know what the back-end supports in this function
  2. Thanks Peter for the quick response. I have also looked into it and the operators ==, | and & should work for chaining and logical operators. However, not with the current release on CRAN. I had to make some fixes. So, now you can find updates in the develop branch on Github.

This would be the filterSCL can look more elegantly like this:

filterSCL = function(data, context) {
    scl = data[5]
    res = scl == 3 | scl == 8 | scl == 9
    return(res)
}
1 Like

Thanks Florian,
we tested it with the develop branch from GitHub and can confirm that it works as you described.