I am using CDSE as a back-end, but I am sure the same issue would appear with openeo.cloud using the VITO back-end.
So, if I create a boolean mask and download it, it’s wrong.
If I multiply it by 1.0, the values are correct, but the masked area is lost: I get those chunks at zero instead of nan.
Hi @stefaan.lippens, you are right. The main difference is that in the first version, the areas that are nodata in the second version have value 129 instead of nan.
Anyway, both results are wrong: apparently the nodata values get messed up in the comparison step.
In case of the binary version, you are probably handling byte-valued pixels, where nodata is encoded as byte value 129.
In the float version of the mask, the nodata is encoded as a float nan, which is what you expect, right?
I’m not sure I understand what is wrong with the float version of your mask. The nodata values are correctly handled, as far as I understand your example and screenshots.
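To make the byte encoding mentioned above concrete, here is a minimal NumPy sketch that turns a downloaded byte mask back into a float mask with nan. It assumes the nodata marker is the byte value 129 reported in this thread. The helper name `byte_mask_to_float` is hypothetical, and other outputs may use a different nodata value, so check the file metadata first.

```python
import numpy as np

# Nodata marker as reported in this thread; an assumption for any other output.
NODATA_BYTE = 129


def byte_mask_to_float(mask_bytes, nodata=NODATA_BYTE):
    """Convert a byte mask (0/1 plus a nodata marker) to float32 with nan as nodata."""
    mask = np.asarray(mask_bytes, dtype=np.uint8)
    out = mask.astype(np.float32)
    # Replace the nodata marker by nan so it behaves like the float output.
    out[mask == nodata] = np.nan
    return out


# Hypothetical pixel values: two valid mask values and one nodata pixel.
decoded = byte_mask_to_float([0, 1, 129, 1])
print(decoded)
```

This only fixes the encoding on the client side; it cannot recover pixels that were already written as 0.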
Please check the notebook that I’ve shared, it’s pretty clear there. I used mask_polygon, and after creating the boolean mask some areas that were nan became either 129 or 0.
I think the problem is that the VITO backend does not preserve nan-ness in comparisons like ndsi > 0.4: pixels above 0.4 get value 1 and all other pixels (below 0.4, and nan) get value 0. This is probably because the data type in the implementation is just binary and there is no room for a third value like nan.
Additionally, there is an optimization to discard (internal) tiles that are completely nan.
As a result you can have multiple outputs where you expect nan:
in tiles not covered by polygon mask: value nan (encoded as 129 in binary output)
in tiles partially covered by polygon: value 0
That’s why you get a mix of 0/nan for float output (or 0/129 for binary output) in those blocky/staircase artifacts.
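The combination of the two effects described above can be reproduced with a small NumPy toy model. This is only a sketch of the described behaviour, not the actual back-end code: the function names and the sample tiles are hypothetical, and only the 0.4 threshold comes from this thread.

```python
import numpy as np

THRESHOLD = 0.4  # the NDSI threshold used in this thread


def toy_backend_compare(tile):
    """Toy model of the behaviour described above, NOT the real back-end code."""
    if np.all(np.isnan(tile)):
        # Tile entirely outside the polygon: skipped, so it stays nodata.
        return np.full(tile.shape, np.nan)
    # Two-valued comparison: nan > 0.4 is False, so nan pixels become 0.
    return (tile > THRESHOLD).astype(float)


def nan_preserving_compare(tile):
    """What the user expects: nan in, nan out."""
    return np.where(np.isnan(tile), np.nan, (tile > THRESHOLD).astype(float))


# Hypothetical tiles: one fully outside the polygon, one partially covered.
outside = np.array([np.nan, np.nan, np.nan])
partial = np.array([0.7, 0.2, np.nan])

print(toy_backend_compare(outside))     # all nan
print(toy_backend_compare(partial))     # nan pixel turned into 0
print(nan_preserving_compare(partial))  # nan pixel kept
```

In the toy model, the nan pixel in the partially covered tile ends up as 0 while the fully uncovered tile stays nan, which matches the blocky 0/nan pattern along the polygon edge.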
Ok, but how should a normal user get to know about these details? It’s a bit frustrating for an advanced user like me already, I can imagine how it would be for a newbie.
I guess that if a process is implemented differently from the spec, that should be documented somewhere, and the exposed process definitions should probably also be aligned: https://openeo.vito.be/openeo/1.1/processes
I don’t think it is intentional that this is implemented differently from the spec.
I think it’s even possible that the VITO implementation came before the process spec was fully settled on nan-handling.
Indeed, it’s not intentional. It’s absolutely correct that we don’t want this to happen, which is one of the reasons for investing in a cross-backend test-suite.