I am currently using the FuseTS package within openEO to create gap-filled, “sensor-fused” maps. Is there a way to bring local xarray/netCDF4 datasets into openEO, and if so, how could that be done?
The FuseTS toolbox uses sensor fusion to create gap-filled maps. My objective is to combine e.g. Copernicus LAI-300m data (from VITO) with my own netCDF4/xarray datasets to obtain such gap-filled maps. I would highly appreciate any ideas that point me towards a solution for bringing local data into openEO processes.
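For context, loading the Copernicus LAI side through openEO would look roughly like the sketch below. The collection ID is only a placeholder; the exact name depends on the backend's catalogue:

import openeo

connection = openeo.connect("https://openeo.cloud").authenticate_oidc()

# Placeholder collection ID: check connection.list_collection_ids() for the exact
# name of the Copernicus Global Land LAI 300m collection on your backend
lai_cube = connection.load_collection(
    "CGLS_LAI300_V1_GLOBAL",
    spatial_extent={"west": -3.0, "south": 39.0, "east": -2.0, "north": 40.0},  # example extent
    temporal_extent=["2019-01-01", "2019-12-31"],
)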
Thanks for your response! I would like to upload my data to Terrascope and access it via openEO; to me that looks like the most feasible option.
I have used the Terrascope VM before; however, I would need some help with uploading my data to the Terrascope VM so that I can access it via openEO.
This is probably the folder you have write access to: /data/users/Public/david.kovacs/. I saw some files in it and made an example script that loads data from here:
import openeo

connection = openeo.connect("https://openeo.cloud").authenticate_oidc()

spatial_extent = {  # Johannesburg
    "west": 27,
    "south": -27,
    "east": 30,
    "north": -26,
}

# load_disk_collection is legacy. Use load_stac when metadata is available.
datacube = connection.load_disk_collection(
    format="GTiff",
    glob_pattern="/data/users/Public/david.kovacs/tifs_david/LAI/*.tif",
    options=dict(date_regex=r".*(\d{4})(\d{2})(\d{2}).tif"),
)
datacube = datacube.filter_bbox(spatial_extent)
datacube = datacube.filter_temporal("2019-01-01", "2019-12-31")
datacube.download("david_k_LAI.tif")
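As noted in the comment above, load_disk_collection is legacy. If STAC metadata is (or gets) created for those GeoTIFFs, the same cube could also be built with load_stac; a minimal sketch with a placeholder catalogue URL:

# Sketch only: the URL is a placeholder and must point to a STAC collection/item
# describing the GeoTIFFs (footprints, datetimes, bands); this reuses the
# connection and spatial_extent from the example above.
datacube = connection.load_stac(
    "https://example.com/stac/lai-collection.json",
    spatial_extent=spatial_extent,
    temporal_extent=["2019-01-01", "2019-12-31"],
)
datacube.download("david_k_LAI_stac.tif")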
Actually, I want to process time series with the inherent temporal metadata, so I uploaded a “.nc” file to the VM. What is the function to access it? I looked at load_stac and load_disk_collection; however, neither lists netCDF as a supported format.
When I modify the code you provided to use format="netCDF", it gives me the following error message:
OpenEoApiError: [500] Internal: Server error: NotImplementedError('The format is not supported by the backend: netCDF') (ref: r-24021597483244358da73b11a3d0ec3e)
I would recommend splitting up the netCDF into multiple GeoTIFFs.
netCDF is not supported by load_disk_collection for the moment.
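To see which formats the backend advertises, you can query it directly; note that this reflects the general input/output format support, not specifically load_disk_collection. A small sketch:

import openeo

connection = openeo.connect("https://openeo.cloud").authenticate_oidc()

# Dict with the "input" and "output" file formats the backend reports
formats = connection.list_file_formats()
print(sorted(formats.get("input", {}).keys()))
print(sorted(formats.get("output", {}).keys()))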
This script worked for that on my machine:
#!/bin/bash
cd /data/users/Public/david.kovacs/tifs_david/S3GPR/ || exit
infile=spain.nc
band=1
# sudo apt-get install -y cdo
for idate in $(cdo showdate $infile); do
    date="${idate:0:10}"
    # filter out non-date entries:
    if [[ ${date} =~ [0-9]{4}-[0-9]{2}-[0-9]{2} ]]; then
        echo "date: $date"
        y="${idate:0:4}"
        m="${idate:5:2}"
        d="${idate:8:2}"
        mkdir -p tiff_collection/$y/$m/$d
        # apt-get install -y build-essential proj-bin proj-data gdal-bin gdal-data libgdal-dev
        gdal_translate -co COMPRESS=DEFLATE -unscale -a_srs EPSG:32630 -ot Float32 NETCDF:$infile -b $band "tiff_collection/$y/$m/$d/${date}_S3GPR.tif"
        ((band++))
    fi
done
echo "All done"
Is there a way to access the netCDFs directly? I am not familiar with bash; also, I can store my data as netCDFs, and it would be considerably easier to access them with their temporal metadata already attached. I will be processing several netCDFs, and this approach would require “slicing” each of them into GeoTIFFs every time, which I would like to avoid if possible.
Thank you very much for your help. It works perfectly!
By the way, I am not able to delete the folders/files that the script creates. They are owned by root:root and I have no permission to delete them.
With your help I managed to create some really nice maps! Thanks for all the help.
Right now I am using exactly the same code with which I successfully processed areas a few weeks ago, but for some reason there is an issue when I try to run the MOGPR batch job:
I’ve got the following job IDs with massive error outputs that I cannot make sense of. Please help me with this:
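In case it helps to reproduce, the error logs can be pulled per job like this (the job ID below is a placeholder; substitute the failing IDs):

import openeo

connection = openeo.connect("https://openeo.cloud").authenticate_oidc()

# Placeholder job ID: substitute one of the failing batch job IDs
job = connection.job("j-xxxxxxxxxxxxxxxxxxxxxxxx")
for entry in job.logs():
    if entry.get("level") == "error":
        print(entry.get("id"), entry.get("message"))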