Customized openeo-editor deployment

stefan.achtsnit · 25 February 2022 13:27

In the context of our openEO<->EuroDataCube evolutions we plan to extend the openeo-editor so that it not only supports a local download but also offers an option in the frontend to sync saved results from the openEO backend to the EDC user workspace.

(Side-note: at the current level we want to do that from outside and not with an additional openEO backend process similar to save_result!)

We’ll therefore create a separate deployment of the openeo-editor on something like https://openeo-editor.eurodatacube.com, but what do you suggest regarding the oauth2 configuration <-> redirect_urls? In our understanding this has to be changed on all backends (either in the existing client or with a new client).

stefaan.lippens · 25 February 2022 13:39

No, it is not necessary to change things on all back-ends. You can create a dedicated OIDC client config for your deployment (e.g. with all the desired redirect_urls) and use that.
The back-ends just expect a valid access token from the user; which OIDC client was used to obtain it is not relevant.

Note that back-ends indeed can specify default OIDC clients, but that is mostly for Python/R scripting use cases, where it is infeasible for the end user to set up their own OIDC clients.
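To illustrate the point above, here is a minimal sketch of how a client could attach an OIDC access token to a back-end request. It assumes the openEO API 1.0 bearer format `oidc/<provider_id>/<access_token>`; the back-end URL, the provider ID `egi` and the token are placeholders, and `build_authenticated_request` is a hypothetical helper, not part of any openEO library. Note that the client ID never appears in the request.

```python
import urllib.request


def build_authenticated_request(
    backend_url: str, provider_id: str, access_token: str
) -> urllib.request.Request:
    """Prepare a GET /me request that carries an OIDC access token.

    The token is sent as "oidc/<provider_id>/<access_token>". The OIDC client
    that was used to obtain the token is not part of the request at all.
    """
    request = urllib.request.Request(backend_url.rstrip("/") + "/me")
    request.add_header(
        "Authorization", f"Bearer oidc/{provider_id}/{access_token}"
    )
    return request


# Placeholder values for illustration only (assumptions, not real endpoints).
req = build_authenticated_request(
    "https://openeo.example.org/openeo/1.0", "egi", "PLACEHOLDER_TOKEN"
)
print(req.full_url)
print(req.get_header("Authorization"))
```

The request is only built here, not sent, so the sketch can be tried without credentials.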

stefan.achtsnit · 25 February 2022 13:51

Thank you for your fast response, but I’m not sure I can follow. Where do we need to add this config? In my opinion we can only whitelist existing oauth2 clients in the openeo-editor deployment.

My understanding would be that

https://openeocloud.vito.be/openeo/1.0.0/credentials/oidc
https://openeo.vito.be/openeo/credentials/oidc
https://openeo.eodc.eu/v1.0/credentials/oidc
upcoming ones like the Sentinel-Hub backend
…
need to have these clients + redirect-urls registered and we can then whitelist the client-id for the respective provider?

But perhaps I’m on the wrong track here.

m.mohr · 28 February 2022 12:58

You can configure your own EGI client ID in the Web Editor, which you can then use for authentication. The back-ends just expect an access token from EGI and it’s irrelevant which client ID was used to generate it.

I like the idea of adding interfaces to share results and I’m wondering whether we could make that more pluggable in the Web Editor directly so that you don’t run into issues once the Web Editor gets updates. Would you want to collaborate on that? Maybe set up a call?

stefan.achtsnit · 28 February 2022 19:05

Thank you Stefaan and Matthias, I think I got it now. We register another client on EGI for our deployment (with our redirect url). To ensure that the openeo backend(s) continue to work as they are, we only need to make sure that the “eduperson_entitlement” claim is provided on the userinfo endpoint when the openeo backend calls it with the access_token that the token endpoint returned to our client. This is something we will clarify with the people from EGI.

Happy to discuss how we can best contribute to the Web Editor. @daniel.thiex organized a bi-weekly openEO<->EDC call on Thu 10am (it will happen this week) where our dev lead will also participate. If that is not possible, we can coordinate a separate call via mail.

daniel.thiex · 1 March 2022 11:37

@stefan.achtsnit If I am not mistaken we agreed on 10:30 am as the EDC meeting starting at 9 sometimes exceeds 1 hour.

m.mohr · 1 March 2022 13:31

I could join Thu @ 10:30. Could someone invite me, please?

daniel.thiex · 2 March 2022 15:12

@m.mohr Great! I just sent you an e-mail including the meeting link.

stefan.achtsnit · 14 March 2022 11:55

@m.mohr
Following our previous discussion we are working on functionality to extend the EDC workspace environment to materialize a STAC item via a specific url, i.e.

given EDC url containing a query parameter with reference url pointing to a specific STAC item (or catalog/collection) to open EDC JupyterLab workspace
auth handshake if user not logged in on EDC already
ask user within JupyterLab for file location in EDC workspace to put this STAC item
copy STAC item to EDC workspace
(optional) materialize referenced assets of STAC item in EDC workspace, perhaps convenience function in future, at the moment just use python code with e.g. pystac
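The first step in the list above could be sketched as follows. The query parameter name `stac_url` and the EDC entry URL are assumptions made for illustration only; the thread does not specify them, and `extract_stac_reference` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical parameter name; the actual name is not defined in this thread.
STAC_PARAM = "stac_url"


def extract_stac_reference(edc_url: str) -> Optional[str]:
    """Return the STAC reference URL carried in an EDC link, or None."""
    query = parse_qs(urlparse(edc_url).query)
    values = query.get(STAC_PARAM)
    if not values:
        return None
    reference = values[0]
    # Accept only http(s) references so the workspace never opens local files.
    if urlparse(reference).scheme not in ("http", "https"):
        return None
    return reference


# Build an example link with placeholder hosts and resolve it again.
stac_reference = "https://backend.example.org/jobs/abc/results"
example_link = "https://edc.example.org/workspace?" + urlencode(
    {STAC_PARAM: stac_reference}
)
print(extract_stac_reference(example_link))
```

Restricting the scheme is a design choice of this sketch; the remaining steps (auth handshake, asking for a file location, copying) depend on the JupyterLab integration and are not shown.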

So if the results of an openEO process graph are described via STAC, we need a way to download or url-reference this STAC item and can then use the above mechanism to connect openEO processing to further steps (e.g. exploratory analysis within JupyterLab). Can you help us understand what already exists in this regard, how this can be achieved, and where contributions are needed?

The inclusion in the openEO-web-editor should then be straightforward and can be added in a quite generic way.

What is also not clear to us is how the assets within this STAC item will be referenced. You mentioned a signed url mechanism. Is this something you see on the openEO backend itself, or should the results (= assets) be generated during processing, or staged immediately afterwards, to e.g. an S3 bucket? Regarding the signature and especially signed url validity/expiration: at which point will the signed url be generated? Dynamically, when the describing STAC item is generated? Similar question as above: what is already in place here?

Thanks!

@stephan @bernhard.mallinger

m.mohr · 16 March 2022 12:21

@stefan.achtsnit Ideally, you’d implement it in a way that you can pass any STAC entity and then just completely (deep) copy it over with all assets. Then you have a pretty much openEO-agnostic implementation for everyone. I think pystac would allow you to do this.
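A rough sketch of such a deep copy for a single STAC Item, using only the Python standard library so it stays self-contained (pystac may well be the better tool in practice). The function names, the `item.json` file name and the flat target layout are assumptions; Catalogs and Collections would additionally require following their links, which is not shown.

```python
import copy
import json
import posixpath
from pathlib import Path
from typing import Callable
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_bytes(url: str) -> bytes:
    """Download a URL; signed asset URLs need no extra authentication."""
    with urlopen(url) as response:
        return response.read()


def copy_item_with_assets(
    item: dict, target_dir: Path, fetch: Callable[[str], bytes] = fetch_bytes
) -> Path:
    """Copy a STAC Item dict and all of its assets into target_dir.

    Asset hrefs in the stored copy are rewritten to relative local paths.
    Assets that share a file name would overwrite each other in this sketch.
    """
    local_item = copy.deepcopy(item)  # leave the caller's dict untouched
    target_dir.mkdir(parents=True, exist_ok=True)
    for key, asset in local_item.get("assets", {}).items():
        # Use the URL path for the file name so signature query strings drop out.
        name = posixpath.basename(urlparse(asset["href"]).path) or key
        (target_dir / name).write_bytes(fetch(asset["href"]))
        asset["href"] = "./" + name
    item_path = target_dir / "item.json"
    item_path.write_text(json.dumps(local_item, indent=2))
    return item_path
```

The `fetch` parameter is injectable so the copy logic can be exercised without network access, for example with `fetch=lambda url: b"data"`.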

I’ve asked our backends to check whether they always support providing a signed URL. That does not seem to be the case yet, but it’s on the agenda for sure. So I’ll implement the Web Editor interface in a way that sharing is only available if a signed URL is available. Work on the Web Editor interface has started.

The STAC JSON will contain signed URLs to all assets so you can access them without authentication at the back-end. Back-ends can choose different expiry times for the signed URLs. These specifics need to be answered by the back-ends themselves. @stefaan.lippens @sean.hoyal

Did this answer all questions?

stefan.achtsnit · 17 March 2022 07:59

Thank you Matthias, it helped, but I still have some trouble understanding the overall result handling concept.

The save_result process implies that the openEO backend not only acts as “processing engine” but also as “storage engine”. This may hold true for some setups, but I would assume a setup with a pure openEO processing backend where the final save_result (= persist = materialize) step leverages the underlying platform for storage. Will discuss with @daniel.thiex today too.

stefan.achtsnit · 17 March 2022 12:44

@m.mohr coming back to the original issue and goal with the customized openeo-editor: we would propose an additional button (e.g. “share”) for the individual successfully finished jobs in the editor, which will do an HTTP POST to a configurable url with the STAC JSON returned from the /jobs/{job_id}/results endpoint.

If no url is configured then the button is not shown.
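The proposed behaviour could look roughly like this. The sketch is in Python only to show the shape of the request, not actual editor code; `build_share_request` and the JSON content type are assumptions, and the share URL is a placeholder.

```python
import json
import urllib.request
from typing import Optional


def build_share_request(
    stac_document: dict, share_url: Optional[str]
) -> Optional[urllib.request.Request]:
    """Build the POST that a "share" button would send.

    Returns None when no share URL is configured, which corresponds to the
    button not being shown at all.
    """
    if not share_url:
        return None
    body = json.dumps(stac_document).encode("utf-8")
    return urllib.request.Request(
        share_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder STAC document and target URL for illustration only.
share_request = build_share_request(
    {"type": "Feature", "id": "job-result"}, "https://edc.example.org/share"
)
print(share_request.get_method(), share_request.full_url)
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen`, which is left out here so the sketch runs without a live endpoint.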

What do you think? And if you agree, is this something you plan to implement yourself or shall we come up with a pull request?

m.mohr · 17 March 2022 13:00

As discussed, I’m preparing an interface within the Web Editor which you can later implement against. The draft (WIP) is available here: Pluggable interface for sharing by m-mohr · Pull Request #242 · Open-EO/openeo-web-editor · GitHub

m.mohr · 17 March 2022 13:12

@stefan.achtsnit What’s your timeline on this? By when would you need this to be available?

stefan.achtsnit · 17 March 2022 14:52

it would demo nicely in the ESA review meeting in ~3 weeks but I would consider the final project meeting before summer as deadline to show deeper integration between the projects

m.mohr · 18 March 2022 10:20

Okay, the final project meeting will work from my side, but I can’t promise to have it ready for the review meeting.

stefaan.lippens · 24 March 2022 13:09

The VITO backend uses an expiry of 7 days at the moment.
But note that you will get a fresh signed URL (with updated expiry) each time you request the batch job results with GET /jobs/{job_id}/results
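Given that behaviour, a consumer that stores the results document could decide when to request it again along these lines. This is a sketch under assumptions: the 7-day lifetime is the VITO value mentioned above and may differ per back-end, and the one-hour safety margin and the function name are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def should_refetch_results(
    fetched_at: datetime,
    now: datetime,
    lifetime: timedelta = timedelta(days=7),
    safety_margin: timedelta = timedelta(hours=1),
) -> bool:
    """Return True when stored signed URLs are about to expire or have expired.

    A True result means GET /jobs/{job_id}/results should be called again to
    obtain fresh signed URLs.
    """
    return now >= fetched_at + lifetime - safety_margin


# Example timestamps: results fetched on 24 March 2022, checked later.
fetched = datetime(2022, 3, 24, 13, 0, tzinfo=timezone.utc)
print(should_refetch_results(fetched, fetched + timedelta(days=2)))
print(should_refetch_results(fetched, fetched + timedelta(days=7)))
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check easy to test.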