GH2701: Fuseki Mod to list and abort running executions. #3184

Draft
wants to merge 2 commits into
base: main
from

Conversation

Aklakan
Contributor

@Aklakan Aklakan commented May 12, 2025

GitHub issue resolved #2701

Pull request Description: ARQ plugin + Fuseki Plugin to track ongoing query executions:

  • Added DatasetGraphWithExecTracker, which wraps another dataset graph and puts an ExecTracker object into its context.
  • Added QueryEngineFactoryExecTracker and UpdateEngineFactoryExecTracker, which handle DatasetGraphWithExecTracker by unwrapping the dataset graph and forwarding the execution request again to the execution factory registries. The obtained execution is then tracked in the ExecTracker.
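
The unwrap-and-forward idea in the two bullets above can be illustrated with a minimal, self-contained sketch. It does not use Jena: `Dataset`, `BaseDataset`, `Tracker`, `TrackedDataset`, `EngineFactory`, `EngineRegistry` and `TrackingEngineFactory` are hypothetical stand-ins for the classes named in this PR, and the choices to store the tracker in the wrapped dataset's context and to start tracking when the execution runs are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Every type here is a hypothetical stand-in; none of them is a Jena class.

/** Stand-in for a dataset graph that exposes a mutable context. */
interface Dataset {
    Map<String, Object> context();
}

final class BaseDataset implements Dataset {
    private final Map<String, Object> context = new ConcurrentHashMap<>();
    @Override public Map<String, Object> context() { return context; }
}

/** Keeps the set of executions that are currently running. */
final class Tracker {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, String> running = new ConcurrentHashMap<>();

    long begin(String label) {
        long id = nextId.incrementAndGet();
        running.put(id, label);
        return id;
    }
    void end(long id) { running.remove(id); }
    Map<Long, String> running() { return Map.copyOf(running); }
}

/** Wrapper that marks a dataset as tracked and stores a Tracker in the context. */
final class TrackedDataset implements Dataset {
    static final String TRACKER_KEY = "tracker"; // assumed key name
    private final Dataset wrapped;

    TrackedDataset(Dataset wrapped) {
        this.wrapped = wrapped;
        wrapped.context().putIfAbsent(TRACKER_KEY, new Tracker());
    }
    Dataset unwrap() { return wrapped; }
    Tracker tracker() { return (Tracker) context().get(TRACKER_KEY); }
    @Override public Map<String, Object> context() { return wrapped.context(); }
}

interface Execution { String run(); }

interface EngineFactory {
    boolean accept(Dataset dataset);
    Execution create(String query, Dataset dataset);
}

/** Registry lookup: the first factory that accepts the dataset wins. */
final class EngineRegistry {
    private final List<EngineFactory> factories = new ArrayList<>();
    void add(EngineFactory factory) { factories.add(factory); }
    Execution create(String query, Dataset dataset) {
        for (EngineFactory factory : factories) {
            if (factory.accept(dataset)) return factory.create(query, dataset);
        }
        throw new IllegalStateException("No factory accepts this dataset");
    }
}

/** Handles only the wrapper: unwrap, ask the registry again, track the result. */
final class TrackingEngineFactory implements EngineFactory {
    private final EngineRegistry registry;
    TrackingEngineFactory(EngineRegistry registry) { this.registry = registry; }

    @Override public boolean accept(Dataset dataset) { return dataset instanceof TrackedDataset; }

    @Override public Execution create(String query, Dataset dataset) {
        TrackedDataset tracked = (TrackedDataset) dataset;
        // Forward the request for the unwrapped dataset to the registry.
        Execution inner = registry.create(query, tracked.unwrap());
        Tracker tracker = tracked.tracker();
        return () -> {
            long id = tracker.begin(query);
            try { return inner.run(); } finally { tracker.end(id); }
        };
    }
}
```

In this sketch the tracking factory never executes anything itself; it only decorates whatever execution the registry returns for the unwrapped dataset, which is why any other engine registered for the underlying dataset type keeps working unchanged.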

It integrates into the existing machinery, but it's not perfect because:

  • the UpdateEngineFactoryExecTracker does not see the original UpdateRequest;
  • every update with a WHERE pattern results in the underlying query execution being tracked;
  • only query executions are tracked: if, e.g., a query fails to parse, it is not tracked because no query execution is created;
  • out-of-band parameters, such as HTTP request parameters (as mentioned in "API call to get a snapshot of running queries" #2701), are not supported yet. I suppose any missing information could be included in the Context of the query execution instances and served alongside the current JSON output.

Still, it is already quite useful.

Perhaps this could also serve as a base for discussion about further improvements and any necessary core changes for Jena 6. Ideally the ExecTracker mechanism would not require a wrapping with DatasetGraphWithExecTracker and instead this would be handled in the core machinery already. In cases where a specific DatasetGraph implementation is expected, the need for a wrapper to track executions may make things complex.

The Fuseki Mod adds a tracker endpoint (/dataset/tracker?command=status) that serves the state of an ExecTracker as JSON. An HTML view that renders this state is also provided. In that view, each running execution has a stop button, and the last N completed queries (N = 100 by default) are shown.

Jena-Query-Dashboard.webm
# New 'tracker' endpoint feature
<#ep-tracker> fuseki:name "tracker" ; fuseki:operation fuseki:tracker .

# Usual fuseki setup
<#ep-update> fuseki:name "update" ; fuseki:operation fuseki:update .
<#ep-query> fuseki:operation fuseki:query .

<#service> rdf:type fuseki:Service ;
    fuseki:name "test" ;
    fuseki:endpoint <#ep-query> , <#ep-update> , <#ep-tracker> ;
    fuseki:dataset <#textDS>
   .
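
With a service named "test" as configured above, the status URL would presumably be /test/tracker?command=status. A minimal sketch of fetching it with the JDK's built-in HTTP client follows. The class `TrackerStatusClient` is hypothetical, and the use of GET and of an `Accept: application/json` header are assumptions. The JSON layout is not described in this PR, so the body is returned as a raw string.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper; not part of Jena or of this PR. */
final class TrackerStatusClient {

    /** Builds {serverBase}/{datasetName}/tracker?command=status. */
    static URI statusUri(String serverBase, String datasetName) {
        String base = serverBase.endsWith("/")
                ? serverBase.substring(0, serverBase.length() - 1)
                : serverBase;
        return URI.create(base + "/" + datasetName + "/tracker?command=status");
    }

    /** Fetches the tracker state; GET and the Accept header are assumptions. */
    static String fetchStatus(HttpClient client, URI uri)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Against a local Fuseki on its default port, a call could look like `TrackerStatusClient.fetchStatus(HttpClient.newHttpClient(), TrackerStatusClient.statusUri("http://localhost:3030", "test"))`.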

  • Tests are included.
  • Documentation change and updates are provided for the Apache Jena website
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@Aklakan Aklakan marked this pull request as draft May 12, 2025 14:12
@Aklakan Aklakan force-pushed the 2025-05-11-exectracker branch from 0f8a131 to a5ca301 Compare May 12, 2025 15:47
@rvesse
Member

rvesse commented May 13, 2025

Really interesting piece of work. I once did something much cruder (at least UI-wise) in a previous $dayjob.

Perhaps this could also serve as a base for discussion about further improvements and any necessary core changes for Jena 6. Ideally the ExecTracker mechanism would not require a wrapping with DatasetGraphWithExecTracker and instead this would be handled in the core machinery already. In cases where a specific DatasetGraph implementation is expected, the need for a wrapper to track executions may make things complex.

Yes I think this would be much cleaner if the tracking mechanism was integrated directly into the execution machinery without requiring extra wrapping as you do in this PR.

It would be nice if there were programmatic APIs for interacting with tracked queries/updates (there's some pieces towards that here but appears mostly focused on exposing stuff to the UI from my skim-reading of the code) so that applications that embed Jena could access and manage tracked queries/updates as desired.

Fuseki already has the concept of Tasks, which is used for things like backups and compactions. Would it make sense to integrate query/update tracking into that rather than creating a separate tracking mechanism? That might need generalising that mechanism, or pulling it more into Jena's core rather than Fuseki machinery, so it might not be worth the effort. Wdyt?

@Aklakan
Contributor Author

Aklakan commented May 13, 2025

It would be nice if there were programmatic APIs for interacting with tracked queries/updates (there's some pieces towards that here but appears mostly focused on exposing stuff to the UI from my skim-reading of the code) so that applications that embed Jena could access and manage tracked queries/updates as desired.

The ARQ plugin adds ExecTracker, which so far would be the programmatic API. (Still a bit crude but intended as a starting point towards this goal.)
The DatasetGraphWithExecTracker wrapping is needed so that the query/update engines add the tracking when they process requests.
So by integrating the tracking closer to the core, the DatasetGraphWithExecTracker would only be needed for adding tracking to alternative query engines, but the main query/update engines could interact with the tracker directly.
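
To make the discussion of a programmatic API more concrete, here is a rough sketch of the kind of surface an embedding application might want: list running executions, abort one by id, and read a bounded history of completed ones. `SimpleExecTracker` and its methods are hypothetical and are not the ExecTracker class from this PR. Abort is modelled as a cooperative flag that the execution would have to poll.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical tracker; not the ExecTracker class from this PR. */
final class SimpleExecTracker {

    /** One tracked execution: an id, a label, and a cooperative abort flag. */
    static final class Entry {
        final long id;
        final String label;
        private final AtomicBoolean aborted = new AtomicBoolean();

        Entry(long id, String label) {
            this.id = id;
            this.label = label;
        }
        boolean isAborted() { return aborted.get(); }
    }

    private final int maxHistory;
    private long nextId = 0;
    private final Map<Long, Entry> running = new LinkedHashMap<>();
    private final Deque<Entry> history = new ArrayDeque<>();

    SimpleExecTracker(int maxHistory) { this.maxHistory = maxHistory; }

    /** Registers a new running execution and returns its entry. */
    synchronized Entry start(String label) {
        Entry entry = new Entry(++nextId, label);
        running.put(entry.id, entry);
        return entry;
    }

    /** Requests an abort; returns false if no such execution is running. */
    synchronized boolean abort(long id) {
        Entry entry = running.get(id);
        if (entry == null) return false;
        entry.aborted.set(true);
        return true;
    }

    /** Moves an execution to the history, keeping only the newest maxHistory entries. */
    synchronized void finish(Entry entry) {
        if (running.remove(entry.id) == null) return;
        history.addFirst(entry);
        while (history.size() > maxHistory) history.removeLast();
    }

    /** Snapshot of running executions in start order. */
    synchronized List<Entry> running() { return List.copyOf(running.values()); }

    /** Snapshot of completed executions, most recent first. */
    synchronized List<Entry> history() { return List.copyOf(history); }
}
```

Both accessors return immutable snapshots, so a caller such as an HTTP endpoint can serialise them without holding the tracker's lock.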

An important question is whether tracking executions within the DatasetGraph's context is the way forward.

Fuseki already has the concept of Tasks [...] pulling it more into Jena's core [...]

Yes, I still need to look into how much effort it would take to disentangle Fuseki's task tracker from Fuseki, but adding such a mechanism to core (and updating Fuseki for it) would most likely be the way to go.

@Aklakan Aklakan force-pushed the 2025-05-11-exectracker branch 2 times, most recently from 92be0d4 to 28ccc1b Compare May 14, 2025 19:36
@Aklakan Aklakan force-pushed the 2025-05-11-exectracker branch from 28ccc1b to 603c8e7 Compare May 14, 2025 19:37
Development

Successfully merging this pull request may close these issues.

API call to get a snapshot of running queries
2 participants