Add support for multiple semantic vocabularies on a single dataset #811

@diegoalexdiaz

Description

Users who want to capture semantics for the entities and tiles in a dataset may want to use more than one vocabulary. That choice is heavily influenced by the original source format of the data.

Specifically, in the AEC industry, a user may want to use one or more vocabularies based on the data models, taxonomies, or ontologies used in their projects, e.g.:

  • BIO (BIS' Ontology)
  • IFC (Industry Foundation Classes)
  • OmniClass
  • UniClass
  • A custom taxonomy adopted by a particular organization
  • etc.

The request here is to ensure that a user can define metadata semantics in terms of more than one vocabulary. To achieve that, vocabularies in use by a dataset need to be differentiated by prefix.

Prefixes could be assigned by users if the terms of a vocabulary don't already have one.

Terms in ontologies like BFO or BIO are made unique by assigning them Uniform Resource Identifiers (URIs), e.g.

Ontology prefix name for the BFO ontology:
bfo:

Namespace URI that the bfo: prefix abbreviates:
http://purl.obolibrary.org/obo/

URI for "material entity" in BFO:
http://purl.obolibrary.org/obo/BFO_0000040

... which could be abbreviated in a semantic key as:
bfo:BFO_0000040

Note that other ontologies use namespace URIs that end in a different character, e.g. '#' for RDF:
http://www.w3.org/1999/02/22-rdf-syntax-ns#
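
To make the abbreviation concrete, here is a minimal Python sketch of expanding a prefixed semantic key into its full URI. The `PREFIXES` table and the `expand_key` helper are hypothetical names used only for illustration; they are not part of 3D Tiles or of any existing library. The two namespace URIs are the ones quoted above.

```python
# Hypothetical prefix table: prefix name -> namespace URI.
PREFIXES = {
    "bfo": "http://purl.obolibrary.org/obo/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand_key(key: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:term' into a full URI by plain concatenation."""
    prefix, sep, term = key.partition(":")
    if not sep or not term:
        raise ValueError(f"not a prefixed key: {key!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    # The namespace already ends in '/' or '#', so no separator is added.
    return prefixes[prefix] + term


print(expand_key("bfo:BFO_0000040"))
print(expand_key("rdf:type"))
```

Because the expansion is a plain concatenation, the trailing character of the namespace ('/' for BFO, '#' for RDF) has to be part of the declared prefix value.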

In summary, users should be able to declare the list of vocabularies, and their unique prefixes, for a dataset. They can then use those prefixes with terms from a particular vocabulary when capturing metadata semantics on specific tiles and objects.
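
As a rough sketch of that workflow, the Python snippet below checks a dataset-level declaration against the semantic keys actually used. Everything here is an assumption made for illustration: the declaration is modelled as a list of (prefix, namespace) pairs, the `check_vocabulary_usage` helper is hypothetical, the `https://example.org/ifc/` namespace is a placeholder, and keys without a ':' are assumed to be unprefixed semantics that need no declaration.

```python
from collections import Counter


def check_vocabulary_usage(declarations, semantic_keys):
    """Return (duplicate_prefixes, keys_with_undeclared_prefix)."""
    counts = Counter(prefix for prefix, _namespace in declarations)
    # A prefix declared more than once is ambiguous.
    duplicates = sorted(p for p, n in counts.items() if n > 1)
    undeclared = []
    for key in semantic_keys:
        prefix, sep, _term = key.partition(":")
        # Keys without ':' are assumed to be unprefixed semantics.
        if sep and prefix not in counts:
            undeclared.append(key)
    return duplicates, undeclared


# Hypothetical dataset-level declaration and collected semantic keys.
declarations = [
    ("bfo", "http://purl.obolibrary.org/obo/"),
    ("ifc", "https://example.org/ifc/"),  # placeholder namespace
]
keys = ["bfo:BFO_0000040", "ifc:IfcWall", "custom:Wall", "TILE_BOUNDING_BOX"]

duplicates, undeclared = check_vocabulary_usage(declarations, keys)
print(duplicates, undeclared)
```

Here `custom:Wall` would be reported, because no vocabulary with the prefix `custom` was declared for the dataset.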

Activity

javagl commented on Jun 10, 2025

Contributor

To fully understand this request, I'll have to read more about the links and context, particularly some domain-specific aspects. But in a broader sense, this seems to be related to #643.

And it is indeed important to disambiguate semantics on a strict, technical level. If I understood correctly, this is what you meant by 'vocabulary' and by the suggestion to define prefixes for the semantics.

The suggestion in the linked issue may already address some of these goals, namely, to offer an option to define a specific set of semantics for one particular use case. And I think that it is important to define these semantics in a machine-processable form, while keeping in mind some normative aspects of these definitions (i.e. disambiguation or "namespacing").

The idea there was to define the semantics with a metadata schema. At first glance, this may seem strange: the metadata schema can contain semantics, and the semantics are supposed to be defined with a metadata schema. But the advantage would be that there already is a clear specification for this, and the information that is contained in a metadata schema is exactly the information that is required for defining semantics.

The concept of different "vocabularies" could then be addressed on two different levels:

  1. With different schemas
  2. With different classes in a schema
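
To illustrate the two levels, here is a hedged Python sketch. The dictionaries loosely mirror the shape of a 3D Tiles metadata schema (an `id`, and `classes` with `properties`), with each property name standing for one semantic. The concrete schema ids, class names, property names, and the `prefix:NAME` qualification rule are assumptions made for illustration, not something the specification or the validator defines.

```python
# Level 1: one schema per vocabulary (the schema id acts as the prefix).
ifc_schema = {
    "id": "ifc",
    "classes": {"Elements": {"properties": {"WALL": {"type": "STRING"}}}},
}
acme_schema = {
    "id": "acme",
    "classes": {"Elements": {"properties": {"WALL": {"type": "STRING"}}}},
}

# Level 2: one schema, with one class per vocabulary (class name as prefix).
combined_schema = {
    "id": "projectSemantics",
    "classes": {
        "ifc": {"properties": {"WALL": {"type": "STRING"}}},
        "acme": {"properties": {"WALL": {"type": "STRING"}}},
    },
}


def semantics_by_schema(schemas):
    """Qualify each semantic name with the id of the schema defining it."""
    names = set()
    for schema in schemas:
        for cls in schema["classes"].values():
            for prop in cls.get("properties", {}):
                names.add(f"{schema['id']}:{prop}")
    return names


def semantics_by_class(schema):
    """Qualify each semantic name with the name of the class defining it."""
    names = set()
    for class_name, cls in schema["classes"].items():
        for prop in cls.get("properties", {}):
            names.add(f"{class_name}:{prop}")
    return names


print(sorted(semantics_by_schema([ifc_schema, acme_schema])))
print(sorted(semantics_by_class(combined_schema)))
```

In both layouts the two `WALL` semantics stay distinguishable (`ifc:WALL` and `acme:WALL`); the difference is only whether the disambiguating name lives on the schema or on the class.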

In fact, we already picked up some of these ideas. The 3d-tiles-validator already offers the option to pass in user-defined semantics in the form of a metadata schema.

We have not yet addressed this on the level of the specification. On this level, we should extend the section about metadata semantics with the appropriate (technical, normative) details. (This is, to some extent, tracked in #574.)

Further steps would be to actually establish some technical infrastructure here. For example, the existing semantics should be stored as actual JSON files. (These files would just be what is currently stored/inlined in the validator.) And we should consider creating some sort of "repository" of user-defined semantics, where users can submit their JSON files with semantics definitions, via pull requests, to make them available to other users as well.

diegoalexdiaz commented on Jun 10, 2025

Author

Thanks for your reply.

Re: what you referred to with the 'vocabulary'

I believe we're on the same page. Vocabulary as in a list of terms (typically defined as a Taxonomy, but could be a full Ontology - with relationships among them) that bring agreed-upon meaning to one or more data-models (i.e. schemas). Thus, it enables the decoupling of data-meaning (typically in more abstract terms) from how data is concretely laid out in terms of a particular data-modeling technology (e.g. like the schema-related constructs provided by 3D Tiles - schemas, classes and properties of particular data-types).

Add support for multiple semantic vocabularies on a single dataset · Issue #811 · CesiumGS/3d-tiles