Description
Users who want to capture semantics for the entities and tiles in a dataset may want to do so using more than one vocabulary. That decision is strongly influenced by the original source format of the data.
Specifically in the AEC industry, a user may be interested in using one or more vocabularies based on the data models, taxonomies, or ontologies used in their projects, e.g.:
- BIO (BIS' Ontology)
- IFC (Industry Foundation Classes)
- OmniClass
- UniClass
- A custom taxonomy adopted by a particular organization
- etc.
The request here is to ensure that a user can define metadata semantics in terms of more than one vocabulary. To achieve that, vocabularies in use by a dataset need to be differentiated by prefix.
Prefixes could be assigned by users if the terms of a vocabulary don't already have one.
Terms in ontologies like BFO or BIO are made unique by assigning them Uniform Resource Identifiers (URIs), e.g.:
Ontology prefix name for the BFO ontology:
bfo:
Ontology prefix for the BFO ontology:
http://purl.obolibrary.org/obo/
URI for "material entity" in BFO:
http://purl.obolibrary.org/obo/BFO_0000040
... which could be abbreviated in a semantic key as:
bfo:BFO_0000040
Note that other ontologies use prefixes that end with additional characters (here, a trailing "#"),
e.g. for RDF:
http://www.w3.org/1999/02/22-rdf-syntax-ns#
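The abbreviation described above amounts to a prefix lookup plus string concatenation. A minimal Python sketch of that idea follows; the `PREFIXES` table and the helper names `expand` and `abbreviate` are hypothetical and not part of 3D Tiles. Only the two namespace URIs are taken from this issue.

```python
# Hypothetical prefix table; the two namespace URIs are the ones quoted above.
PREFIXES = {
    "bfo": "http://purl.obolibrary.org/obo/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(key: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand a prefixed key such as 'bfo:BFO_0000040' into a full URI."""
    prefix, sep, term = key.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {key!r}")
    return prefixes[prefix] + term


def abbreviate(uri: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Abbreviate a full URI using the longest matching namespace."""
    matches = [(ns, p) for p, ns in prefixes.items() if uri.startswith(ns)]
    if not matches:
        raise ValueError(f"no declared prefix matches {uri!r}")
    ns, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(ns):]}"


print(expand("bfo:BFO_0000040"))
print(abbreviate("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"))
```

Because the namespace is simply prepended to the term, it does not matter whether it ends in "/" or "#", as long as the declared namespace string includes that final character.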
In summary, users should be able to declare the list of vocabularies, and their unique prefixes, for a dataset, and then use those prefixes with terms from a particular vocabulary when capturing metadata semantics on specific tiles and objects.
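To make the request more concrete, here is a rough Python sketch of how such a declaration could be checked against a schema. The `vocabularies` mapping is hypothetical and not part of the 3D Tiles specification. The "example.org" namespaces, the class and property names, and the "acme:" and "omni:" terms are placeholders. The sketch also assumes that semantics without a prefix are built-in ones and should pass unchanged.

```python
# Hypothetical per-dataset declaration: prefix -> namespace URI.
# The "example.org" namespaces are placeholders, not real vocabulary URIs.
vocabularies = {
    "bfo": "http://purl.obolibrary.org/obo/",
    "ifc": "https://example.org/ifc#",
    "acme": "https://example.org/acme-taxonomy/",
}

# A dict in the shape of a metadata schema (classes -> properties -> semantic).
# Class names, property names, and the "acme"/"omni" terms are made up.
schema = {
    "classes": {
        "building_element": {
            "properties": {
                "material": {"type": "STRING", "semantic": "bfo:BFO_0000040"},
                "category": {"type": "STRING", "semantic": "acme:ExampleTerm"},
                "legacy": {"type": "STRING", "semantic": "omni:ExampleTerm"},
                "name": {"type": "STRING", "semantic": "NAME"},
            }
        }
    }
}


def undeclared_semantics(schema: dict, vocabularies: dict[str, str]) -> list[str]:
    """Return 'class.property' paths whose semantic uses an undeclared prefix."""
    problems = []
    for class_name, cls in schema.get("classes", {}).items():
        for prop_name, prop in cls.get("properties", {}).items():
            semantic = prop.get("semantic")
            if semantic is None:
                continue  # property carries no semantic at all
            prefix, sep, _term = semantic.partition(":")
            if not sep:
                continue  # unprefixed: assumed to be a built-in semantic
            if prefix not in vocabularies:
                problems.append(f"{class_name}.{prop_name}")
    return problems


print(undeclared_semantics(schema, vocabularies))
```

In this sketch only the "legacy" property is reported, because "omni" was never declared for the dataset.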
Activity
javagl commented on Jun 10, 2025
This request seems to refer to a use case where I'll have to read more about the links and context to better understand some domain-specific aspects. But in a broader sense, this seems to be related to #643.
And it is indeed important to disambiguate semantics, on a strict, technical level. If I understood that correctly, then this is what you referred to with the 'vocabulary', and the suggestion to define prefixes for the semantics.
The suggestion in the linked issue may already address some of these goals, namely to offer an option to define a specific set of semantics for one particular use case. And I think that it is important to define these semantics in a machine-processable form, and to keep in mind some normative aspects of these definitions (i.e. disambiguation or "namespacing").
The idea there was to define the semantics with a metadata schema. At first glance, this may seem strange: The metadata schema can contain semantics. And the semantics are supposed to be defined with a metadata schema. But the advantage would be that there already is a clear specification for this, and the information that is contained in a metadata schema is exactly the information that is required for defining semantics.
The concept of different "vocabularies" could then be addressed on two different levels:
- In fact, we already picked up some of these ideas. The 3d-tiles-validator already offers the option to pass in user-defined semantics in the form of a metadata schema.
- We have not yet addressed this on the level of the specification. On this level, we should extend the section about metadata semantics with the appropriate (technical, normative) details. (This is, to some extent, tracked in #574.)
Further steps would be to actually establish some technical infrastructure here. For example, the existing semantics should be stored as actual JSON files. (These files would just be what is currently stored/inlined in the validator.) And we should consider creating some sort of "repository" of user-defined semantics, where users can submit their JSON files with semantics definitions, via pull requests, to make them available for other users as well.
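As a rough illustration of semantics stored as JSON in the shape of a metadata schema, consider the Python sketch below. It assumes that each property name in the file is the name of one semantic. That layout, the `EXAMPLE_*` names, and both helper functions are assumptions made for illustration. The format that the validator actually accepts may differ.

```python
import json

# Hypothetical semantics definition, written in the shape of a metadata schema:
# each property name is treated as the name of one semantic.
SEMANTICS_JSON = """
{
  "id": "ExampleSemantics",
  "classes": {
    "ExampleVocabulary": {
      "properties": {
        "EXAMPLE_HEIGHT": {"type": "SCALAR", "componentType": "FLOAT32"},
        "EXAMPLE_LABEL": {"type": "STRING"}
      }
    }
  }
}
"""


def load_semantics(text: str) -> dict[str, dict]:
    """Flatten a semantics definition into {semantic name: property definition}."""
    definition = json.loads(text)
    semantics = {}
    for cls in definition.get("classes", {}).values():
        semantics.update(cls.get("properties", {}))
    return semantics


def matches_semantic(prop: dict, semantics: dict[str, dict]) -> bool:
    """Check that a property agrees with the definition of the semantic it claims."""
    expected = semantics.get(prop.get("semantic"))
    if expected is None:
        return False  # the semantic is not defined in this file
    # Every field of the semantic's definition must be matched by the property.
    return all(prop.get(key) == value for key, value in expected.items())


semantics = load_semantics(SEMANTICS_JSON)
good = {"type": "STRING", "semantic": "EXAMPLE_LABEL"}
bad = {"type": "SCALAR", "componentType": "FLOAT32", "semantic": "EXAMPLE_LABEL"}
print(matches_semantic(good, semantics), matches_semantic(bad, semantics))
```

A shared repository of such files would then mostly be a matter of agreeing on names and review rules, since each file can be loaded and checked the same way.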
diegoalexdiaz commented on Jun 10, 2025
Thanks for your reply.
I believe we're on the same page. I mean vocabulary as in a list of terms (typically defined as a taxonomy, but it could be a full ontology, with relationships among the terms) that brings agreed-upon meaning to one or more data models (i.e. schemas). It thus enables decoupling the meaning of data (typically expressed in more abstract terms) from how the data is concretely laid out in a particular data-modeling technology (e.g. the schema-related constructs provided by 3D Tiles: schemas, classes, and properties of particular data types).