Mixed rdfs:labels for many chemical compounds #724

Open
@rogargon

Description

Issue validity

The issue affects the version currently available from https://dbpedia.org/sparql

Error Description

Many chemical compounds seem to have their labels mixed up with each other for languages other than English (es, fr, ar, ...). For instance, http://dbpedia.org/resource/Cholesterol has more than 900 labels in Spanish, including many that clearly do not correspond to it, such as "Cocaina"...
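To inspect the labels attached to a single resource, something like the following Python sketch could be used. It is only a rough illustration: the helper names (`labels_query`, `parse_labels`, `fetch_labels`) are made up for this example, and it assumes the endpoint returns standard SPARQL JSON results when asked via the `Accept` header.

```python
import json
import re
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # endpoint named in this issue


def labels_query(resource_uri: str, lang: str) -> str:
    """Build a query listing the rdfs:label values of one resource in one language."""
    # Basic sanity checks so the values can be pasted into the query text safely.
    if not re.fullmatch(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*", lang):
        raise ValueError(f"unexpected language tag: {lang!r}")
    if any(c in resource_uri for c in '<>" \n'):
        raise ValueError("resource URI contains characters not allowed in an IRI")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label\n"
        "FROM <http://dbpedia.org>\n"
        "WHERE {\n"
        f"  <{resource_uri}> rdfs:label ?label\n"
        f"  FILTER(LANG(?label) = '{lang}')\n"
        "}"
    )


def parse_labels(results: dict) -> list:
    """Extract the label strings from a SPARQL JSON results document."""
    return [b["label"]["value"] for b in results["results"]["bindings"]]


def fetch_labels(resource_uri: str, lang: str, endpoint: str = ENDPOINT) -> list:
    """Run the query against the endpoint (network access required)."""
    params = urllib.parse.urlencode({"query": labels_query(resource_uri, lang)})
    request = urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_labels(json.load(response))
```

Calling `fetch_labels("http://dbpedia.org/resource/Cholesterol", "es")` should return the Spanish labels currently attached to that resource, which makes the unrelated ones easy to spot by eye.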

Pinpointing the source of the error

The following query finds many resources with more than 900 labels in Spanish:

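PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Count the Spanish labels of each resource and keep those with more than 900
SELECT ?concept (COUNT(?label) AS ?count)
FROM <http://dbpedia.org>
WHERE {
  ?concept rdfs:label ?label
  FILTER(LANG(?label) = 'es')
}
GROUP BY ?concept
HAVING (COUNT(?label) > 900)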
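To repeat the check for other languages or thresholds, the query can be generated rather than edited by hand. The Python sketch below is an illustration only: `label_count_query` and `concepts_over_threshold` are hypothetical helper names, and the `ORDER BY` clause is an addition that is not part of the query above.

```python
import re


def label_count_query(lang: str = "es", threshold: int = 900) -> str:
    """Build the label-count query for a given language tag and threshold."""
    if not re.fullmatch(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*", lang):
        raise ValueError(f"unexpected language tag: {lang!r}")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?concept (COUNT(?label) AS ?count)\n"
        "FROM <http://dbpedia.org>\n"
        "WHERE {\n"
        "  ?concept rdfs:label ?label\n"
        f"  FILTER(LANG(?label) = '{lang}')\n"
        "}\n"
        "GROUP BY ?concept\n"
        f"HAVING (COUNT(?label) > {int(threshold)})\n"
        "ORDER BY DESC(?count)"  # added here so the worst cases come first
    )


def concepts_over_threshold(results: dict) -> list:
    """Turn a SPARQL JSON results document into (concept URI, label count) pairs."""
    return [
        (b["concept"]["value"], int(b["count"]["value"]))
        for b in results["results"]["bindings"]
    ]
```

For example, `label_count_query("fr", 100)` produces the same query for French labels with a threshold of 100. The resulting string can be pasted into the endpoint's query form or sent with any SPARQL client.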

Example DBpedia resource URL(s)

http://dbpedia.org/resource/Cholesterol

Other

When the threshold is lowered to more than 100 labels, many other kinds of resources (including people) also appear. Their labels also seem to be incorrect, for example: https://dbpedia.org/page/Alexandra_of_Denmark
