Escaped '\n' in extracted URIs #774

Open
@Integer-Ctrl

Issue validity

Examples contain '\n' in URIs:

Error Description

The extracted triples contain URIs in which an escaped newline ('\n') appears inside the URI string. This violates URI syntax and leads to broken links.

Pinpointing the source of the error

The DIEF extractor includes newline characters in some URIs. The extraction process itself completes without error, but the resulting triples contain these invalid URIs.
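
To find the affected triples in an output file, a small check like the following might help. It is only a sketch and not part of DIEF: the function name `invalid_iris` is hypothetical, and the pattern is my reading of the IRIREF rule in the N-Triples grammar (no backslash except in `\uXXXX` / `\UXXXXXXXX` escapes, no whitespace or control characters). It also treats every `<...>` token on a line as an IRI, which is a simplification that ignores literals.

```python
import re

# Any <...> token on the line (simplification: literals are not parsed).
IRI_TOKEN = re.compile(r"<([^<>]*)>")

# Allowed IRI body, following the N-Triples IRIREF production:
# no chars in #x00-#x20, no < > " { } | ^ ` \ , except \u / \U escapes.
VALID_IRI_BODY = re.compile(
    r'(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*'
)


def invalid_iris(line: str) -> list[str]:
    """Return the IRIs in one N-Triples line that break the IRIREF rule."""
    return [
        iri
        for iri in IRI_TOKEN.findall(line)
        if not VALID_IRI_BODY.fullmatch(iri)
    ]


if __name__ == "__main__":
    # One of the wrong triples from this issue (raw string keeps the literal \n).
    sample = (
        r"<http://de.dbpedia.org/resource/Berlin> "
        r"<http://xmlns.com/foaf/0.1/depiction> "
        r"<http://commons.wikimedia.org/wiki/Special:FilePath/"
        r"\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> ."
    )
    for iri in invalid_iris(sample):
        print("invalid IRI:", iri)
```

Running this over each line of an extraction output would list every URI that still carries the escaped newline.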

Details

Here is an example from the extraction of Berlin. Not all lines that contain '\n' are included.

Wrong triples

<http://de.dbpedia.org/resource/Berlin> <http://xmlns.com/foaf/0.1/depiction> <http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:\n_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://de.dbpedia.org/resource/Berlin> <http://xmlns.com/foaf/0.1/depiction> <http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .

Expected / corrected outcome snippet

<http://de.dbpedia.org/resource/Berlin> <http://xmlns.com/foaf/0.1/depiction> <http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:_Berlin_U-Bahn_IK_at_Olympia-Stadion_(3).jpg> .
<http://de.dbpedia.org/resource/Berlin> <http://xmlns.com/foaf/0.1/depiction> <http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Image> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .
<http://commons.wikimedia.org/wiki/Special:FilePath/_Baureihe_483-484_der_S-Bahn_Berlin.jpg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://de.wikipedia.org/wiki/Datei:_Baureihe_483-484_der_S-Bahn_Berlin.jpg> .
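
The difference between the wrong and the expected triples is only the removed '\n'. As an illustration of that transformation (not the actual DIEF code, which I have not traced), a sketch in Python could look like this. The names `strip_newlines` and `file_path_uri` are hypothetical. The base URI is taken from the triples above.

```python
COMMONS_FILE_PATH = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def strip_newlines(file_name: str) -> str:
    """Drop real line breaks and literal backslash-n sequences from a file name."""
    return (
        file_name.replace("\\n", "")  # literal two-character sequence \n
        .replace("\n", "")            # real line feed
        .replace("\r", "")            # real carriage return
    )


def file_path_uri(file_name: str) -> str:
    """Build the Special:FilePath URI from a cleaned file name."""
    return COMMONS_FILE_PATH + strip_newlines(file_name)


if __name__ == "__main__":
    # Raw string: the file name starts with a backslash followed by 'n'.
    print(file_path_uri(r"\n_Baureihe_483-484_der_S-Bahn_Berlin.jpg"))
```

Whether the cleanup belongs at the point where the file name is parsed or where the URI is built is an open question. The sketch only shows the intended result.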
