Posted to notifications@rya.apache.org by "Brad (Jira)" <ji...@apache.org> on 2020/06/04 07:58:00 UTC

[jira] [Assigned] (RYA-530) Rya can't ingest hostnames beginning with a number

     [ https://issues.apache.org/jira/browse/RYA-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brad reassigned RYA-530:
------------------------

    Assignee: Brad

> Rya can't ingest hostnames beginning with a number
> --------------------------------------------------
>
>                 Key: RYA-530
>                 URL: https://issues.apache.org/jira/browse/RYA-530
>             Project: Rya
>          Issue Type: Bug
>          Components: sail
>    Affects Versions: 4.0.0
>            Reporter: Brad
>            Assignee: Brad
>            Priority: Major
>
> I am attempting to ingest the latest DBpedia dataset. Rya fails with a parse error whenever it encounters a URI whose hostname begins with a digit (for example, https://9p.io/plan9 in the log below). I'm not sure whether the problem is in Rya itself or in RDF4J.
>  
> 2020-05-28 00:53:07,971 ERROR [main -- parser thread] org.apache.rya.accumulo.mr.RdfFileInputFormat: Invalid IRI 'https://9p.io/plan9 [line 36207]
>  org.eclipse.rdf4j.rio.RDFParseException: Invalid IRI 'https://9p.io/plan9 [line 36207]
>  at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportError(RDFParserHelper.java:322)
>  at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.reportError(AbstractRDFParser.java:684)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.reportError(TurtleParser.java:1309)
>  at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.resolveURI(AbstractRDFParser.java:387)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseURI(TurtleParser.java:941)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:588)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:385)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
>  at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:275)
>  2020-05-28 00:53:07,972 ERROR [main -- parser thread] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main -- parser thread,5,main] threw an Exception.
>  java.lang.RuntimeException: Invalid IRI 'https://9p.io/plan9 [line 36207]
>  at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:280)
>  Caused by: org.eclipse.rdf4j.rio.RDFParseException: Invalid IRI 'https://9p.io/plan9 [line 36207]
>  at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportError(RDFParserHelper.java:322)
>  at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.reportError(AbstractRDFParser.java:684)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.reportError(TurtleParser.java:1309)
>  at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.resolveURI(AbstractRDFParser.java:387)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseURI(TurtleParser.java:941)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:588)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:385)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
>  at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
>  at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:275)
>  2020-05-28 00:53:07,972 ERROR [main -- reader thread] org.apache.rya.accumulo.mr.RdfFileInputFormat: Error processing line 38462 of input
>  java.io.InterruptedIOException
>  at java.io.PipedReader.receive(PipedReader.java:187)
>  at java.io.PipedReader.receive(PipedReader.java:206)
>  at java.io.PipedWriter.write(PipedWriter.java:150)
>  at java.io.Writer.write(Writer.java:192)
>  at java.io.Writer.write(Writer.java:157)
>  at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$1.run(RdfFileInputFormat.java:249)
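
One way to narrow down whether the IRI itself is at fault is to parse it with the JDK's own java.net.URI, independently of Rya and RDF4J. The sketch below is only an illustration: the class name HostnameCheck and the helper hostOf are hypothetical, and it does not exercise RDF4J's validation, so it cannot confirm where the bug lives. It only shows that the JDK accepts a hostname whose first label starts with a digit.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Rya or RDF4J.
public class HostnameCheck {

    // Parses the given string with java.net.URI and returns the host component.
    // Throws URISyntaxException if the JDK considers the string malformed.
    static String hostOf(String iri) throws URISyntaxException {
        return new URI(iri).getHost();
    }

    public static void main(String[] args) throws URISyntaxException {
        // The IRI reported as invalid in the log above.
        System.out.println(hostOf("https://9p.io/plan9"));
    }
}
```

If the JDK parses the IRI without complaint, the rejection most likely comes from the stricter IRI check on the parsing path shown in the stack trace (AbstractRDFParser.resolveURI) rather than from the DBpedia data.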



--
This message was sent by Atlassian Jira
(v8.3.4#803005)