Posted to dev@jena.apache.org by "Bernhard Stiftner (JIRA)" <ji...@apache.org> on 2018/09/28 11:18:00 UTC
[jira] [Comment Edited] (JENA-1553) Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
[ https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631709#comment-16631709 ]
Bernhard Stiftner edited comment on JENA-1553 at 9/28/18 11:17 AM:
-------------------------------------------------------------------
I experienced the same problem with Jena 3.8.0. The TDB node tables got corrupted at some point under a concurrent read/write workload, which led to various exceptions being thrown in and around NodeLib.decode. The same underlying problem showed up in several forms.
Different kinds of RiotParseExceptions when attempting to access corrupted TDB node tables:
{noformat}
org.apache.jena.riot.RiotParseException: [line: 1, col: 1 ] Failed to find a prefix name or keyword: ^@(0;0x0000)
at org.apache.jena.riot.tokens.TokenizerText$ErrorHandlerTokenizer.error(TokenizerText.java:65)
at org.apache.jena.riot.tokens.TokenizerText.error(TokenizerText.java:1244)
at org.apache.jena.riot.tokens.TokenizerText.readPrefixedNameOrKeyword(TokenizerText.java:536)
at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:445)
at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:99)
at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:127)
at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:110)
{noformat}
{noformat}
org.apache.jena.riot.RiotParseException: [line: 1, col: 3 ] Malformed double: 2e
at org.apache.jena.riot.tokens.TokenizerText$ErrorHandlerTokenizer.error(TokenizerText.java:65)
at org.apache.jena.riot.tokens.TokenizerText.error(TokenizerText.java:1244)
at org.apache.jena.riot.tokens.TokenizerText.exponent(TokenizerText.java:1011)
at org.apache.jena.riot.tokens.TokenizerText.readNumber(TokenizerText.java:916)
at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:421)
at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:99)
at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:127)
at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:110)
{noformat}
Or a TDBException like this one:
{noformat}
org.apache.jena.tdb.TDBException: Not a node: if/stmt/6da980f15dedf35826cf3a4354525ded8efde37b>
at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:133)
at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:110)
{noformat}
And I also got "Illegal UTF-8" errors just as in the stacktrace above:
{noformat}
org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFF97
at org.apache.jena.atlas.io.IO.exception(IO.java:254)
at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:110)
{noformat}
All of those errors disappeared after patching Jena (we're using our own fork of 3.8.0) with the proposed fix for JENA-1581 (upcoming Jena 3.9.0) and completely rebuilding TDB stores. Existing data is probably corrupted and cannot be recovered, but so far I believe that JENA-1581 prevents TDB corruption from happening in the first place.
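A side note on reading the "Illegal UTF-8" messages: values such as 0xFFFFFF97 and 0xFFFFFFB1 look like the single bytes 0x97 and 0xB1 after sign extension to a Java int. Both have the bit pattern 10xxxxxx, i.e. they are UTF-8 continuation bytes, which cannot start a character. That would be consistent with the decoder reading from a wrong offset or from overwritten bytes. The sketch below illustrates this with plain JDK classes only. The class and helper names are made up for illustration, and the assumption that the message is produced by hex-formatting a widened byte is mine; this is not Jena's BlockUTF8 code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Jena.
final class Utf8Check {

    // Format a byte the way it appears when widened to int first
    // (assumption: this is how the error message value comes about).
    static String asSignExtendedHex(byte b) {
        return String.format("0x%04X", (int) b);
    }

    // Format the same byte as an unsigned two-digit hex value.
    static String asUnsignedHex(byte b) {
        return String.format("0x%02X", b & 0xFF);
    }

    // UTF-8 continuation bytes have the top two bits set to 10.
    static boolean isContinuationByte(byte b) {
        return (b & 0xC0) == 0x80;
    }

    // Strict check: report malformed input instead of silently
    // substituting the replacement character.
    static boolean isWellFormedUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte fromBackupLog = (byte) 0xB1;   // value reported in the original issue
        byte fromMyLog = (byte) 0x97;       // value reported in this comment

        System.out.println(asSignExtendedHex(fromBackupLog) + " is byte " + asUnsignedHex(fromBackupLog));
        System.out.println(asSignExtendedHex(fromMyLog) + " is byte " + asUnsignedHex(fromMyLog));
        System.out.println("continuation bytes: " + isContinuationByte(fromBackupLog)
                + ", " + isContinuationByte(fromMyLog));

        // A lone continuation byte is rejected by a strict decoder.
        System.out.println("lone 0xB1 well-formed: " + isWellFormedUtf8(new byte[] { fromBackupLog }));

        // An ordinary encoded string passes.
        byte[] ok = "<http://example/s>".getBytes(StandardCharsets.UTF_8);
        System.out.println("sample IRI well-formed: " + isWellFormedUtf8(ok));
    }
}
```

A check like this only tells you that a byte sequence is not valid UTF-8. It says nothing about how the bytes got that way, and it does not help to repair a damaged node table.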
> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
> Key: JENA-1553
> URL: https://issues.apache.org/jira/browse/JENA-1553
> Project: Apache Jena
> Issue Type: Bug
> Components: Jena
> Environment: Ubuntu 16.04 running Docker. Running stain/jena-fuseki from the official Docker Hub.
> Reporter: Brian Mullen
> Priority: Major
>
> Attempting to backup through Fuseki, TDB 500M+ triples, breaking with error:
>
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> at org.apache.jena.atlas.io.IO.exception(IO.java:233)
> at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
> at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
> at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
> at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
> at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
> at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
> at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
> at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
> at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
> at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
> at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
> at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
> at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
> at org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
> at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
> at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
> at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
> at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
> at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
> at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
> at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
> at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
> at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
> at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
> at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
> at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
> at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
> at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
> at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
> at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
> at org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
> at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
> at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2
> {code}