Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/08/04 17:39:27 UTC

[GitHub] [incubator-pinot] lgo opened a new issue #5804: File processing errors during `CreateSegment` does not print the name of the bad file

lgo opened a new issue #5804:
URL: https://github.com/apache/incubator-pinot/issues/5804


   When running the `CreateSegment` tool from the launcher script, I accidentally gave it a bad JSON file: the file was truncated (I believe a string wasn't closed).
   
   The error that was raised gave no indication of which file the process failed on. It would be nice to have that feedback, so that the source of the error is easier to track down. I'm not sure whether this differs when using a distributed FS, as I was only using a local `JSON` file without a running cluster.
   
   ```
   Caught exception while gathering stats
   java.lang.RuntimeException: Unexpected end-of-input in VALUE_STRING
    at [Source: (File); line: 2252, column: 4161]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:196) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.plugin.inputformat.json.JSONRecordReader.next(JSONRecordReader.java:80) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:61) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:38) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:147) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:132) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:94) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.tools.admin.command.CreateSegmentCommand.lambda$execute$0(CreateSegmentCommand.java:238) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
   Caused by: shaded.com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input in VALUE_STRING
    at [Source: (File); line: 2252, column: 4161]
   	at shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:618) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:595) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._loadMoreGuaranteed(UTF8StreamJsonParser.java:2348) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2433) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2413) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:269) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserialize(UntypedObjectDeserializer.java:672) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:527) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:192) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	... 11 more
   Exception caught:
   java.util.concurrent.ExecutionException: java.lang.RuntimeException: Unexpected end-of-input in VALUE_STRING
    at [Source: (File); line: 2252, column: 4161]
   	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_232]
   	at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_232]
   	at org.apache.pinot.tools.admin.command.CreateSegmentCommand.execute(CreateSegmentCommand.java:264) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:156) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:168) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   Caused by: java.lang.RuntimeException: Unexpected end-of-input in VALUE_STRING
    at [Source: (File); line: 2252, column: 4161]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:196) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.plugin.inputformat.json.JSONRecordReader.next(JSONRecordReader.java:80) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:61) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:38) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:147) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:132) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:94) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.tools.admin.command.CreateSegmentCommand.lambda$execute$0(CreateSegmentCommand.java:238) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
   	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
   Caused by: shaded.com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input in VALUE_STRING
    at [Source: (File); line: 2252, column: 4161]
   	at shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:618) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:595) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._loadMoreGuaranteed(UTF8StreamJsonParser.java:2348) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2433) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2413) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:269) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserialize(UntypedObjectDeserializer.java:672) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:527) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at shaded.com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:192) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.plugin.inputformat.json.JSONRecordReader.next(JSONRecordReader.java:80) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:61) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:38) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:147) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:132) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:94) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at org.apache.pinot.tools.admin.command.CreateSegmentCommand.lambda$execute$0(CreateSegmentCommand.java:238) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-903282c371ef7f154119699a1dd7f5dab3a627a7]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
   	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
   Finished building StatsCollector!
   ```
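
For illustration only, here is a minimal sketch of the kind of feedback requested above: catch the reader's failure and rethrow it with the input file's path in the message, keeping the original exception as the cause. This is not the change that was merged for this issue and it does not use Pinot or Jackson classes. The class `FileContextExample` and the methods `countRecords` and `countRecordsWithContext` are hypothetical names, and the quote-counting "parser" is only a stand-in for a real record reader hitting a truncated string.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class FileContextExample {

  // Hypothetical stand-in for a record reader: one record per line,
  // and a line with an odd number of double quotes counts as truncated.
  static int countRecords(File input) throws IOException {
    int count = 0;
    try (BufferedReader reader = Files.newBufferedReader(input.toPath(), StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null) {
        long quotes = line.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
          throw new RuntimeException("Unexpected end-of-input in VALUE_STRING at line " + (count + 1));
        }
        count++;
      }
    }
    return count;
  }

  // Wraps any failure so that the message names the offending file.
  // The original exception is preserved as the cause.
  static int countRecordsWithContext(File input) {
    try {
      return countRecords(input);
    } catch (IOException | RuntimeException e) {
      throw new RuntimeException("Failed to process input file: " + input.getAbsolutePath(), e);
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a deliberately truncated file: the second record has an unclosed string.
    File bad = File.createTempFile("truncated", ".json");
    bad.deleteOnExit();
    Files.write(bad.toPath(), "{\"a\": \"ok\"}\n{\"a\": \"trunc".getBytes(StandardCharsets.UTF_8));
    try {
      countRecordsWithContext(bad);
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
      System.out.println("cause: " + e.getCause().getMessage());
    }
  }
}
```

With a wrapper like this, the top-level message carries the file path, while the parser's own line and column details remain available through the cause.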


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] snleee commented on issue #5804: File processing errors during `CreateSegment` does not print the name of the bad file

Posted by GitBox <gi...@apache.org>.
snleee commented on issue #5804:
URL: https://github.com/apache/incubator-pinot/issues/5804#issuecomment-710588651


   Closing the issue since the fix is merged




[GitHub] [incubator-pinot] renato2099 commented on issue #5804: File processing errors during `CreateSegment` does not print the name of the bad file

Posted by GitBox <gi...@apache.org>.
renato2099 commented on issue #5804:
URL: https://github.com/apache/incubator-pinot/issues/5804#issuecomment-678627276


   Hey @kishoreg, I took a first stab at this in PR https://github.com/apache/incubator-pinot/pull/5910
   Let me know what you think 👍




[GitHub] [incubator-pinot] snleee closed issue #5804: File processing errors during `CreateSegment` does not print the name of the bad file

Posted by GitBox <gi...@apache.org>.
snleee closed issue #5804:
URL: https://github.com/apache/incubator-pinot/issues/5804


   

