Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/02/26 13:39:19 UTC
[GitHub] [druid] irisshainsky opened a new issue #9408: Druid 0.17 native
parallel batch ingestion with orc files fails
URL: https://github.com/apache/druid/issues/9408
I'm trying to run a native parallel batch ingestion from S3 into Druid. The inputFormat is orc.
I deployed a Druid 0.17 cluster and enabled druid-orc-extensions.
When I run the ingestion task, the sub task fails with the following exception:
org.apache.hadoop.util.VersionInfo - Could not read 'common-version-info.properties', java.io.IOException: Resource not found
It looks as if something is wrong with loading hadoop-common, but the log shows that hadoop-common-2.8.5.jar was loaded together with the extension. Is this a known issue?
Thanks
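One way to narrow this down is to check which class loader can actually see common-version-info.properties at the point where the task runs. The sketch below is only a diagnostic aid, not Druid or Hadoop code. It assumes (unconfirmed by this report) that Hadoop's VersionInfo resolves the file through the thread context class loader, which may differ from the extension class loader that loaded hadoop-common-2.8.5.jar. The class name ResourceVisibilityCheck and the helper isVisible are hypothetical.

```java
public class ResourceVisibilityCheck {

    // Hypothetical helper: true if the given class loader can resolve the resource.
    // A null loader is treated as "cannot see it" to keep the check simple.
    static boolean isVisible(ClassLoader loader, String resource) {
        return loader != null && loader.getResource(resource) != null;
    }

    public static void main(String[] args) {
        // Resource name taken from the WARN line in the task log.
        String resource = "common-version-info.properties";

        // Loader attached to the current thread (assumed to be the one used for the lookup).
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        // Loader that defined this class; inside an extension this would be the extension loader.
        ClassLoader own = ResourceVisibilityCheck.class.getClassLoader();

        System.out.println("context class loader sees it:  " + isVisible(context, resource));
        System.out.println("defining class loader sees it: " + isVisible(own, resource));
    }
}
```

If the two lines disagree when run from inside the extension, the jar is loaded but the lookup goes through a loader that does not include it, which would match the "loaded but not found" symptom.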
[druid-orc-extensions], jars: hadoop-auth-2.8.5.jar, druid-orc-extensions-0.17.0.jar, commons-digester-1.8.jar, orc-shims-1.5.6.jar, hadoop-hdfs-client-2.8.5.jar, commons-configuration-1.6.jar, protobuf-java-3.11.0.jar, hadoop-annotations-2.8.5.jar, jackson-core-asl-1.9.13.jar, hadoop-common-2.8.5.jar, orc-mapreduce-1.5.6.jar, jackson-mapper-asl-1.9.13.jar, hadoop-mapreduce-client-core-2.8.5.jar, hive-storage-api-2.6.0.jar, htrace-core4-4.0.1-incubating.jar, orc-core-1.5.6.jar, aircompressor-0.10.jar
2020-02-26T11:22:58,090 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-s3-extensions], jars: druid-s3-extensions-0.17.0.jar
2020-02-26T11:22:58,091 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [postgresql-metadata-storage], jars: postgresql-metadata-storage-0.17.0.jar, postgresql-42.2.8.jar
2020-02-26T11:22:58,093 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [statsd-emitter], jars: jnr-unixsocket-0.18.jar, jnr-ffi-2.1.4.jar, jffi-1.2.15.jar, java-dogstatsd-client-2.6.1.jar, jnr-posix-3.0.35.jar, asm-util-5.0.3.jar, jnr-constants-0.9.8.jar, statsd-emitter-0.17.0.jar, asm-tree-5.0.3.jar, jffi-1.2.15-native.jar, jnr-enxio-0.16.jar, jnr-x86asm-1.0.2.jar, asm-analysis-5.0.3.jar, asm-7.1.jar, asm-commons-7.1.jar
2020-02-26T11:23:04,836 WARN [task-runner-0-priority-0] org.apache.hadoop.util.VersionInfo - Could not read 'common-version-info.properties', java.io.IOException: Resource not found
java.io.IOException: Resource not found
at org.apache.hadoop.util.VersionInfo.<init>(VersionInfo.java:49) ~[?:?]
at org.apache.hadoop.util.VersionInfo.<clinit>(VersionInfo.java:99) ~[?:?]
at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:52) ~[?:?]
at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98) ~[?:?]
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83) [druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69) [druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) [druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) [druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) [druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) [druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
2020-02-26T11:23:04,847 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught Throwable while running task[AbstractTask{id='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z', groupId='index_parallel_tasks_v4_ofddmnmk_2020-02-26T11:22:17.567Z', taskResource=TaskResource{availabilityGroup='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z', requiredCapacity=1}, dataSource='tasks_v4', context={forceTimeChunkLock=true}}]
java.lang.ExceptionInInitializerError: null
at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98) ~[?:?]
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) ~[druid-core-0.17.0.jar:0.17.0]
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442) ~[druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225) ~[druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138) ~[druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.lang.NumberFormatException: For input string: "Unknown"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_242]
at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_242]
at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_242]
at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53) ~[?:?]
at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
... 20 more
Error!
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
at org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:215)
at org.apache.druid.cli.CliPeon.run(CliPeon.java:288)
at org.apache.druid.cli.Main.main(Main.java:113)
Caused by: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:212)
... 2 more
Caused by: java.lang.ExceptionInInitializerError
at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257)
at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649)
at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98)
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43)
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78)
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83)
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69)
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67)
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103)
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74)
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43)
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442)
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225)
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138)
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NumberFormatException: For input string: "Unknown"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53)
at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47)
... 20 more
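The two log entries appear to be connected: the WARN says the version properties file could not be read, and the final cause is Integer.parseInt failing on the string "Unknown" inside HadoopShimsFactory.get. A minimal sketch of that failure mode follows. It assumes that the version string falls back to "Unknown" when the properties file is missing and that the caller parses the first dotted component as an integer. The class VersionParseSketch and the method parseMajor are hypothetical and are not the ORC implementation.

```java
public class VersionParseSketch {

    // Hypothetical helper: take the part before the first dot and parse it as an int.
    static int parseMajor(String version) {
        String[] parts = version.split("[.]");
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        // A normal dotted version string parses fine.
        System.out.println(parseMajor("2.8.5"));

        // The assumed fallback value has no numeric component, so parsing throws.
        try {
            parseMajor("Unknown");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

Because the parse in the trace happens in a static initializer (RecordReaderUtils.<clinit>), the NumberFormatException surfaces as ExceptionInInitializerError, which is what the task reports.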