Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/10 22:33:52 UTC

[GitHub] [incubator-druid] egor-ryashin opened a new issue #7438: druid-orc-extensions hadoop-common dependency is broken

URL: https://github.com/apache/incubator-druid/issues/7438
 
 
   The `hadoop-common` dependency of `druid-orc-extensions` is broken, or the extension isn't properly documented.
   
   ### Affected Version
   
   0.13.0-incubating
   
   ### Description
   
   Using these modules:
   `druid.extensions.loadList=["mysql-metadata-storage", "druid-kafka-indexing-service", "druid-orc-extensions", "druid-hdfs-storage"]`
   ```
   du extensions/*
   20032	extensions/druid-cassandra-storage
   46928	extensions/druid-hdfs-storage
   4240	extensions/druid-kafka-indexing-service
   168	extensions/druid-lookups-cached-global
   56640	extensions/druid-orc-extensions
   136	extensions/druid-s3-extensions
   1968	extensions/mysql-metadata-storage
   ```
   Posting this task:
   ```json
   {
     "type": "index_parallel",
     "spec": {
       "dataSchema": {
         "dataSource": "my_orc_test",
         "metricsSpec": [
           {
             "type": "count",
             "name": "count"
           }
         ],
         "granularitySpec": {
           "segmentGranularity": "DAY",
           "queryGranularity": "second",
           "intervals": ["2018-07-10/2018-07-11"]
         },
         "parser": {
           "type": "orc",
           "parseSpec": {
             "format": "timeAndDims",
             "timestampSpec": {
               "column": "time",
               "format": "auto"
             },
             "dimensionsSpec": {
               "dimensions": [
                 "tag"
               ],
               "dimensionExclusions": [],
               "spatialDimensions": []
             }
           },
           "typeString": "struct<time:string,tag:string>",
           "mapFieldNameFormat": "<PARENT>_<CHILD>"
         }
       },
       "ioConfig": {
         "type": "index_parallel",
         "firehose": {
           "type": "local",
           "baseDir": "./",
           "filter": "*.orc"
         }
       }
     }
   }
   ```
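
For anyone trying to reproduce this, here is a minimal sketch of how the task above could be submitted from a script. It assumes the Overlord is reachable at `http://localhost:8090` (an assumption, adjust for your setup) and uses the Overlord task endpoint `/druid/indexer/v1/task`. The helper names `build_task_request` and `submit_task` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the Overlord listens on localhost:8090; adjust for your cluster.
OVERLORD_URL = "http://localhost:8090"


def build_task_request(spec, overlord_url=OVERLORD_URL):
    """Build (but do not send) a POST request that submits spec to the Overlord."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url=overlord_url.rstrip("/") + "/druid/indexer/v1/task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(spec, overlord_url=OVERLORD_URL):
    """Send the task spec and return the decoded JSON response."""
    with urllib.request.urlopen(build_task_request(spec, overlord_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the spec saved to a file (say a hypothetical `task.json`), `submit_task(json.load(open("task.json")))` would post it. Building the request separately makes it easy to inspect the URL and payload before anything is sent.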
   The spawned subtask failed with this error:
   ```
   2019-04-10T22:16:10,413 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught Throwable while running task[AbstractTask{id='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', groupId='index_parallel_my_orc_test_2019-04-10T22:15:55.541Z', taskResource=TaskResource{availabilityGroup='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', requiredCapacity=1}, dataSource='my_orc_test', context={}}]
   java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
   ```
   The log also says the needed dependency jar is loaded beforehand:
   ```
   2019-04-10T22:16:04,921 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/Users/egorryashin/a-druid-0.13-i/extensions/druid-hdfs-storage/hadoop-common-2.8.3.jar] for extension[druid-hdfs-storage]
   ```
   The task fails both with `druid-hdfs-storage` loaded and without it.
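
A quick way to narrow this down is to check which extension directories actually ship the missing class. The log line above registers `hadoop-common-2.8.3.jar` for `druid-hdfs-storage` only. If each extension gets its own class loader (my understanding of how Druid loads extensions, not something verified here), that jar would not necessarily be visible to `druid-orc-extensions`. The sketch below scans every jar under an extensions root for `org/apache/hadoop/io/Writable.class`. The function names are hypothetical and the script is only a diagnostic aid, not a fix.

```python
import zipfile
from pathlib import Path

# Class file reported missing in the NoClassDefFoundError above.
MISSING_CLASS = "org/apache/hadoop/io/Writable.class"


def jars_containing(extension_dir, class_entry=MISSING_CLASS):
    """Return the sorted names of jars in extension_dir that contain class_entry."""
    hits = []
    for jar in sorted(Path(extension_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if class_entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


def scan_extensions(extensions_root, class_entry=MISSING_CLASS):
    """Map each extension directory name to the jars that provide class_entry."""
    root = Path(extensions_root)
    return {
        ext.name: jars_containing(ext, class_entry)
        for ext in sorted(root.iterdir())
        if ext.is_dir()
    }
```

Calling `scan_extensions("extensions")` from the Druid install directory returns a dictionary keyed by extension name. An empty list for `druid-orc-extensions` would suggest that the extension does not bundle `hadoop-common` itself.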
   
   I spotted this while investigating https://github.com/apache/incubator-druid/issues/6925
   
