Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/06 15:53:08 UTC
[GitHub] [druid] suneet-pit opened a new issue #9623:
IllegalArgumentException: Can not deserialize while ingesting parquet file
from hdfs
URL: https://github.com/apache/druid/issues/9623
Trying to load the sample Parquet data from HDFS into Druid, but getting this IllegalArgumentException:
Can not deserialize Class com.fasterxml.jackson.annotation.JsonTypeInfo (of type annotation) as a Bean
**2020-04-06T14:55:42,404 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1584920897538_0110_m_000000_2, Status : FAILED
Error: java.lang.IllegalArgumentException: Can not deserialize Class com.fasterxml.jackson.annotation.JsonTypeInfo (of type annotation) as a Bean
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:2774)
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:2700)**
Version: druid-api-0.13.0-incubating
druid-parquet-extensions-0.12.3.jar is being used.
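The two version strings above differ (0.13.0-incubating for the core artifact, 0.12.3 for the Parquet extension). Whether that mismatch is related to the error is not established in this report. A minimal sketch for flagging such a mismatch is shown here, assuming artifact names end in `-<major>.<minor>.<patch>` with an optional suffix such as `-incubating` and an optional `.jar`. The helpers `artifact_version` and `versions_match` are hypothetical and not part of Druid.

```python
import re

# Assumed naming pattern: "<artifact>-<x.y.z>[-suffix][.jar]".
VERSION_RE = re.compile(r"-(\d+\.\d+\.\d+)(?:-[A-Za-z]+)?(?:\.jar)?$")


def artifact_version(name):
    """Return the numeric version embedded in an artifact name, or None."""
    match = VERSION_RE.search(name)
    return match.group(1) if match else None


def versions_match(core, extension):
    """True only if both names carry the same numeric version."""
    core_version = artifact_version(core)
    extension_version = artifact_version(extension)
    return core_version is not None and core_version == extension_version


# The two artifact names quoted in this report.
core = "druid-api-0.13.0-incubating"
extension = "druid-parquet-extensions-0.12.3.jar"

print(artifact_version(core), artifact_version(extension))
print("versions match:", versions_match(core, extension))
```

Running it against the jars actually on the classpath (rather than the two names hard-coded here) would show whether more than one Druid version is being loaded.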
Sample parquet data:
2019-04-03 12:00:00,"druid","test_POC"
Ingestion spec used:
{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "inputFormat" : "io.druid.data.input.parquet.DruidParquetInputFormat",
        "paths" : "hdfs://druid-m:8020/data/testoutput_parquet/part-00000-565e5f99-fa33-41e0-9b99-855323b05a76-c000.parquet"
      }
    },
    "dataSchema" : {
      "dataSource" : "sample_data",
      "parser" : {
        "type" : "parquet",
        "parseSpec" : {
          "format" : "parquet",
          "dimensionsSpec" : {
            "dimensions" : []
          },
          "columns" : ["date_ts","name","project"],
          "timestampSpec" : {
            "format" : "auto",
            "column" : "date_ts"
          }
        }
      },
      "metricsSpec" : [],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "rollup" : false
      }
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 50
      },
      "forceExtendableShardSpecs" : true,
      "jobProperties" : {
        "fs.default.name" : "hdfs://druid-m",
        "fs.defaultFS" : "hdfs://druid-m",
        "dfs.datanode.address" : "druid-m",
        "dfs.client.use.datanode.hostname" : "true",
        "dfs.datanode.use.datanode.hostname" : "true",
        "yarn.resourcemanager.hostname" : "druid-m",
        "yarn.nodemanager.vmem-check-enabled" : "false",
        "mapreduce.job.classloader" : "true",
        "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.map.memory.mb" : 1024,
        "mapreduce.reduce.memory.mb" : 1024
      }
    }
  },
  "hadoopDependencyCoordinates" : ["org.apache.hadoop:hadoop-client:2.9.2"]
}
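To rule out trivial mistakes in the parser block before submitting the task, a local sanity check along these lines can help. This is only a sketch and not Druid's own validation: `check_parser` is a hypothetical helper, and it assumes the `columns` list is meant to name every field that the timestamp and dimension specs refer to.

```python
import json

# Trimmed copy of the "parser" block from the ingestion spec in this report.
PARSER_JSON = """
{
  "type": "parquet",
  "parseSpec": {
    "format": "parquet",
    "dimensionsSpec": {"dimensions": []},
    "columns": ["date_ts", "name", "project"],
    "timestampSpec": {"format": "auto", "column": "date_ts"}
  }
}
"""


def check_parser(parser):
    """Return a list of problems found in a parser block (empty if none)."""
    problems = []
    parse_spec = parser.get("parseSpec", {})
    columns = parse_spec.get("columns", [])

    # The timestamp column should be named, and listed in "columns" if given.
    ts_column = parse_spec.get("timestampSpec", {}).get("column")
    if ts_column is None:
        problems.append("timestampSpec.column is missing")
    elif columns and ts_column not in columns:
        problems.append(f"timestamp column {ts_column!r} not in columns")

    # Dimensions may be plain strings or objects carrying a "name" field.
    dimensions = parse_spec.get("dimensionsSpec", {}).get("dimensions", [])
    for dim in dimensions:
        name = dim if isinstance(dim, str) else dim.get("name")
        if columns and name not in columns:
            problems.append(f"dimension {name!r} not in columns")

    return problems


problems = check_parser(json.loads(PARSER_JSON))
print("problems:", problems)
```

For the parser block in this report the check finds nothing, which is consistent with the failure being raised during deserialization rather than by a missing column.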
Debugging already done:
- Tried different parseSpec formats (parquet, timeAndDims, timespecFormat)
- Tried different timestamp formats
- Tried specifying the dimensions explicitly
- Tried a Parquet file with a schema
- Tried a Parquet file without a schema
Attached is the log file
[log.txt](https://github.com/apache/druid/files/4439448/log.txt)