Posted to dev@hive.apache.org by "Abin Shahab (JIRA)" <ji...@apache.org> on 2014/03/25 21:04:15 UTC

[jira] [Commented] (HIVE-6670) ClassNotFound with Serde

    [ https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947077#comment-13947077 ] 

Abin Shahab commented on HIVE-6670:
-----------------------------------

[~hashutosh] I can write a test case. Is there a similar test case that I can look at?
I'm not sure how to create a ReviewBoard entry. It'd be great if you could do that once I upload the test.


> ClassNotFound with Serde
> ------------------------
>
>                 Key: HIVE-6670
>                 URL: https://issues.apache.org/jira/browse/HIVE-6670
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Abin Shahab
>            Assignee: Abin Shahab
>         Attachments: HIVE-6670-branch-0.12.patch, HIVE-6670.patch
>
>
> We get a ClassNotFoundException when we query a table created with CSVSerde (https://github.com/ogrodnek/csv-serde).
> This happens because MapredLocalTask does not pass the locally added jars to ExecDriver when it launches it.
> ExecDriver's classpath therefore does not include the added jars. When the plan is deserialized, the deserialization code throws a ClassNotFoundException, which leaves a TableDesc object with a null DeserializerClass.
> This results in an NPE during Fetch.
> Steps to reproduce:
> Download https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar (for example with wget) to a local path, e.g. /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
> Place some sample CSV files in HDFS as follows:
> hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
> hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
> hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
> hdfs dfs -put /home/soam/sampleJoinTarget.csv /user/soam/HiveSerdeIssue/sampleJoinTarget/
> ====
> Create the tables in Hive:
> ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
> create external table sampleCSV (md5hash string, filepath string)
> row format serde 'com.bizo.hive.serde.csv.CSVSerde'
> stored as textfile
> location '/user/soam/HiveSerdeIssue/sampleCSV/'
> ;
> create external table sampleJoinTarget (md5hash string, filepath string, datestamp string, nblines string, nberrors string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',' 
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE
> LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
> ;
> ===============
> Now, try the following JOIN:
> ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
> SELECT 
> sampleCSV.md5hash, 
> sampleCSV.filepath 
> FROM sampleCSV
> JOIN sampleJoinTarget
> ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
> ;
> ==
> This will fail with the error:
> Execution log at: /tmp/soam/.log
> java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
> Continuing ...
> 2014-03-11 10:35:03 Starting to launch local task to process map join; maximum memory = 238551040
> Execution failed with exit status: 2
> Obtaining error information
> Task failed!
> Task ID:
> Stage-4
> Logs:
> /var/log/hive/soam/hive.log
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> Try the following LEFT JOIN. This will work:
> SELECT 
> sampleCSV.md5hash, 
> sampleCSV.filepath 
> FROM sampleCSV
> LEFT JOIN sampleJoinTarget
> ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
> ;
> ==
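
To illustrate the root cause described in the quoted report (a child process whose classpath lacks the jars added in the parent session), here is a minimal Java sketch of building a child JVM command line that carries those jars along. This is not Hive's actual launch code and not the attached patch. The class ChildClasspathSketch, its helper methods, and the child main class name ChildMain are all hypothetical names chosen for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not Hive's MapredLocalTask and not the HIVE-6670 patch.
public class ChildClasspathSketch {

    /** Joins the parent's classpath with jars added at runtime (e.g. via ADD JAR). */
    static String buildClasspath(String parentClasspath, List<String> addedJars) {
        List<String> entries = new ArrayList<>();
        if (parentClasspath != null && !parentClasspath.isEmpty()) {
            entries.add(parentClasspath);
        }
        // Without this step the child JVM cannot load classes from the added jars.
        entries.addAll(addedJars);
        return String.join(File.pathSeparator, entries);
    }

    /** Builds the command line for a child JVM that can see the added jars. */
    static List<String> buildChildCommand(String mainClass,
                                          String parentClasspath,
                                          List<String> addedJars) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        return Arrays.asList(javaBin, "-cp",
                buildClasspath(parentClasspath, addedJars), mainClass);
    }

    public static void main(String[] args) {
        // Jar path taken from the reproduction steps in the report.
        List<String> addedJars = Arrays.asList(
                "/home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar");
        // "ChildMain" is a placeholder for whatever class the child process runs.
        List<String> cmd = buildChildCommand(
                "ChildMain", System.getProperty("java.class.path"), addedJars);
        System.out.println(String.join(" ", cmd));
    }
}
```

If the addedJars entries are left out of the classpath, a class such as com.bizo.hive.serde.csv.CSVSerde cannot be resolved in the child process, which matches the ClassNotFoundException shown in the log above.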


