Posted to issues@phoenix.apache.org by "Artem Ervits (JIRA)" <ji...@apache.org> on 2019/03/13 18:37:00 UTC

[jira] [Commented] (PHOENIX-3835) CSV Bulkload fails if hbase mapredcp was used for classpath

    [ https://issues.apache.org/jira/browse/PHOENIX-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791993#comment-16791993 ] 

Artem Ervits commented on PHOENIX-3835:
---------------------------------------

I tried this approach on the 4.14.1-1.4 branch and I'm still not able to get past the problem.
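For context, the failure mode can be reproduced in miniature: {{hbase mapredcp}} prints a colon-separated jar list that contains no Phoenix jar, so the JVM submitting the job cannot resolve org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair. A minimal sketch (the jar paths below are illustrative assumptions, not taken from a real installation):

```shell
# Simulate the classpath that `hbase mapredcp` would print; the jar
# paths are placeholders. Note there is no Phoenix jar on it, which is
# why TableRowkeyPair cannot be loaded at job-submission time.
MAPREDCP="/usr/lib/hbase/lib/hbase-common.jar:/usr/lib/hbase/lib/hbase-client.jar"

echo "$MAPREDCP" | tr ':' '\n' | grep -q phoenix \
  && echo "phoenix jar present" || echo "phoenix jar missing"
# prints: phoenix jar missing

# Workaround sketch: append the Phoenix client jar (path is an
# assumption) before exporting HADOOP_CLASSPATH and running
# CsvBulkLoadTool.
HADOOP_CLASSPATH="$MAPREDCP:/usr/lib/phoenix/phoenix-client.jar"
echo "$HADOOP_CLASSPATH" | tr ':' '\n' | grep -q phoenix \
  && echo "phoenix jar present" || echo "phoenix jar missing"
# prints: phoenix jar present
```

Exporting HADOOP_CLASSPATH with the Phoenix client jar appended is one way around the ClassNotFoundException; per the last sentence of the description, simply omitting HADOOP_CLASSPATH also works in most cases.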

> CSV Bulkload fails if hbase mapredcp was used for classpath
> -----------------------------------------------------------
>
>                 Key: PHOENIX-3835
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3835
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Sergey Soldatov
>            Priority: Major
>
> For a long time our documentation has recommended using {{hbase mapredcp}} for HADOOP_CLASSPATH when the MR bulk load is used. In fact this doesn't work, and the job fails with the following exception:
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2246)
>         at org.apache.hadoop.mapred.JobConf.getMapOutputKeyClass(JobConf.java:813)
>         at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapOutputKeyClass(JobContextImpl.java:142)
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:779)
>         at org.apache.phoenix.mapreduce.MultiHfileOutputFormat.configureIncrementalLoad(MultiHfileOutputFormat.java:698)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:330)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:299)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:182)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>         at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:117)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2238)
>         ... 16 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2120)
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2212)
>         ... 17 more
> {noformat}
> I may be wrong, but it looks like a side effect of HBASE-12108. I'm not sure whether it's possible to fix this on the Phoenix side, or whether we just need to update the documentation to recommend this classpath only for specific versions of HBase. In most cases everything works fine without specifying HADOOP_CLASSPATH at all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)