Posted to dev@phoenix.apache.org by "Aritra Nayak (Jira)" <ji...@apache.org> on 2019/12/06 03:56:00 UTC
[jira] [Updated] (PHOENIX-5361) FileNotFoundException found when schema is in lowercase
[ https://issues.apache.org/jira/browse/PHOENIX-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aritra Nayak updated PHOENIX-5361:
----------------------------------
Description:
The table name (DUMMY_DATA) is in uppercase, but the schema name (s01) is in lowercase.
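For context, Phoenix normalizes unquoted identifiers to uppercase and preserves case only for double-quoted ones, which is why the lowercase schema has to be written as "s01". A minimal JDBC sketch illustrating this (not part of the original report; the connection URL reuses the ZooKeeper quorum from the command in step 3):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CaseSensitivityDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:zk-journalnode-lv-101:2181");
             Statement stmt = conn.createStatement()) {
            // Quoted schema: resolves to the lowercase schema s01.
            ResultSet rs = stmt.executeQuery(
                "SELECT COUNT(*) FROM \"s01\".DUMMY_DATA");
            // Unquoted would be normalized to S01.DUMMY_DATA and would fail
            // with TableNotFoundException if only "s01" exists:
            // stmt.executeQuery("SELECT COUNT(*) FROM s01.DUMMY_DATA");
            rs.next();
            System.out.println("rows = " + rs.getLong(1));
        }
    }
}
{code}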
Steps to reproduce:
# Create the Phoenix table:
{code:java}
CREATE TABLE IF NOT EXISTS "s01"."DUMMY_DATA"("id" BIGINT PRIMARY KEY, "firstName" VARCHAR, "lastName" VARCHAR);
{code}
# Upload the CSV file to your preferred HDFS location (a hypothetical generator for this file follows the list):
{code:java}
/data/s01/DUMMY_DATA/1.csv
{code}
# Run the hadoop jar command to bulk load the data:
{code:java}
hadoop jar /opt/phoenix/phoenix4.13-cdh5.9.2-marin-1.5.1/phoenix4.13-cdh5.9.2-marin-1.5.1-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --s \"\"s01\"\" --t DUMMY_DATA --input /data/s01/DUMMY_DATA/1.csv --zookeeper zk-journalnode-lv-101:2181
{code}
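The report does not include the contents of 1.csv; a hypothetical generator matching the DDL in step 1 and the 1,000,000-record count in the job counters below might look like this (file name and values are illustrative only):
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class DummyDataCsv {
    public static void main(String[] args) throws IOException {
        // Columns follow the DDL: "id" BIGINT, "firstName" VARCHAR, "lastName" VARCHAR.
        List<String> rows = new ArrayList<>();
        for (long id = 1; id <= 1_000_000; id++) {
            rows.add(id + ",First" + id + ",Last" + id);
        }
        Files.write(Paths.get("1.csv"), rows);
        // Then copy into HDFS, e.g.: hdfs dfs -put 1.csv /data/s01/DUMMY_DATA/1.csv
    }
}
{code}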
The job fails with the following error:
{code:java}
Exception in thread "main" java.io.FileNotFoundException: Bulkload dir /tmp/94ea4875-3453-4ed6-823d-3544ff05fd56/s01.DUMMY_DATA not found
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:194)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:289)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:393)
at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:339)
at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
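One plausible reading of the failure, sketched below under assumptions (this is not Phoenix source; both helper methods are hypothetical): the job configuration and the mapper resolve the qualified table name under two different normalizations, so the rows the mapper upserts are attributed to a table the job does not recognize, no HFiles get written, and the bulk-load directory probed above never exists. The counters below (Upserts Done=1000000 but Map output records=0) are consistent with this.
{code:java}
// Hypothetical sketch of the suspected name mismatch; not Phoenix code.
public class NameMismatchSketch {
    // How the job side might record the output table (quotes stripped):
    static String configuredOutputTable() {
        return "s01.DUMMY_DATA";
    }
    // How the mapper-side connection might report the table for its rows
    // (quoted form retained -- an assumption):
    static String mapperResolvedTable() {
        return "\"s01\".DUMMY_DATA";
    }
    public static void main(String[] args) {
        // When these disagree, every mapped row is dropped, map output stays 0,
        // and /tmp/<uuid>/s01.DUMMY_DATA is never created.
        System.out.println("names match: "
            + configuredOutputTable().equals(mapperResolvedTable()));
    }
}
{code}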
The MapReduce job reads 1,000,000 records but does not write any:
{code:java}
19/06/18 20:06:24 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=20
FILE: Number of bytes written=315801
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=41666811
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=39894
Total time spent by all reduces in occupied slots (ms)=56216
Total time spent by all map tasks (ms)=19947
Total time spent by all reduce tasks (ms)=14054
Total vcore-seconds taken by all map tasks=19947
Total vcore-seconds taken by all reduce tasks=14054
Total megabyte-seconds taken by all map tasks=40851456
Total megabyte-seconds taken by all reduce tasks=57565184
Map-Reduce Framework
Map input records=1000000
Map output records=0 <----- see here
Map output bytes=0
Map output materialized bytes=16
Input split bytes=123
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=16
Reduce input records=0
Reduce output records=0
Spilled Records=0
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=914
CPU time spent (ms)=49240
Physical memory (bytes) snapshot=2022809600
Virtual memory (bytes) snapshot=8064647168
Total committed heap usage (bytes)=3589275648
Phoenix MapReduce Import
Upserts Done=1000000
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=41666688
File Output Format Counters
Bytes Written=0
{code}
{color:#14892c}The same steps (1-3), when followed with the schema name S01, pass, and the data is successfully uploaded into the table.{color}
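Since the uppercase schema works, the same load can also be driven programmatically; a hedged sketch of the passing configuration (ToolRunner and CsvBulkLoadTool are taken from the stack trace above; the long-form flag spellings are an assumption):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.phoenix.mapreduce.CsvBulkLoadTool;

public class UppercaseSchemaBulkLoad {
    public static void main(String[] args) throws Exception {
        // Same arguments as the failing run, with the schema switched to S01.
        int exitCode = ToolRunner.run(new Configuration(), new CsvBulkLoadTool(),
            new String[] {
                "--schema", "S01",
                "--table", "DUMMY_DATA",
                "--input", "/data/s01/DUMMY_DATA/1.csv",
                "--zookeeper", "zk-journalnode-lv-101:2181"
            });
        System.exit(exitCode);
    }
}
{code}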
> FileNotFoundException found when schema is in lowercase
> -------------------------------------------------------
>
> Key: PHOENIX-5361
> URL: https://issues.apache.org/jira/browse/PHOENIX-5361
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.13.0
> Environment: *Hadoop*: 2.6.0-cdh5.9.2
> *Phoenix*: 4.13
> *HBase*: 1.2.0-cdh5.9.2
> *Java*: 8
> Reporter: Aritra Nayak
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)