Posted to user@impala.apache.org by 俊杰陈 <cj...@gmail.com> on 2017/03/10 06:30:18 UTC

Impala Failed to read file from HDFS

Hi,
I'm using the latest Impala built from GitHub, and I set up a two-node Impala
cluster as follows:
node-1: statestored, catalogd, namenode, datanode.
node-2: impalad, datanode.

Then I created a database and a table, and loaded data from an external Parquet
file into the table. Everything was OK, but when I executed a query it failed
with the following message:

Failed to open HDFS file
file:/user/hive/warehouse/parquet_data.db/test/1.parquet
Error(2): No such file or directory

But I can still run 'desc test'. Has anyone run into this? Thanks in advance.



-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by Alex Behm <al...@cloudera.com>.

Might be an issue with your 'fs.defaultFS' configuration in core-site.xml.
It should point to your NameNode.
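
For illustration, here is a minimal Python sketch of one way to check that setting. When fs.defaultFS is not set, Hadoop falls back to file:///, which would be consistent with the file:/ paths reported in this thread. The sample XML, the host name node-1 and the port 8020 are placeholders (assumptions, not values taken from this cluster), and the helper names default_fs and points_to_hdfs are made up for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical core-site.xml content; "node-1" and port 8020 are placeholders.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node-1:8020</value>
  </property>
</configuration>
"""


def default_fs(core_site_xml):
    """Return the fs.defaultFS value from core-site.xml text, or None if unset."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            return prop.findtext("value", "").strip()
    return None


def points_to_hdfs(core_site_xml):
    """True only when fs.defaultFS is set and uses the hdfs scheme."""
    value = default_fs(core_site_xml)
    return value is not None and urlparse(value).scheme == "hdfs"


if __name__ == "__main__":
    print(default_fs(SAMPLE_CORE_SITE))
    print(points_to_hdfs(SAMPLE_CORE_SITE))
```

In practice one would read the core-site.xml that the Impala and Hadoop processes on each node actually load and pass its text to these helpers; where that file lives depends on the installation.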


Re: Impala Failed to read file from HDFS

Posted by 俊杰陈 <cj...@gmail.com>.

The issue might be that the original parquet_data database was created against
a local path. I tried again and created a new database without specifying the
LOCATION parameter, and "desc database parquet_data" shows that it is stored at
an HDFS location. I'm not sure how I ended up creating a database stored on the
local file system.




-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by 俊杰陈 <cj...@gmail.com>.

Hi,
Please see the following:
[bdpe30-cjj:21000] > create table test2 like parquet 'hdfs:///data/2.parquet' stored as parquet;
Query: create table test2 like parquet 'hdfs:///data/2.parquet' stored as parquet

Fetched 0 row(s) in 0.21s
[bdpe30-cjj:21000] > show create table test2;
Query: show create table test2
+------------------------------------------------------------+
| result                                                     |
+------------------------------------------------------------+
| CREATE TABLE parquet_data.test2 (                          |
|   a STRING COMMENT 'Inferred from Parquet file.',          |
|   b STRING COMMENT 'Inferred from Parquet file.',          |
|   c BIGINT COMMENT 'Inferred from Parquet file.',          |
|   d INT COMMENT 'Inferred from Parquet file.',             |
|   e INT COMMENT 'Inferred from Parquet file.',             |
|   f BIGINT COMMENT 'Inferred from Parquet file.',          |
|   g INT COMMENT 'Inferred from Parquet file.',             |
|   aa STRING COMMENT 'Inferred from Parquet file.',         |
|   bb BIGINT COMMENT 'Inferred from Parquet file.',         |
|   cc INT COMMENT 'Inferred from Parquet file.',            |
|   dd STRING COMMENT 'Inferred from Parquet file.',         |
|   ee BIGINT COMMENT 'Inferred from Parquet file.',         |
|   ff BIGINT COMMENT 'Inferred from Parquet file.',         |
|   gg BIGINT COMMENT 'Inferred from Parquet file.',         |
|   h BIGINT COMMENT 'Inferred from Parquet file.',          |
|   i STRING COMMENT 'Inferred from Parquet file.',          |
|   j STRING COMMENT 'Inferred from Parquet file.',          |
|   k INT COMMENT 'Inferred from Parquet file.',             |
|   l STRING COMMENT 'Inferred from Parquet file.',          |
|   m STRING COMMENT 'Inferred from Parquet file.',          |
|   n STRING COMMENT 'Inferred from Parquet file.',          |
|   hh STRING COMMENT 'Inferred from Parquet file.',         |
|   ii STRING COMMENT 'Inferred from Parquet file.',         |
|   jj INT COMMENT 'Inferred from Parquet file.',            |
|   kk INT COMMENT 'Inferred from Parquet file.',            |
|   ll BIGINT COMMENT 'Inferred from Parquet file.',         |
|   mm INT COMMENT 'Inferred from Parquet file.',            |
|   nn BIGINT COMMENT 'Inferred from Parquet file.',         |
|   o STRING COMMENT 'Inferred from Parquet file.',          |
|   p BIGINT COMMENT 'Inferred from Parquet file.',          |
|   q BIGINT COMMENT 'Inferred from Parquet file.',          |
|   r BIGINT COMMENT 'Inferred from Parquet file.',          |
|   s INT COMMENT 'Inferred from Parquet file.',             |
|   t INT COMMENT 'Inferred from Parquet file.',             |
|   u INT COMMENT 'Inferred from Parquet file.',             |
|   v INT COMMENT 'Inferred from Parquet file.',             |
|   w INT COMMENT 'Inferred from Parquet file.',             |
|   oo INT COMMENT 'Inferred from Parquet file.',            |
|   pp BIGINT COMMENT 'Inferred from Parquet file.',         |
|   qq INT COMMENT 'Inferred from Parquet file.',            |
|   rr BIGINT COMMENT 'Inferred from Parquet file.',         |
|   ss INT COMMENT 'Inferred from Parquet file.',            |
|   tt BIGINT COMMENT 'Inferred from Parquet file.',         |
|   u1 INT COMMENT 'Inferred from Parquet file.',            |
|   v1 INT COMMENT 'Inferred from Parquet file.',            |
|   w1 INT COMMENT 'Inferred from Parquet file.',            |
|   x STRING COMMENT 'Inferred from Parquet file.',          |
|   y INT COMMENT 'Inferred from Parquet file.',             |
|   z INT COMMENT 'Inferred from Parquet file.',             |
|   uu STRING COMMENT 'Inferred from Parquet file.',         |
|   vv STRING COMMENT 'Inferred from Parquet file.',         |
|   ww INT COMMENT 'Inferred from Parquet file.',            |
|   xx INT COMMENT 'Inferred from Parquet file.',            |
|   yy BIGINT COMMENT 'Inferred from Parquet file.',         |
|   zz BIGINT COMMENT 'Inferred from Parquet file.',         |
|   aaa STRING COMMENT 'Inferred from Parquet file.',        |
|   bbb STRING COMMENT 'Inferred from Parquet file.',        |
|   ccc STRING COMMENT 'Inferred from Parquet file.'         |
| )                                                          |
| STORED AS PARQUET                                          |
| LOCATION 'file:/user/hive/warehouse/parquet_data.db/test2' |
|                                                            |
+------------------------------------------------------------+
Fetched 1 row(s) in 3.22s
[bdpe30-cjj:21000] > select * from test2 limit 5;
Query: select * from test2 limit 5
Query submitted at: 2017-03-13 02:05:40 (Coordinator: http://bdpe30-cjj:25000)
Query progress can be monitored at:
http://bdpe30-cjj:25000/query_plan?query_id=e444cb64be71c69:96546e2600000000

Fetched 0 row(s) in 0.28s
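
The LOCATION line in the output above uses the file: scheme, and the query returns no rows. As a rough sketch (not part of Impala), the scheme can be pulled out of SHOW CREATE TABLE text like this in Python. The function name location_scheme is made up for this example, and the hdfs:// line with node-1:8020 is a hypothetical placeholder for comparison.

```python
import re
from urllib.parse import urlparse


def location_scheme(show_create_output):
    """Return the URI scheme of the LOCATION clause, or None if there is no such clause."""
    match = re.search(r"LOCATION\s+'([^']+)'", show_create_output)
    if match is None:
        return None
    return urlparse(match.group(1)).scheme


# Line copied from the SHOW CREATE TABLE output posted in this thread.
thread_line = "| LOCATION 'file:/user/hive/warehouse/parquet_data.db/test2' |"
# Hypothetical HDFS-backed equivalent; "node-1:8020" is a placeholder authority.
hdfs_line = "LOCATION 'hdfs://node-1:8020/user/hive/warehouse/parquet_data.db/test2'"

if __name__ == "__main__":
    print(location_scheme(thread_line))  # file
    print(location_scheme(hdfs_line))    # hdfs
```

A scheme of "file" means the table directory resolves to the local file system of whichever node touches it, so a file written on one node is not visible to the others.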


2017-03-10 17:12 GMT+08:00 Jeszy <je...@gmail.com>:

> The above looks like you accidentally created the table in a different
> database - can you repro the 'file:/' error and paste 'show create
> table' of that table?
>
> On Fri, Mar 10, 2017 at 8:25 AM, 俊杰陈 <cj...@gmail.com> wrote:
> > Hi,
> > Please see the following output. On node bdpe822n2, it worked well. I don't
> > know why it looks weird today.
> >
> > [bdpe822n2:21000] > create table test like parquet 'hdfs:///data/1.parquet' stored as parquet;
> > Query: create table test like parquet 'hdfs:///data/1.parquet' stored as parquet
> >
> > Fetched 0 row(s) in 0.14s
> > [bdpe822n2:21000] > load data inpath "hdfs:///data/1.parquet" into table test;
> > Query: load data inpath "hdfs:///data/1.parquet" into table test
> > +----------------------------------------------------------+
> > | summary                                                  |
> > +----------------------------------------------------------+
> > | Loaded 1 file(s). Total files in destination location: 2 |
> > +----------------------------------------------------------+
> > Fetched 1 row(s) in 3.39s
> > [bdpe822n2:21000] > refresh test;
> > Query: refresh test
> > Query submitted at: 2017-03-10 14:46:54 (Coordinator: http://bdpe822n2:25000)
> > Query progress can be monitored at:
> > http://bdpe822n2:25000/query_plan?query_id=4d4ad8038a0362d3:8c7b326a00000000
> >
> > Fetched 0 row(s) in 0.09s
> > [bdpe822n2:21000] > show create table parquet_data.test;
> > Query: show create table parquet_data.test
> > ERROR: AnalysisException: Table does not exist: parquet_data.test
> >
> > [bdpe822n2:21000] > use parquet_data;
> > Query: use parquet_data
> > [bdpe822n2:21000] > show tables;
> > Query: show tables
> >
> > Fetched 0 row(s) in 0.02s
> >
> >
> >
> >
> > 2017-03-10 15:13 GMT+08:00 Sailesh Mukil <sa...@cloudera.com>:
> >>
> >> Hi,
> >>
> >> Can you do a 'show create table parquet_data.test;' and paste the output?
> >>
> >> On Thu, Mar 9, 2017 at 11:09 PM, 俊杰陈 <cj...@gmail.com> wrote:
> >>>
> >>> Plus:
> >>>
> >>> In my root directory I found
> >>> user/hive/warehouse/parquet_data.db/test/2.parquet. So it seems impalad is
> >>> operating on the local file system. How do I configure this?
> >>>
> >>> 2017-03-10 15:03 GMT+08:00 俊杰陈 <cj...@gmail.com>:
> >>>>
> >>>> Thanks for the quick reply :)
> >>>>
> >>>> 1.parquet has always been in HDFS. I also ran the following commands for your
> >>>> reference; please note the URI, which starts with file:. It looks weird.
> >>>>
> >>>> [bdpe30-cjj:21000] > use parquet_data;
> >>>> Query: use parquet_data
> >>>> [bdpe30-cjj:21000] > load data inpath "hdfs:///data/2.parquet" into table test;
> >>>> Query: load data inpath "hdfs:///data/2.parquet" into table test
> >>>> +----------------------------------------------------------+
> >>>> | summary                                                  |
> >>>> +----------------------------------------------------------+
> >>>> | Loaded 1 file(s). Total files in destination location: 2 |
> >>>> +----------------------------------------------------------+
> >>>> Fetched 1 row(s) in 0.50s
> >>>> [bdpe30-cjj:21000] > select count(*) from test;
> >>>> Query: select count(*) from test
> >>>> Query submitted at: 2017-03-10 07:14:45 (Coordinator: http://bdpe30-cjj:25000)
> >>>> Query progress can be monitored at:
> >>>> http://bdpe30-cjj:25000/query_plan?query_id=5d4ecce7d21182cc:e2dd7f5700000000
> >>>> WARNINGS:
> >>>> Failed to open HDFS file
> >>>> file:/user/hive/warehouse/parquet_data.db/test/1.parquet
> >>>> Error(2): No such file or directory
> >>>>
> >>>>
> >>>> It seems the load operation read the data from HDFS but did not put it into
> >>>> the right place for the query. Also, impalad seems to access the file on the
> >>>> local file system.
> >>>>
> >>>>
> >>>> 2017-03-10 14:48 GMT+08:00 Jeszy <je...@gmail.com>:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> Sounds like Impala expected 1.parquet to be in the folder, but it
> >>>>> wasn't.
> >>>>> You probably forgot to do 'refresh <table>' after altering data from
> >>>>> the outside.
> >>>>>
> >>>>> HTH
> >>>>>
> >>>>> On Fri, Mar 10, 2017 at 7:30 AM, 俊杰陈 <cj...@gmail.com> wrote:
> >>>>> > Hi,
> >>>>> > I'm using latest impala built from github,  and setup impala
> cluster
> >>>>> > with
> >>>>> > 2-nodes like below:
> >>>>> > node-1: statestored, catalogd, namenode,datanode.
> >>>>> > node-2: impalad, datanode.
> >>>>> >
> >>>>> > Then I created database and table, loaded data from external
> parquet
> >>>>> > file
> >>>>> > into table. Everything was OK, but when I executed a query it
> failed
> >>>>> > with
> >>>>> > following message:
> >>>>> >
> >>>>> > Failed to open HDFS file
> >>>>> > file:/user/hive/warehouse/parquet_data.db/test/1.parquet
> >>>>> > Error(2): No such file or directory
> >>>>> >
> >>>>> > But I can still ‘desc test’. Anyone met with this? Thanks in
> >>>>> > advanced.
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > --
> >>>>> > Thanks & Best Regards
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks & Best Regards
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Thanks & Best Regards
> >>
> >>
> >
> >
> >
> > --
> > Thanks & Best Regards
>



-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by Jeszy <je...@gmail.com>.

The above looks like you accidentally created the table in a different
database - can you repro the 'file:/' error and paste 'show create
table' of that table?
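
Once the 'show create table' output is available, the filesystem the table lives on can be read from its LOCATION clause. The following is only a minimal sketch of that check. The sample text and the helper name `table_location_scheme` are hypothetical, and it assumes the output contains a LOCATION '...' line.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def table_location_scheme(show_create_output: str) -> Optional[str]:
    """Return the URI scheme of the LOCATION clause, or None if there is none."""
    match = re.search(r"LOCATION\s+'([^']+)'", show_create_output)
    if match is None:
        return None
    return urlparse(match.group(1)).scheme


# Hypothetical output, shaped like a SHOW CREATE TABLE result.
sample = """
CREATE TABLE parquet_data.test (
  id INT
)
STORED AS PARQUET
LOCATION 'file:/user/hive/warehouse/parquet_data.db/test'
"""

# A scheme of "file" means the table points at the local filesystem, not HDFS.
print(table_location_scheme(sample))
```

A result of "hdfs" would instead indicate that the table itself is fine and the problem lies elsewhere.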

On Fri, Mar 10, 2017 at 8:25 AM, 俊杰陈 <cj...@gmail.com> wrote:
> Hi
> Please see the following output. On node bdpe822n2 it worked well. I don't
> know why it looks weird today.
>
> [bdpe822n2:21000] > create table test like parquet 'hdfs:///data/1.parquet'
> stored as parquet;
> Query: create table test like parquet 'hdfs:///data/1.parquet' stored as
> parquet
>
> Fetched 0 row(s) in 0.14s
> [bdpe822n2:21000] > load data inpath "hdfs:///data/1.parquet" into table
> test;
> Query: load data inpath "hdfs:///data/1.parquet" into table test
> +----------------------------------------------------------+
> | summary                                                  |
> +----------------------------------------------------------+
> | Loaded 1 file(s). Total files in destination location: 2 |
> +----------------------------------------------------------+
> Fetched 1 row(s) in 3.39s
> [bdpe822n2:21000] > refresh test;
> Query: refresh test
> Query submitted at: 2017-03-10 14:46:54 (Coordinator:
> http://bdpe822n2:25000)
> Query progress can be monitored at:
> http://bdpe822n2:25000/query_plan?query_id=4d4ad8038a0362d3:8c7b326a00000000
>
> Fetched 0 row(s) in 0.09s
> [bdpe822n2:21000] > show create table parquet_data.test;
> Query: show create table parquet_data.test
> ERROR: AnalysisException: Table does not exist: parquet_data.test
>
> [bdpe822n2:21000] > use parquet_data;
> Query: use parquet_data
> [bdpe822n2:21000] > show tables;
> Query: show tables
>
> Fetched 0 row(s) in 0.02s
>
>
>
>

Re: Impala Failed to read file from HDFS

Posted by 俊杰陈 <cj...@gmail.com>.

Hi,
Please see the following output. On node bdpe822n2 it worked well. I don't
know why it looks weird today.

[bdpe822n2:21000] > create table test like parquet 'hdfs:///data/1.parquet' stored as parquet;
Query: create table test like parquet 'hdfs:///data/1.parquet' stored as parquet

Fetched 0 row(s) in 0.14s
[bdpe822n2:21000] > load data inpath "hdfs:///data/1.parquet" into table test;
Query: load data inpath "hdfs:///data/1.parquet" into table test
+----------------------------------------------------------+
| summary                                                  |
+----------------------------------------------------------+
| Loaded 1 file(s). Total files in destination location: 2 |
+----------------------------------------------------------+
Fetched 1 row(s) in 3.39s
[bdpe822n2:21000] > refresh test;
Query: refresh test
Query submitted at: 2017-03-10 14:46:54 (Coordinator: http://bdpe822n2:25000)
Query progress can be monitored at:
http://bdpe822n2:25000/query_plan?query_id=4d4ad8038a0362d3:8c7b326a00000000

Fetched 0 row(s) in 0.09s
[bdpe822n2:21000] > show create table parquet_data.test;
Query: show create table parquet_data.test
ERROR: AnalysisException: Table does not exist: parquet_data.test

[bdpe822n2:21000] > use parquet_data;
Query: use parquet_data
[bdpe822n2:21000] > show tables;
Query: show tables

Fetched 0 row(s) in 0.02s




2017-03-10 15:13 GMT+08:00 Sailesh Mukil <sa...@cloudera.com>:

> Hi,
>
> Can you do a 'show create table parquet_data.test;'  and paste the output?


-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by Sailesh Mukil <sa...@cloudera.com>.

Hi,

Can you run 'show create table parquet_data.test;' and paste the output?

On Thu, Mar 9, 2017 at 11:09 PM, 俊杰陈 <cj...@gmail.com> wrote:

> In addition:
>
> In my root directory I found user/hive/warehouse/parquet_data.db/test/2.parquet,
> so it seems impalad is operating on the local file system. How do I
> configure this?

Re: Impala Failed to read file from HDFS

Posted by 俊杰陈 <cj...@gmail.com>.

In addition:

In my root directory I found
user/hive/warehouse/parquet_data.db/test/2.parquet,
so it seems impalad is operating on the local file system. How do I
configure this?
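
Another reply in this thread suggests checking 'fs.defaultFS' in core-site.xml. When that property is not set, Hadoop falls back to its built-in default of file:///, which would explain paths being resolved on the local disk. The sketch below shows one way to read the property. The XML snippets and the host name namenode-host are made up for illustration and are not taken from this cluster.

```python
import xml.etree.ElementTree as ET

# Hadoop's built-in default when fs.defaultFS is not configured.
HADOOP_DEFAULT_FS = "file:///"


def default_fs(core_site_xml: str) -> str:
    """Return the fs.defaultFS value from core-site.xml text, or Hadoop's default."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            return (prop.findtext("value") or "").strip()
    return HADOOP_DEFAULT_FS


# Hypothetical configuration that points at a NameNode.
configured = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>
"""

# Hypothetical configuration with the property missing.
unconfigured = "<configuration></configuration>"

print(default_fs(configured))
print(default_fs(unconfigured))
```

If the second case matches what the impalad and catalogd hosts see, pointing the property at the NameNode and restarting the services is the likely fix.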

2017-03-10 15:03 GMT+08:00 俊杰陈 <cj...@gmail.com>:

> Thanks for the quick reply :)
>
> 1.parquet has always been in HDFS. I also ran the following commands for your
> reference; please note the URI, which starts with file:. It looks weird.
>
> [bdpe30-cjj:21000] > use parquet_data;
> Query: use parquet_data
> [bdpe30-cjj:21000] > load data inpath "hdfs:///data/2.parquet" into table
> test;
> Query: load data inpath "hdfs:///data/2.parquet" into table test
> +----------------------------------------------------------+
> | summary                                                  |
> +----------------------------------------------------------+
> | Loaded 1 file(s). Total files in destination location: 2 |
> +----------------------------------------------------------+
> Fetched 1 row(s) in 0.50s
> [bdpe30-cjj:21000] > select count(*) from test;
> Query: select count(*) from test
> Query submitted at: 2017-03-10 07:14:45 (Coordinator:
> http://bdpe30-cjj:25000)
> Query progress can be monitored at:
> http://bdpe30-cjj:25000/query_plan?query_id=5d4ecce7d21182cc:e2dd7f5700000000
> WARNINGS:
> Failed to open HDFS file
> file:/user/hive/warehouse/parquet_data.db/test/1.parquet
> Error(2): No such file or directory
>
>
> It seems the load operation read the data from HDFS but did not put it in the
> right place for the query. Also, impalad seems to be accessing the file on the
> local file system.



-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by 俊杰陈 <cj...@gmail.com>.

Thanks for the quick reply :)

1.parquet has always been in HDFS. I also ran the following commands for your
reference; please note the URI, which starts with file:. It looks weird.

[bdpe30-cjj:21000] > use parquet_data;
Query: use parquet_data
[bdpe30-cjj:21000] > load data inpath "hdfs:///data/2.parquet" into table test;
Query: load data inpath "hdfs:///data/2.parquet" into table test
+----------------------------------------------------------+
| summary                                                  |
+----------------------------------------------------------+
| Loaded 1 file(s). Total files in destination location: 2 |
+----------------------------------------------------------+
Fetched 1 row(s) in 0.50s
[bdpe30-cjj:21000] > select count(*) from test;
Query: select count(*) from test
Query submitted at: 2017-03-10 07:14:45 (Coordinator: http://bdpe30-cjj:25000)
Query progress can be monitored at:
http://bdpe30-cjj:25000/query_plan?query_id=5d4ecce7d21182cc:e2dd7f5700000000
WARNINGS:
Failed to open HDFS file
file:/user/hive/warehouse/parquet_data.db/test/1.parquet
Error(2): No such file or directory


It seems the load operation read the data from HDFS but did not put it in the
right place for the query. Also, impalad seems to be accessing the file on the
local file system.
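
To make the observation about the file: prefix concrete, here is a small sketch that pulls the path out of the warning text and compares its scheme with the one used in the load statement. The helper name `failed_path_is_local` is hypothetical, and the regular expression assumes the warning has the same shape as the one pasted above.

```python
import re
from urllib.parse import urlparse

WARNING = """Failed to open HDFS file
file:/user/hive/warehouse/parquet_data.db/test/1.parquet
Error(2): No such file or directory"""


def failed_path_is_local(warning: str) -> bool:
    """True if the path named in the warning uses the local file: scheme."""
    match = re.search(r"Failed to open HDFS file\s+(\S+)", warning)
    if match is None:
        raise ValueError("no failed path found in warning text")
    return urlparse(match.group(1)).scheme == "file"


# The source of the load used the hdfs scheme, while the table data path does not.
source_scheme = urlparse("hdfs:///data/2.parquet").scheme

print(source_scheme)
print(failed_path_is_local(WARNING))
```

The mismatch between the two schemes is what suggests the table location, rather than the loaded file, is the thing to investigate.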


2017-03-10 14:48 GMT+08:00 Jeszy <je...@gmail.com>:

> Hello,
>
> Sounds like Impala expected 1.parquet to be in the folder, but it wasn't.
> You probably forgot to do 'refresh <table>' after altering data from
> the outside.
>
> HTH



-- 
Thanks & Best Regards

Re: Impala Failed to read file from HDFS

Posted by Jeszy <je...@gmail.com>.

Hello,

Sounds like Impala expected 1.parquet to be in the folder, but it wasn't.
You probably forgot to do 'refresh <table>' after altering data from
the outside.

HTH

On Fri, Mar 10, 2017 at 7:30 AM, 俊杰陈 <cj...@gmail.com> wrote:
> Hi,
> I'm using latest impala built from github,  and setup impala cluster with
> 2-nodes like below:
> node-1: statestored, catalogd, namenode,datanode.
> node-2: impalad, datanode.
>
> Then I created database and table, loaded data from external parquet file
> into table. Everything was OK, but when I executed a query it failed with
> following message:
>
> Failed to open HDFS file
> file:/user/hive/warehouse/parquet_data.db/test/1.parquet
> Error(2): No such file or directory
>
> But I can still ‘desc test’. Anyone met with this? Thanks in advanced.
>
>
>
> --
> Thanks & Best Regards