Posted to user@impala.apache.org by Quanlong Huang <hu...@gmail.com> on 2021/04/02 03:20:19 UTC

Re: Issue in creating tables in S3 in Impala 3.4.0

I think the logs of catalogd and HMS have more details about this error. Could
you find and share the stack trace of this exception?

BTW, is your Impala 3.4 able to create other kinds of tables, e.g. an HDFS
table or a Kudu table?
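
For example, a quick way to isolate the failure is to run the same DDL
against different storage layers (the table names and bucket URI below are
only placeholders):

```sql
-- Placeholders only: substitute your own names and bucket.
CREATE TABLE probe_default (id INT) STORED AS PARQUET;   -- default FS
CREATE TABLE probe_s3 (id INT) STORED AS PARQUET
  LOCATION 's3a://your-bucket/probe';                    -- S3-backed
```

If only the S3-backed statement fails, the problem is more likely in the
path between catalogd/HMS and S3 than in table creation in general.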

On Wed, Mar 31, 2021 at 9:59 PM Hashan Gayasri <ha...@gmail.com>
wrote:

> Hi all,
>
> I'm getting a Thrift error when trying to create an S3-backed table in
> Impala 3.4.0. This worked without any issues in Impala 3.3.0.
>
> Output from impala-shell:
>
>> Server version: impalad version 3.4.0-RELEASE RELEASE (build Could not
>> obtain git hash)
>>
>> ***********************************************************************************
>> Welcome to the Impala shell.
>> (Impala Shell v3.4.0-RELEASE (Could) built on Wed Feb 24 23:52:04 GMT
>> 2021)
>>
>> After running a query, type SUMMARY to see a summary of where time was
>> spent.
>>
>> ***********************************************************************************
>> [localhost:11432] default> CREATE TABLE parqt_table_s3 (time INT)
>> PARTITIONED BY (time_part INT) STORED AS PARQUET LOCATION
>> 's3a://hashan-0011220-test-s3/Sample_Data';
>> Query: CREATE TABLE parqt_table_s3 (time INT) PARTITIONED BY (time_part
>> INT) STORED AS PARQUET LOCATION
>> 's3a://hashan-0011220-test-s3/Sample_Data'
>> ERROR: ImpalaRuntimeException: Error making 'createTable' RPC to Hive
>> Metastore:
>> CAUSED BY: TTransportException: null
>>
>
> Log (impalad):
>
>> I0331 14:31:56.494190 1732 Frontend.java:1487]
>> d54db09df4e5acfe:4e10d11500000000] Analyzing query: CREATE TABLE
>> parqt_table_s3 (time INT) PARTITIONED BY (time_part INT) STORED AS PARQUET
>> LOCATION 's3a://hashan-0011220-test-s3/Sample_Data' db: default
>> I0331 14:31:56.600950 1732 Frontend.java:1529]
>> d54db09df4e5acfe:4e10d11500000000] Analysis and authorization finished.
>> I0331 14:31:57.650727 1732 client-request-state.cc:211]
>> d54db09df4e5acfe:4e10d11500000000] ImpalaRuntimeException: Error making
>> 'createTable' RPC to Hive Metastore:
>> CAUSED BY: TTransportException: null
>>
>
> Is this a known issue / has someone faced this before?
>
>
> Regards,
> Hashan Gayasri
>
>

Re: Issue in creating tables in S3 in Impala 3.4.0

Posted by Hashan Gayasri <ha...@gmail.com>.
Hi Quanlong,
Yes, I've tried creating tables stored as Kudu and as Parquet (local FS)
without any issues.

The stack trace of the exception was as follows:

Java exception follows:
> org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1191)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1177)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2634)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:827)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:813)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:150)
>     at com.sun.proxy.$Proxy7.createTable(Unknown Source)
>     at org.apache.impala.service.CatalogOpExecutor.createTable(CatalogOpExecutor.java:2412)
>     at org.apache.impala.service.CatalogOpExecutor.createTable(CatalogOpExecutor.java:2239)
>     at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:384)
>     at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:173)
> E0505 16:16:28.718776 27445 catalog-server.cc:114] ImpalaRuntimeException:
> Error making 'createTable' RPC to Hive Metastore:
> CAUSED BY: TTransportException: null
>

I've also uploaded all the logs (catalogd, impalad, kudu-master,
kudu-tserver and statestored) here:
https://drive.google.com/drive/folders/1ItiNkFbxTP78CIp4w7yMHnXMKI8mphh8?usp=sharing
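
To pull the relevant lines back out of those logs, the exception can be
grepped from the catalogd log; the path is an assumption (check your
--log_dir), so the snippet demonstrates the command on a synthetic two-line
excerpt shaped like the log quoted above:

```shell
# Real file would be e.g. "$IMPALA_LOG_DIR/catalogd.INFO" (assumed path).
log=$(mktemp)
cat > "$log" <<'EOF'
I0331 14:31:57.650727  1732 client-request-state.cc:211] ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
CAUSED BY: TTransportException: null
EOF
# Count occurrences; swap -c for -B2 -A5 to see surrounding context instead.
grep -c "TTransportException" "$log"   # prints 1 for this excerpt
rm -f "$log"
```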


From the following Impala session:

> hashan@ip-10-10-10-139.ap-southeast-1.compute.internal:/mnt/hashan/logs>
> impala-shell -i 0.0.0.0:11432
> Starting Impala Shell without Kerberos authentication
> Opened TCP connection to 0.0.0.0:11432
> Connected to 0.0.0.0:11432
> Server version: impalad version 3.4.0-RELEASE RELEASE (build Could not
> obtain git hash)
>
> ***********************************************************************************
> Welcome to the Impala shell.
> (Impala Shell v3.4.0-RELEASE (Could) built on Wed Feb 24 23:52:04 GMT 2021)
>
> When pretty-printing is disabled, you can use the '--output_delimiter'
> flag to set
> the delimiter for fields in the same row. The default is '\t'.
>
> ***********************************************************************************
> [0.0.0.0:11432] default> create table sample_data_s3_2xx (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string) stored as parquet location 's3a://hg-hashan-test-1-s3/test-tblxx';
> Query: create table sample_data_s3_2xx (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string) stored as
> parquet location 's3a://hg-hashan-test-1-s3/test-tblxx'
> ERROR: ImpalaRuntimeException: Error making 'createTable' RPC to Hive
> Metastore:
> CAUSED BY: TTransportException: null
>
> [0.0.0.0:11432] default> create table sample_data_s3_2xxloc (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string) stored as parquet location '/tmp/tbllocal';
> Query: create table sample_data_s3_2xxloc (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string) stored as
> parquet location '/tmp/tbllocal'
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.37s
> [0.0.0.0:11432] default> create table sample_data_s3_2xxkudu (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string) stored as kudu;
> Query: create table sample_data_s3_2xxkudu (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string) stored as
> kudu
> ERROR: AnalysisException: A primary key is required for a Kudu table.
>
> [0.0.0.0:11432] default> create table sample_data_s3_2xxkudu (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string, PRIMARY KEY(id)) stored as kudu;
> Query: create table sample_data_s3_2xxkudu (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string, PRIMARY
> KEY(id)) stored as kudu
> ERROR: ImpalaRuntimeException: Error creating Kudu table
> 'impala::default.sample_data_s3_2xxkudu'
> CAUSED BY: NonRecoverableException: not enough live tablet servers to
> create a table with the requested replication factor 3; 1 tablet servers
> are alive
>
> [0.0.0.0:11432] default> create table sample_data_s3_2xxkudu (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string, PRIMARY KEY(id)) TBLPROPERTIES ('kudu.num_tablet_replicas' = '1')
> stored as kudu;
> Query: create table sample_data_s3_2xxkudu (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string, PRIMARY
> KEY(id)) TBLPROPERTIES ('kudu.num_tablet_replicas' = '1') stored as kudu
> ERROR: ParseException: Syntax error in line 1:
> ...m_tablet_replicas' = '1') stored as kudu
>                              ^
> Encountered: STORED
> Expected: AS
>
> CAUSED BY: Exception: Syntax error
>
> [0.0.0.0:11432] default> create table sample_data_s3_2xxkudu (id int, val
> int, zerofill string,name string, assertion boolean, city string, state
> string, PRIMARY KEY(id)) stored as kudu TBLPROPERTIES
> ('kudu.num_tablet_replicas' = '1');
> Query: create table sample_data_s3_2xxkudu (id int, val int, zerofill
> string,name string, assertion boolean, city string, state string, PRIMARY
> KEY(id)) stored as kudu TBLPROPERTIES ('kudu.num_tablet_replicas' = '1')
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> WARNINGS: Unpartitioned Kudu tables are inefficient for large data sizes.
>
> Fetched 1 row(s) in 0.27s
>

PS: Sorry for the late reply.

Regards,
Hashan Gayasri

