Posted to user@hadoop.apache.org by "Vitale, Tom " <th...@credit-suisse.com> on 2015/02/26 17:15:04 UTC

Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why?

I used Sqoop to import an MS SQL Server table into an Avro file on HDFS without any problem. Then I tried to create an external Impala table using the following DDL:

CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable';

I got the error "ERROR: AnalysisException: Error loading Avro schema: No Avro schema provided in SERDEPROPERTIES or TBLPROPERTIES for table: default.AvroTable"

So I extracted the schema from the Avro file into a JSON file using avro-tools-1.7.4.jar (getschema), then, as the error message suggests, changed the DDL to point to it:

CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable'
TBLPROPERTIES (
        'serialization.format'='1',
        'avro.schema.url'='hdfs://xxxxxxxx.xxxxxxxx.xxxxxxxx.net/tmp/AvroTable.schema'
);

This worked fine. But my question is: why is this step necessary? The schema is already embedded in the Avro file; that is where I got the JSON schema file that the TBLPROPERTIES parameter points to!
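
To illustrate the point that the schema already travels with the data, here is a minimal sketch that reads the header of an Avro object container file and returns the JSON schema stored under the avro.schema metadata key. It follows the container layout described in the Avro specification (a 4-byte magic, then a metadata map of string keys to byte values). The function names are hypothetical, chosen only for this example, and the sketch says nothing about how Impala itself loads schemas.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long: a variable-length, zigzag-encoded integer."""
    shift = 0
    raw = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro header")
        byte = chunk[0]
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (also used for Avro strings)."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_schema(stream):
    """Return the schema embedded in an Avro container file header."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])
```

Assuming a local copy of one of the imported data files (the file name here is a placeholder), it could be used as: with open("part-m-00000.avro", "rb") as f: print(read_avro_schema(f)). This should yield the same schema that the avro-tools getschema command prints.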

Thanks, Tom

Tom Vitale
CREDIT SUISSE
Information Technology | Infra Arch & Strategy NY, KIVP
Eleven Madison Avenue | 10010-3629 New York | United States
Phone +1 212 538 0708
thomas.vitale@credit-suisse.com | www.credit-suisse.com




=============================================================================== 
Please access the attached hyperlink for an important electronic communications disclaimer: 
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 
=============================================================================== 

Re: Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why?

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi,

Impala is a Cloudera product, so you may get better help on the impala-user list:
https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user

BR, 
 Alex



