Posted to user@hive.apache.org by Joaquin Alzola <Jo...@lebara.com> on 2016/12/13 11:37:27 UTC

PARTITION error because different columns size

Hi List

I changed Spark to 2.0.2 and Hive to 2.0.1.
I have the tables below, but the statement INSERT INTO TABLE employee_orc PARTITION (country='USA', office='HQ-TX') select * from employee where country='USA' and office='HQ-TX';
is giving me --> Cannot insert into table `default`.`employee_orc` because the number of columns are different: need 4 columns, but query has 6 columns.;

When doing a select * it is adding the partition columns as regular columns ....

CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
PARTITIONED BY(country string, office string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS employee_orc ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
PARTITIONED BY(country string, office string)
STORED AS ORC tblproperties ("orc.compress"="ZLIB");

0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH '/mnt/sample.txt.gz' INTO TABLE employee PARTITION (country='USA', office='HQ-TX');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.685 seconds)
0: jdbc:hive2://localhost:10000> select * from employee;
+-------+--------------+---------+--------------------+----------+---------+--+
|  eid  |     name     | salary  |    destination     | country  | office  |
+-------+--------------+---------+--------------------+----------+---------+--+
| 1201  | Gopal        | 45000   | Technical manager  | USA      | HQ-TX   |
| 1202  | Manisha      | 45000   | Proof reader       | USA      | HQ-TX   |
| 1203  | Masthanvali  | 40000   | Technical writer   | USA      | HQ-TX   |
| 1204  | Kiran        | 40000   | Hr Admin           | USA      | HQ-TX   |
| 1205  | Kranthi      | 30000   | Op Admin           | USA      | HQ-TX   |
+-------+--------------+---------+--------------------+----------+---------+--+
5 rows selected (0.358 seconds)
0: jdbc:hive2://localhost:10000> INSERT INTO TABLE employee_orc PARTITION (country='USA', office='HQ-TX') select * from employee where country='USA' and office='HQ-TX';
Error: org.apache.spark.sql.AnalysisException: Cannot insert into table `default`.`employee_orc` because the number of columns are different: need 4 columns, but query has 6 columns.; (state=,code=0)


0: jdbc:hive2://localhost:10000> describe employee_orc;
+--------------------------+------------+----------+--+
|         col_name         | data_type  | comment  |
+--------------------------+------------+----------+--+
| eid                      | int        | NULL     |
| name                     | string     | NULL     |
| salary                   | string     | NULL     |
| destination              | string     | NULL     |
| country                  | string     | NULL     |
| office                   | string     | NULL     |
| # Partition Information  |            |          |
| # col_name               | data_type  | comment  |
| country                  | string     | NULL     |
| office                   | string     | NULL     |
+--------------------------+------------+----------+--+

0: jdbc:hive2://localhost:10000>  describe employee;
+--------------------------+------------+----------+--+
|         col_name         | data_type  | comment  |
+--------------------------+------------+----------+--+
| eid                      | int        | NULL     |
| name                     | string     | NULL     |
| salary                   | string     | NULL     |
| destination              | string     | NULL     |
| country                  | string     | NULL     |
| office                   | string     | NULL     |
| # Partition Information  |            |          |
| # col_name               | data_type  | comment  |
| country                  | string     | NULL     |
| office                   | string     | NULL     |
+--------------------------+------------+----------+--+
10 rows selected (0.045 seconds)

Re: PARTITION error because different columns size

Posted by Furcy Pin <fu...@flaminem.com>.
Actually, there is a legacy feature in Hive that would do exactly what you need:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification
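
For the record, a rough sketch of how that could look here (untested; newer Hive versions need hive.support.quoted.identifiers set to none for a backquoted regex to be read as a column pattern rather than an identifier):

SET hive.support.quoted.identifiers=none;
-- select every column of employee except the partition columns country and office
INSERT INTO TABLE employee_orc PARTITION (country='USA', office='HQ-TX')
select `(country|office)?+.+` from employee where country='USA' and office='HQ-TX';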


Another way would simply be to use dynamic partitioning:
INSERT INTO TABLE employee_orc PARTITION (country, office) select * from
employee where country='USA' and office='HQ-TX';
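
Depending on your configuration, dynamic partitioning may need to be enabled first; a sketch of the settings that are usually required before the INSERT above:

SET hive.exec.dynamic.partition=true;
-- nonstrict allows all partition columns (here country and office) to be dynamic
SET hive.exec.dynamic.partition.mode=nonstrict;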

The only downside of this method (that I know of) is that Hive will put a
lock on every partition, even if only one is inserted.


I also know a third solution, but I can't talk about it yet.




RE: PARTITION error because different columns size

Posted by Joaquin Alzola <Jo...@lebara.com>.
Hi Suresh

I chose * and not the specific fields because I have 520 columns.
The data I tested with was only a testing ground.

I suppose then that I need to select the 520 fields. ☹




Re: PARTITION error because different columns size

Posted by Furcy Pin <fu...@flaminem.com>.
Hi Joaquin,

Suresh was faster than me ...

Also, you should check this:
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Dynamic-PartitionInsert


Re: PARTITION error because different columns size

Posted by Suresh Kumar Sethuramaswamy <ro...@gmail.com>.
Hi Joaquin

    In Hive, when you run 'select * from employee' it is going to return the
partition columns at the end as well, whereas you don't want those to be
inserted into your ORC table, so your insert query should look like:

      INSERT INTO TABLE employee_orc PARTITION (country='USA', office='HQ-TX')
      select eid, name, salary, destination from employee
      where country='USA' and office='HQ-TX';


 Remember, a partition in Hive is a physical folder name.
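
For example, with a default warehouse location the partition you loaded would typically end up under a directory like this (the exact root depends on hive.metastore.warehouse.dir):

  /user/hive/warehouse/employee/country=USA/office=HQ-TX/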

Regards
Suresh


