Posted to user@hive.apache.org by Zhenhua Chai <ch...@gmail.com> on 2013/05/20 11:18:32 UTC

Strange display when upload RCFile from local with different column.

Hello,
  I tried to import data from SQL into Hive as RCFiles. I used RCFile.Writer
to generate the rc files and uploaded them to the Hive table directory. The
rc files have different numbers of columns:
c0_1.rc has 2 columns
c1_1.rc has 3 columns
c2_1.rc has 4 columns

The output was strange:

# All rc files are loaded:
hive> select * from simple;
OK
1 foo NULL null
2 bar NULL null
3 foobar NULL null
3 haliluya NULL null
4 holy shit NULL null
4 NULL null
7 NULL null

# c1_1.rc and c2_1.rc are loaded.
hive> select * from simple;
OK
3 haliluya 2013-05-17 14:19:02 null
4 holy shit 2013-05-17 14:19:20 null
4 2013-05-17 14:19:45 null
7 2013-05-17 14:20:02 null

# only c2_1 is loaded.
hive> select * from simple;
OK
4 2013-05-17 14:19:45 4Vx�
7 2013-05-17 14:20:02 �P"�:mo�d�)�2��


I created the table with the command ``` create table foo (k int, v text, t
timestamp, v1 binary) stored as rcfile; ```
I am using hive-0.8.1, and the attachment contains the rc files.

Can anyone help solve this? Is it a known bug? Or should I use a later
version of Hive? Thanks.
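
For illustration only: a minimal sketch of one possible workaround, which is to pad every row to the table's full width before handing it to the writer, so that all rc files carry the same number of columns. The helper name pad_row and the constant TABLE_WIDTH are hypothetical; the width of 4 is taken from the create table statement above. Whether this avoids the behaviour shown above on hive-0.8.1 is untested.

```python
from typing import List, Optional, Sequence

# Assumption: the table has four columns (k, v, t, v1), as in the
# create table statement quoted in the post.
TABLE_WIDTH = 4


def pad_row(row: Sequence[Optional[str]],
            width: int = TABLE_WIDTH) -> List[Optional[str]]:
    """Return a copy of row extended with None up to `width` columns."""
    if len(row) > width:
        raise ValueError(
            "row has %d columns, table has only %d" % (len(row), width)
        )
    return list(row) + [None] * (width - len(row))


# Rows as they would go into the two-column file c0_1.rc.
rows_c0 = [["1", "foo"], ["2", "bar"]]
padded_c0 = [pad_row(r) for r in rows_c0]
```

Each padded row then has exactly four entries, with None standing in for the columns the source file did not have.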

Re: Strange display when upload RCFile from local with different column.

Posted by Jon Hartlaub <jh...@gmail.com>.
Timestamp handling, specifically TimestampWritable, has several bugs.
The latest version of Hive (0.11.0) includes some fixes needed to make it
work correctly.

