Posted to user@hive.apache.org by ra...@accenture.com on 2012/05/30 18:27:22 UTC

FW: Hive 'rest' column

Hi,

   I'm trying to define a table over an external file. My file has 12 fixed columns followed by a varying number of columns that depends on some of the fixed ones. I tried to define the table as:

CREATE EXTERNAL TABLE IF NOT EXISTS log_array (
dt              string,
txOperOpciResto string,
idRegPerf       string,
oper            string,
opcion          string,
accion          string,
servc           string,
canal           string,
platf           string,
codIdioma       string,
pais            string,
lacre           string,
dirIP           string,
restoMsg        array<string>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  COLLECTION ITEMS TERMINATED BY '|'
STORED AS SEQUENCEFILE
LOCATION '/user/hadoop-user/uc3/seq/';

So what I tried was to collect the whole varying part into an array field (restoMsg). The trick does not work because both delimiters, the one for fields and the one for collection items, are the same. My restoMsg field only gets one column and the rest are omitted.
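
One way to confirm the symptom, sketched under the assumption that the log_array table above has been created as shown. The alias n_items is only illustrative, and no particular output is claimed beyond what is described above:

```sql
-- Show the array column together with the number of elements
-- Hive actually parsed into it for a handful of rows.
SELECT restoMsg,
       size(restoMsg) AS n_items
FROM log_array
LIMIT 10;
```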

Is there any way to get that last part without custom code? If not, what classes should I create to do this, and how should I define the table then?

Thx,
   Ramón Pin


________________________________
Subject to local law, communications with Accenture and its affiliates including telephone calls and emails (including content), may be monitored by our systems for the purposes of security and the assessment of internal compliance with Accenture policy.
______________________________________________________________________________________

www.accenture.com

Re: FW: Hive 'rest' column

Posted by Gireesh Subramanya <gi...@gmail.com>.
Ramon,

If all the data is in one line, then you would need to preprocess the data,
but from your explanation it sounds like each line is terminated by a
newline character after the last | ?

Thanks,
Gireesh
vivarasystems.com


RE: Hive 'rest' column

Posted by ra...@accenture.com.
I'm reviewing that LazySerDe option right now and it seems to be what I want. Do you know of any good tutorial or documentation covering all the LazySerDe options I can use from Hive?

Thx,
   Ramón Pin



RE: Hive 'rest' column

Posted by ra...@accenture.com.
Great, that seems to be what I was looking for. Do you know of any good resource explaining all the available LazySerDe parameters?

Thx,
   Ramón Pin



Re: Hive 'rest' column

Posted by shrikanth shankar <ss...@qubole.com>.
I believe the default LazySerDe takes a parameter called 'serialization.last.column.takes.rest'. Setting this to true might solve your issue (restoMsg would then become a string, and you might have to parse it into an array in the query).
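
A minimal sketch of what this could look like, assuming the property is passed through SERDEPROPERTIES on LazySimpleSerDe and that the last column is redeclared as a string. The table name log_rest and the alias restoArr are hypothetical; the columns and location are copied from the original definition. This has not been run against the data in question:

```sql
-- Same layout as log_array, but the trailing column is a plain string
-- that is meant to receive everything after the last fixed field.
CREATE EXTERNAL TABLE IF NOT EXISTS log_rest (
  dt              string,
  txOperOpciResto string,
  idRegPerf       string,
  oper            string,
  opcion          string,
  accion          string,
  servc           string,
  canal           string,
  platf           string,
  codIdioma       string,
  pais            string,
  lacre           string,
  dirIP           string,
  restoMsg        string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = '|',
  'serialization.last.column.takes.rest' = 'true'
)
STORED AS SEQUENCEFILE
LOCATION '/user/hadoop-user/uc3/seq/';

-- Parse the remainder into an array at query time.
-- split() takes a regular expression, so the pipe has to be escaped.
SELECT dt,
       split(restoMsg, '\\|') AS restoArr
FROM log_rest
LIMIT 10;
```

Splitting at query time keeps the table definition free of custom classes; the cost is that every query needing the individual values has to call split() itself.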

thanks,
Shrikanth