Posted to user@hive.apache.org by Saurabh Nanda <sa...@gmail.com> on 2009/07/13 10:26:38 UTC

Hive SerDe?

Hi,

The DDL page in the Hive Language Manual
(http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL) refers to SerDe
(http://wiki.apache.org/hadoop/SerDe), but that page does not exist. I'm
assuming SerDe stands for Serialization-Deserialization, and is used for
importing input files which are not in a standard format.

Where can I find more information on how to use SerDe?

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
> https://issues.apache.org/jira/browse/HIVE
> Click "create new issue"



Filed https://issues.apache.org/jira/browse/HIVE-633

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Zheng Shao <zs...@gmail.com>.
https://issues.apache.org/jira/browse/HIVE
Click "create new issue"




-- 
Yours,
Zheng

Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
>
> This command does not take quotations for some historical reasons. We
> will fix it in the future.



Where do I add a bug for this?

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Zheng Shao <zs...@gmail.com>.
Hi Saurabh,

Please try:

hive> add files /tmp/testing.jar;

This command does not accept quoted arguments, for historical reasons. We
will fix it in the future.
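
For illustration, the URISyntaxException reported in this thread can be
reproduced outside Hive. The sketch below is plain Java using only
java.net.URI. It assumes the quoted argument reaches the URI parser
unchanged, which is a guess about the cause and not something confirmed
in this thread; QuotedUriDemo and parseError are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

class QuotedUriDemo {
    /**
     * Returns null if the text parses as a URI, otherwise a short
     * description built from the exception's reason and index.
     */
    static String parseError(String text) {
        try {
            new URI(text);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The leading single quote becomes the first character of the
        // scheme name, and a scheme must start with a letter.
        String[] candidates = {
            "file:///tmp/testing.jar",
            "'file:///tmp/testing.jar'",
            "'hdfs://master-hadoop:8020/tmp/testing.jar'",
            "'/tmp/testing.jar'",
        };
        for (String c : candidates) {
            String err = parseError(c);
            System.out.println(c + " -> " + (err == null ? "parses" : err));
        }
    }
}
```

The quoted bare path parses as a URI because no scheme is present and a
single quote is a legal path character, which would be consistent with
Hive reporting a missing file for that form instead of a syntax error.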

Zheng




-- 
Yours,
Zheng

Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
> I guess
> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#head-3142e65b497fb43c4367e6a642a4588c6eaf4985
> is what I was looking for. Isn't that information in the wrong place?


I'm trying to register a UDF to parse my log file format. For this purpose
I'm trying to add a local JAR file to the Hive session. No matter what I
try, Hive always says the file does not exist or throws an error.

hive> add files '/tmp/testing.jar';

hive> add files 'file:///tmp/testing.jar';  -- this throws
java.net.URISyntaxException: Illegal character in scheme name at index 0

After copying the file to HDFS:

hive> add files 'hdfs://master-hadoop:8020/tmp/testing.jar'; -- this throws
java.net.URISyntaxException: Illegal character in scheme name at index 0


What am I doing wrong?

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
>
> About user defined functions,
> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF, does not seem to
> have any information on how to create them.



I guess
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#head-3142e65b497fb43c4367e6a642a4588c6eaf4985
is what I was looking for. Isn't that information in the wrong place?

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
> You could pre-parse the input as well, create a user defined function, or
> use the streaming
> logic in MAP/REDUCE/TRANSFORM.


Are you referring to
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform for
map/reduce/transform? From that page, the following points are not very
clear:

1. How do I use a custom Java function for mapping the input?
2. Can my mapper take the whole row as a single input and transform it into
multiple columns? (All examples on the Wiki seem to have an equal number of
input parameters and output values. Can they be different? Sorry, newbie
question!)
3. Does the mapping script/function need to be present locally or on the
DFS?

As for user-defined functions,
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF does not seem to have
any information on how to create them.

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Edward Capriolo <ed...@gmail.com>.
Saurabh,

Yes, a SerDe is one way to deal with the input. You could also pre-parse
the input, create a user-defined function, or use the streaming
logic in MAP/REDUCE/TRANSFORM. SerDe is hyperlinked in the wiki
because MashingWordsTogether creates a link by default in most wiki
systems.
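
As a rough illustration of the pre-parsing option, the sketch below turns
one access-log line into tab-separated fields that a plain delimited
table could then load. It is plain Java using only the standard library.
The class name, the regular expression, and the sample line are all
assumptions modelled on the log format described in this thread; nothing
here comes from Hive itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccessLogPreParser {
    // Assumed layout: ip ident uid [timestamp] "request" status size
    // "referrer" "user agent" "cookies"
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+) "
        + "\"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\"$");

    /** Returns the ten fields joined by tabs, or null if the line does not match. */
    static String toTabDelimited(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        List<String> fields = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            fields.add(m.group(i));
        }
        return String.join("\t", fields);
    }

    public static void main(String[] args) {
        // Hypothetical sample line, not taken from a real log.
        String sample = "10.0.0.1 - alice [13/Jul/2009:10:26:38 +0530] "
            + "\"GET /location HTTP/1.1\" 200 1234 "
            + "\"http://example.com/\" \"Mozilla/5.0\" \"sid=abc\"";
        System.out.println(toTabDelimited(sample));
    }
}
```

Lines that do not match return null so the caller can decide whether to
skip or log them. Fields containing embedded double quotes or tabs would
need extra handling that this sketch does not attempt.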

Edward



Re: Hive SerDe?

Posted by Saurabh Nanda <sa...@gmail.com>.
Hi Zheng,

Is my interpretation of SerDes correct -- "I'm assuming the SerDe stands
for Serialization-Deserialization, and is used for importing input files
which are not in a standard format"?

Do I need SerDes to import an access log in the following format:

ip_address "-" apache_uid [dd/MMM/yyyy:HH:mm:ss +0530] "GET /location
HTTP/1.1" response_code response_size "referrer" "user_agent_string"
"cookies"

If possible, please could you let me know the exact CREATE TABLE and LOAD
DATA commands that I need to use to load this log file without using SerDes.

Thanks,
Saurabh.




-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com

Re: Hive SerDe?

Posted by Zheng Shao <zs...@gmail.com>.
Hi Saurabh,

In most cases, you won't need to know about SerDe.

We are writing a how-to for adding new SerDes. Until then, if you are
really interested, you might want to take a look at the code in
serde/src/org/apache/hadoop/hive/serde2.


Zheng




-- 
Yours,
Zheng