Posted to mapreduce-user@hadoop.apache.org by Divya Gehlot <di...@gmail.com> on 2015/07/04 07:20:12 UTC

how to write custom log loader and store in JSON format

Hi,
I am new to Pig and I have a log file in the format below:
(Message,NIL,2015-07-01,22:58:53.66,E,xxxxxxxxxx.xxx.xxxxx.xxx,12,0xd6,BIZ,Componentname,0,0.0,key_1=value&KEY_2=1111&KEY_3=VALUE&KEY_4=AU&KEY_5=COMPANY&KEY_6=VALUE&KEY_7=12222222&KEY_8=VALUE&KEY_9=VALUE&KEY_10=VALUE&KEY_10=VALUE)


for which I need to write a Pig script that stores it in the JSON format below:
{Message1:Message,date:2015-07-01,Time:22:58:53.66,E:E,machine:xxxxxxxxxx.xxx.xxxxx.xxx,
data:{key_1:value,key_2:value,key_3:value,key_3:value,key_3:value,key_5:value.....}
}

Can somebody help me write a custom loader?

I would really appreciate your help.

Thanks,

Re: how to write custom log loader and store in JSON format

Posted by Arvind S <ar...@gmail.com>.
I am not sure you need a custom loader. You could read each line as a
comma-separated string into individual fields, then convert the last field
into a map data type (perhaps using a UDF).
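
A minimal sketch of that parsing logic, written as plain Python rather than as an actual Pig UDF. The function names, the count of 12 fixed fields (taken from the sample line in the question), the lowercasing of keys, and the "last value wins" handling of repeated keys such as KEY_10 are all assumptions for illustration:

```python
def split_record(line, n_fixed=12):
    """Split one log line into its fixed fields and the raw key=value tail.

    n_fixed=12 is an assumption based on the sample line in the question.
    """
    body = line.strip()
    # The sample line is wrapped in parentheses; drop them if present.
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    # Split on the first n_fixed commas only, so the tail stays in one piece.
    parts = body.split(",", n_fixed)
    tail = parts[n_fixed] if len(parts) > n_fixed else ""
    return parts[:n_fixed], tail


def parse_kv_field(raw, pair_sep="&", kv_sep="="):
    """Turn 'key_1=value&KEY_2=1111' into {'key_1': 'value', 'key_2': '1111'}."""
    result = {}
    for pair in raw.split(pair_sep):
        key, sep, value = pair.partition(kv_sep)
        if not sep or not key.strip():
            continue  # skip empty fragments and fragments without '='
        # Assumption: keys are lowercased; a repeated key keeps its last value.
        result[key.strip().lower()] = value.strip()
    return result
```

Calling split_record on a line and then parse_kv_field on the returned tail yields the map that would populate the "data" element of the desired output.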

But if you still want to pursue a custom loader, you can probably take a hint
from
https://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/apachelog/

To write this out as JSON, use the Twitter elephant-bird libraries:
https://github.com/twitter/elephant-bird



*Cheers !!*
Arvind

On Sat, Jul 4, 2015 at 10:50 AM, Divya Gehlot <di...@gmail.com>
wrote:

> Hi,
> I am new to pig and I have a log file in below format
>
> (Message,NIL,2015-07-01,22:58:53.66,E,xxxxxxxxxx.xxx.xxxxx.xxx,12,0xd6,BIZ,Componentname,0,0.0,key_1=value&KEY_2=1111&KEY_3=VALUE&KEY_4=AU&KEY_5=COMPANY&KEY_6=VALUE&KEY_7=12222222&KEY_8=VALUE&KEY_9=VALUE&KEY_10=VALUE&KEY_10=VALUE)
>
>
> for which I need to write pig script and store in below JSON format
> {Message1:Message,date:2015-07-01,Time:22:58:53.66,E:E,machine
>
> :xxxxxxxxxx.xxx.xxxxx.xxx,data:{key_1:value,key_2:value,key_3:value,key_3:value,key_3:value,key_5:value.....}
> }
>
> Can somebody help me in writing custom loader .
>
> would really appreciate your help.
>
> thanks,
>

Re: how to write custom log loader and store in JSON format

Posted by James Bond <bo...@gmail.com>.
I am not sure about Pig, but it is easily achievable in MapReduce. We had a
similar requirement: we had to convert logs from the RFC 5424 syslog format
into JSON. We have an MR job that does this for us. We chose MR mainly for
error handling, such as missing fields in some records and removing some
blacklisted fields (like SSN), which we thought was easier to do in MR than
in Pig.
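
As a rough illustration of that kind of per-record error handling, here is a sketch in plain Python (not the Java MR job described above) applied to the log format from the original question. The field positions and output names follow the sample in the question; the function name, the blacklist contents, and the choice to return None for malformed lines are assumptions:

```python
import json

# Positions of the fixed fields to keep, with the names used in the
# desired output from the question.
FIELD_NAMES = {0: "Message1", 2: "date", 3: "Time", 4: "E", 5: "machine"}
N_FIXED = 12                    # fixed fields before the key=value tail (from the sample)
BLACKLIST = frozenset(["ssn"])  # hypothetical keys that must not be emitted


def record_to_json(line, blacklist=BLACKLIST):
    """Convert one log line to a JSON string, or return None if it is malformed."""
    body = line.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    parts = body.split(",", N_FIXED)
    if len(parts) <= N_FIXED:
        return None  # missing fields: let the caller count or log the bad record
    record = {name: parts[i] for i, name in FIELD_NAMES.items()}
    data = {}
    for pair in parts[N_FIXED].split("&"):
        key, sep, value = pair.partition("=")
        key = key.strip().lower()
        # Drop fragments without '=' and any blacklisted key.
        if sep and key and key not in blacklist:
            data[key] = value
    record["data"] = data
    return json.dumps(record, sort_keys=True)
```

A mapper could call record_to_json once per input line, emit the non-None results, and increment a counter for the rejected ones.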

Thanks,
Ashwin

On Sat, Jul 4, 2015 at 10:50 AM, Divya Gehlot <di...@gmail.com>
wrote:

> Hi,
> I am new to pig and I have a log file in below format
>
> (Message,NIL,2015-07-01,22:58:53.66,E,xxxxxxxxxx.xxx.xxxxx.xxx,12,0xd6,BIZ,Componentname,0,0.0,key_1=value&KEY_2=1111&KEY_3=VALUE&KEY_4=AU&KEY_5=COMPANY&KEY_6=VALUE&KEY_7=12222222&KEY_8=VALUE&KEY_9=VALUE&KEY_10=VALUE&KEY_10=VALUE)
>
>
> for which I need to write pig script and store in below JSON format
> {Message1:Message,date:2015-07-01,Time:22:58:53.66,E:E,machine
> :xxxxxxxxxx.xxx.xxxxx.xxx,data:{key_1:value,key_2:value,key_3:value,key_3:value,key_3:value,key_5:value.....}
> }
>
> Can somebody help me in writing custom loader .
>
> would really appreciate your help.
>
> thanks,
>
>
