Posted to user@pig.apache.org by Niels Basjes <Ni...@basjes.nl> on 2015/01/26 10:02:29 UTC

Example of a Loader returning a typed map

Hi,

I'm working on a custom Loader and for one of the use cases I want to be
able to return a typed map. So essentially a map where each element has a
specific schema (tuple/bag).

I.e. something like this:
    http://pig.apache.org/docs/r0.13.0/basic.html#map-schema
    b = stream a through `cat` as (m:map[{(i:int,j:chararray)}]); -- Map value is bag

Possibly I will need a deeper nesting level (i.e. tuple in a bag in a bag).

What I am looking for is preferably an existing open source Loader that does
this so I can see which API methods must be called where to achieve this
correctly.

Thanks.

-- 
Best regards

Niels Basjes

Re: Example of a Loader returning a typed map

Posted by Niels Basjes <Ni...@basjes.nl>.
Hi,

My mistake for copying a streaming example. Sorry. I'm not doing streaming.
I have a working Loader written in Java and I want to return a 'better'
schema for the situation at hand and I'm in need of some example to look
at.

I already have this Loader, which implements LoadMetadata and LoadPushDown:
https://github.com/nielsbasjes/logparser/blob/master/httpdlog/httpdlog-pigloader/src/main/java/nl/basjes/pig/input/apachehttpdlog/Loader.java#L364

For some situations (in my case: parsing the cookies that are set by
the server) I want to be able to return a map with the key being the name
of the cookie and the value being a record containing all the elements
present when the cookie was set (i.e. value, path, expires, domain).
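
A minimal sketch of the data shape described above, using only java.util
types. Everything named here (CookieMapSketch, FIELDS, addCookie, the sample
headers) is hypothetical and for illustration only; it is not taken from the
linked Loader and it does not use the Pig API. The List stands in for the
per-cookie record, which in an actual Loader would presumably be a Pig Tuple
whose field order matches the schema declared for the map value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CookieMapSketch {
    // Assumed fixed field order of the per-cookie record. A schema for the
    // map value would have to list its fields in this same order.
    static final List<String> FIELDS =
            Arrays.asList("value", "path", "expires", "domain");

    // Parses one Set-Cookie style string and stores it in the map as
    // cookieName -> [value, path, expires, domain].
    // Attributes that are absent stay null; unknown attributes are ignored.
    static void addCookie(Map<String, Object> cookies, String header) {
        String[] parts = header.split(";");
        String[] nameValue = parts[0].trim().split("=", 2);

        // Pre-fill one null slot per field so positions are always stable.
        List<Object> record =
                new ArrayList<>(Collections.<Object>nCopies(FIELDS.size(), null));
        record.set(0, nameValue.length > 1 ? nameValue[1] : "");

        for (int i = 1; i < parts.length; i++) {
            String[] attr = parts[i].trim().split("=", 2);
            int slot = FIELDS.indexOf(attr[0].toLowerCase(Locale.ROOT));
            // slot 0 is the cookie value itself, so only slots > 0 are attributes.
            if (slot > 0 && attr.length > 1) {
                record.set(slot, attr[1]);
            }
        }
        cookies.put(nameValue[0], record);
    }

    public static void main(String[] args) {
        Map<String, Object> cookies = new LinkedHashMap<>();
        addCookie(cookies, "sid=abc123; Path=/; Domain=example.com");
        addCookie(cookies, "lang=nl; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
        System.out.println(cookies);
    }
}
```

Keeping every record the same length, with nulls for missing attributes, is
a design choice made here so that a field can be addressed by position
regardless of which attributes a given cookie carried.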

So far I have not yet found an example on how to pull that off correctly in
both the getSchema (i.e. how do I fill the ResourceSchema) and the getNext
(i.e. how do I fill the Tuple).
In addition, an example that shows this in combination with a
pushProjection that requests only a specific field of the record in
the map would be really great.

Has anyone done something like this before (in Java)?

Niels Basjes

On Fri, Jan 30, 2015 at 7:52 PM, Daniel Dai <da...@hortonworks.com> wrote:

> Does a streaming Python UDF
> (http://pig.apache.org/docs/r0.14.0/udf.html#python-udfs) work? It uses
> streaming plus boilerplate code to convert stdin/stdout to Python data
> according to the schema.
>
> Daniel

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Example of a Loader returning a typed map

Posted by Daniel Dai <da...@hortonworks.com>.
Does a streaming Python UDF
(http://pig.apache.org/docs/r0.14.0/udf.html#python-udfs) work? It uses
streaming plus boilerplate code to convert stdin/stdout to Python data
according to the schema.

Daniel
