Posted to user@pig.apache.org by Niels Basjes <Ni...@basjes.nl> on 2015/01/26 10:02:29 UTC
Example of a Loader returning a typed map
Hi,
I'm working on a custom Loader and for one of the use cases I want to be
able to return a typed map: essentially a map where each value has a
specific schema (tuple/bag).
I.e. something like this:
http://pig.apache.org/docs/r0.13.0/basic.html#map-schema
b = stream a through `cat` as (m:map[{(i:int,j:chararray)}]); -- Map value is bag
Possibly I will need a deeper nesting level (i.e. tuple in a bag in a bag).
What I am looking for is preferably an existing open source Loader that does
this, so I can see which API methods must be called where to achieve this
correctly.
Thanks.
--
Best regards
Niels Basjes
Re: Example of a Loader returning a typed map
Posted by Niels Basjes <Ni...@basjes.nl>.
Hi,
My mistake for copying a streaming example. Sorry. I'm not doing streaming.
I have a working Loader written in Java and I want to return a 'better'
schema for the situation at hand and I'm in need of some example to look
at.
I already have this Loader, which implements the LoadMetadata and LoadPushDown interfaces:
https://github.com/nielsbasjes/logparser/blob/master/httpdlog/httpdlog-pigloader/src/main/java/nl/basjes/pig/input/apachehttpdlog/Loader.java#L364
For some situations (in my case: parsing the cookies that are set by
the server) I want to be able to return a map with the key being the name
of the cookie and the value being a record containing all the elements
present when the cookie was set (i.e. value, path, expires, domain).
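To make the intended shape concrete, here is a minimal plain-Java sketch of the data only, with no Pig classes involved. The class name CookieMapSketch, the method parseSetCookie, and the simplified Set-Cookie parsing are all hypothetical and are not taken from the Loader linked above. In an actual Loader the inner record would presumably be a Pig Tuple whose field order matches the schema, rather than a nested java.util.Map.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration: cookie name -> record(value, path, expires, domain).
final class CookieMapSketch {
    // Fixed field order, mirroring what a tuple schema would declare.
    private static final String[] FIELDS = {"value", "path", "expires", "domain"};

    // Parses a single Set-Cookie header value (simplified, no quoting rules).
    static Map<String, Map<String, String>> parseSetCookie(String header) {
        Map<String, Map<String, String>> cookies = new LinkedHashMap<>();
        String[] parts = header.split(";");
        if (parts.length == 0) {
            return cookies;
        }
        // The first part is always name=value.
        String[] nameValue = parts[0].trim().split("=", 2);
        if (nameValue.length < 2 || nameValue[0].isEmpty()) {
            return cookies;
        }
        // Every record has the same fields; absent attributes stay null.
        Map<String, String> record = new LinkedHashMap<>();
        for (String field : FIELDS) {
            record.put(field, null);
        }
        record.put("value", nameValue[1]);
        // Remaining parts are attributes such as Path=/ or Domain=example.com.
        for (int i = 1; i < parts.length; i++) {
            String[] attr = parts[i].trim().split("=", 2);
            String key = attr[0].trim().toLowerCase(Locale.ROOT);
            if (attr.length == 2 && !key.equals("value") && record.containsKey(key)) {
                record.put(key, attr[1].trim());
            }
        }
        cookies.put(nameValue[0], record);
        return cookies;
    }
}
```

Calling CookieMapSketch.parseSetCookie("sid=abc123; Path=/; Domain=example.com") yields one entry keyed "sid" whose record carries the four fields, with expires left null. Keeping the field set fixed is a deliberate choice here so that each record lines up with a single declared tuple schema.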
So far I have not yet found an example on how to pull that off correctly in
both the getSchema (i.e. how do I fill the ResourceSchema) and the getNext
(i.e. how do I fill the Tuple).
In addition an example that shows this in combination with having a
pushProjection that says it only wants a specific field in the record in
the map would be really great.
Has anyone done something like this before (in Java)?
Niels Basjes
On Fri, Jan 30, 2015 at 7:52 PM, Daniel Dai <da...@hortonworks.com> wrote:
> Does a streaming Python UDF
> (http://pig.apache.org/docs/r0.14.0/udf.html#python-udfs) work? It uses
> streaming plus boilerplate code to convert stdin/stdout to Python data
> according to the schema.
>
> Daniel
--
Best regards / Met vriendelijke groeten,
Niels Basjes
Re: Example of a Loader returning a typed map
Posted by Daniel Dai <da...@hortonworks.com>.
Does a streaming Python UDF
(http://pig.apache.org/docs/r0.14.0/udf.html#python-udfs) work? It uses
streaming plus boilerplate code to convert stdin/stdout to Python data
according to the schema.
Daniel