Posted to dev@pig.apache.org by Alan Gates <ga...@yahoo-inc.com> on 2007/11/06 18:00:47 UTC
Re: PigTypesFunctionalSpec
How the file is read depends on the loader in addition to the spec given
by the user. Custom loaders will be able to either make use of the spec
given by the user or ignore it as they choose.
As for the default loader, PigStorage, we are in the situation where the
vast majority of our data is string, and people are used to reading it
that way. We also don't want pig to force people to specify the data
types to be able to read the data. So PigStorage will operate as you
presume, reading data as text and coercing types. We will also
definitely want storage functions to store and read data in native types
to avoid the conversions. My assumption at this point is that we'll wait
for Jute's generic serialization routines and use those to implement a
loader that can handle native types.
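
To illustrate the read-as-text-then-coerce behavior described above, here is a minimal Python sketch. It is not PigStorage's implementation: the function name load_line, the type names in CONVERTERS, the tab delimiter default, and the choice to turn an uncoercible field into None are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type names and converters; not Pig's actual type system.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "int": int,
    "float": float,
    "string": str,
}

# A schema is a list of (field name, type name) pairs, standing in for
# the types a user might give in an "as" clause.
Schema = List[Tuple[str, str]]


def load_line(line: str, schema: Optional[Schema] = None,
              delimiter: str = "\t") -> list:
    """Read one record as text, then coerce fields if a schema was given."""
    # Step 1: always read the data as text.
    fields = line.rstrip("\n").split(delimiter)

    # With no schema the user gets plain strings; types are never required.
    if schema is None:
        return fields

    # Step 2: coerce each text field into its declared type.
    record = []
    for i, raw in enumerate(fields):
        if i >= len(schema):
            # Fields beyond the declared schema stay as text.
            record.append(raw)
            continue
        _name, type_name = schema[i]
        try:
            record.append(CONVERTERS[type_name](raw))
        except ValueError:
            # Assumption: a field that cannot be coerced becomes None.
            record.append(None)
    return record
```

Called without a schema, load_line("alice\t42") returns the two fields as strings; called with [("name", "string"), ("age", "int")], the second field comes back as an integer. A loader that stores native types would skip the second step entirely, which is the conversion cost mentioned above.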
I'll update the functional spec to clarify how loaders and type
specifications interact.
Alan.
David (Ciemo) Ciemiewicz wrote:
>
> Alan,
>
> I just briefly reviewed http://wiki.apache.org/pig/PigTypesFunctionalSpec
>
> It wasn’t clear if the load statement used the types in the “as”
> clause to read the file, or if it coerced the read data (text) into
> the associated types.
>
> I’m assuming it is the latter. Is this the case?
>
> --- Ciemo
>
RE: PigTypesFunctionalSpec
Posted by "David (Ciemo) Ciemiewicz" <ci...@yahoo-inc.com>.
Thanks!
This makes a lot of sense -- splitting the mechanisms in this way.
-----Original Message-----
From: Alan Gates [mailto:gates@yahoo-inc.com]
Sent: Tuesday, November 06, 2007 9:01 AM
To: David (Ciemo) Ciemiewicz
Cc: pig-dev@incubator.apache.org
Subject: Re: PigTypesFunctionalSpec
How the file is read depends on the loader in addition to the spec given
by the user. Custom loaders will be able to either make use of the spec
given by the user or ignore it as they choose.
As for the default loader, PigStorage, we are in the situation where the
vast majority of our data is string, and people are used to reading it
that way. We also don't want pig to force people to specify the data
types to be able to read the data. So PigStorage will operate as you
presume, reading data as text and coercing types. We will also
definitely want storage functions to store and read data in native types
to avoid the conversions. My assumption at this point is that we'll wait
for Jute's generic serialization routines and use those to implement a
loader that can handle native types.
I'll update the functional spec to clarify how loaders and type
specifications interact.
Alan.
David (Ciemo) Ciemiewicz wrote:
>
> Alan,
>
> I just briefly reviewed http://wiki.apache.org/pig/PigTypesFunctionalSpec
>
> It wasn't clear if the load statement used the types in the "as"
> clause to read the file, or if it coerced the read data (text) into
> the associated types.
>
> I'm assuming it is the latter. Is this the case?
>
> --- Ciemo
>