Posted to dev@pig.apache.org by Alan Gates <ga...@yahoo-inc.com> on 2007/11/06 18:00:47 UTC

Re: PigTypesFunctionalSpec

How the file is read depends on the loader in addition to the spec given 
by the user. Custom loaders will be able to either make use of the spec 
given by the user or ignore it as they choose.

As for the default loader, PigStorage, we are in the situation where the 
vast majority of our data is string, and people are used to reading it 
that way. We also don't want Pig to force people to specify the data 
types to be able to read the data. So PigStorage will operate as you 
presume, reading data as text and coercing types. We will also 
definitely want storage functions to store and read data in native types 
to avoid the conversions. My assumption at this point is that we'll wait 
for Jute's generic serialization routines and use those to implement a 
loader that can handle native types.
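
[Illustrative note: the read-as-text-then-coerce behavior described above
can be sketched roughly as follows. This is a hypothetical Python sketch,
not Pig's implementation. The names load_as_text, coerce, and CASTS, the
set of type names, and the choice to return None when a field cannot be
converted are all assumptions made for illustration.]

```python
# Hypothetical sketch of "read as text, then coerce to the declared types".
# None of these names come from Pig; they exist only for illustration.

# Assumed mapping from declared type names to Python conversion functions.
CASTS = {
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "chararray": str,
}


def coerce(text, type_name):
    """Convert one text field to the declared type.

    Returning None on a failed conversion is an assumption of this sketch.
    """
    try:
        return CASTS[type_name](text)
    except ValueError:
        return None


def load_as_text(line, schema=None, delimiter="\t"):
    """Split a delimited line into text fields, then optionally coerce them.

    schema is a list of (field_name, type_name) pairs, standing in for the
    "as" clause. With no schema, fields are returned as plain strings, so
    the caller is never forced to declare types.
    """
    fields = line.rstrip("\n").split(delimiter)
    if schema is None:
        return fields
    # zip stops at the shorter sequence, so extra fields or extra schema
    # entries are silently dropped in this sketch.
    return [coerce(f, t) for f, (_, t) in zip(fields, schema)]
```

[With a schema, the text fields are converted after the read; without
one, they stay strings. A native-type loader would skip the conversion
step entirely.]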

I'll update the functional spec to clarify how loaders and type 
specifications interact.

Alan.

David (Ciemo) Ciemiewicz wrote:
>
> Alan,
>
> I just briefly reviewed http://wiki.apache.org/pig/PigTypesFunctionalSpec
>
> It wasn’t clear if the load statement used the types in the “as” 
> clause to read the file, or if it coerced the read data (text) into 
> the associated types.
>
> I’m assuming it is the latter. Is this the case?
>
> --- Ciemo
>

RE: PigTypesFunctionalSpec

Posted by "David (Ciemo) Ciemiewicz" <ci...@yahoo-inc.com>.
Thanks!

This makes a lot of sense -- splitting the mechanisms in this way.

-----Original Message-----
From: Alan Gates [mailto:gates@yahoo-inc.com] 
Sent: Tuesday, November 06, 2007 9:01 AM
To: David (Ciemo) Ciemiewicz
Cc: pig-dev@incubator.apache.org
Subject: Re: PigTypesFunctionalSpec

How the file is read depends on the loader in addition to the spec given 
by the user. Custom loaders will be able to either make use of the spec 
given by the user or ignore it as they choose.

As for the default loader, PigStorage, we are in the situation where the 
vast majority of our data is string, and people are used to reading it 
that way. We also don't want Pig to force people to specify the data 
types to be able to read the data. So PigStorage will operate as you 
presume, reading data as text and coercing types. We will also 
definitely want storage functions to store and read data in native types 
to avoid the conversions. My assumption at this point is that we'll wait 
for Jute's generic serialization routines and use those to implement a 
loader that can handle native types.

I'll update the functional spec to clarify how loaders and type 
specifications interact.

Alan.

David (Ciemo) Ciemiewicz wrote:
>
> Alan,
>
> I just briefly reviewed http://wiki.apache.org/pig/PigTypesFunctionalSpec
>
> It wasn't clear if the load statement used the types in the "as" 
> clause to read the file, or if it coerced the read data (text) into 
> the associated types.
>
> I'm assuming it is the latter. Is this the case?
>
> --- Ciemo
>