Posted to user@pig.apache.org by Sameer Tilak <sa...@gmail.com> on 2009/04/21 19:53:01 UTC

Question about myImageStorageFunc and PigStore

Hi everyone,

We're working on an image analysis project using Pig. I wrote my UDF,
myImageFilter. However, can someone please point me to information about the
UDF myImageStorageFunc? My images will be in a directory in HDFS. Does this
function need to implement reading image files from and writing them to HDFS?
Is there any existing functionality within Pig to do this, or do I need to
write my own? If the latter, is there any example code for this?

imagein = load '/myimages' using myImageStorageFunc();
imageop = foreach imagein generate myImageFilter(*);
store imageop into '/mythumbnails' using myImageStorageFunc();

I have a similar question about the PigStore function.

Regards,
--ST.

Re: Question about Pig BinaryStorage()

Posted by Alan Gates <ga...@yahoo-inc.com>.
Roger,

Santhosh is on vacation for a couple of weeks, so I'll try to help  
out, though I don't have all the context.  Do you want to pass the  
entire tuple to your UDF?  If so, then the syntax would be:

A = load 'images' using BinaryStorage() split by 'file';
B = foreach A generate edu.sdsc.pig.test.myUDF(*);

Then your UDF would get each tuple (record) of images one at a time.

Alan.
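
What the UDF does with each record is left open in this thread. As a rough sketch only, the per-image work might look like the helper below, which takes the raw bytes of one image and returns a smaller PNG. The class name ThumbnailSketch, the method thumbnail, and the maxSide parameter are hypothetical and not part of Pig; if the field arrives as Pig's DataByteArray, its get() method should provide the byte[] to pass in.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical helper: the image work a UDF such as myUDF might do per record. */
final class ThumbnailSketch {

    /**
     * Decodes one image from raw bytes, shrinks it so that its longer side
     * is at most maxSide pixels, and re-encodes the result as PNG.
     */
    static byte[] thumbnail(byte[] imageBytes, int maxSide) throws IOException {
        BufferedImage src = ImageIO.read(new ByteArrayInputStream(imageBytes));
        if (src == null) {
            // ImageIO.read returns null when no reader recognises the data.
            throw new IOException("bytes are not a decodable image");
        }
        int longer = Math.max(src.getWidth(), src.getHeight());
        double scale = Math.min(1.0, (double) maxSide / longer); // never enlarge
        int w = Math.max(1, (int) Math.round(src.getWidth() * scale));
        int h = Math.max(1, (int) Math.round(src.getHeight() * scale));

        // Nearest-neighbour sampling keeps the sketch short; a real filter
        // would probably interpolate.
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sx = Math.min(src.getWidth() - 1, (int) (x / scale));
                int sy = Math.min(src.getHeight() - 1, (int) (y / scale));
                dst.setRGB(x, y, src.getRGB(sx, sy));
            }
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(dst, "png", out);
        return out.toByteArray();
    }
}
```

Keeping the image logic in a plain static method like this means it can be tried outside Pig before it is wrapped in a UDF.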

On Apr 24, 2009, at 12:47 PM, Roger Unwin wrote:

>
> Santhosh,
>
> You were spot on. That got the files loading, and they streamed out  
> in binary from the dump.
>
> Now I am trying to work out how to pass the file contents  
> (bytearray?) from the load to a UDF.
>
> I can't find an example on the web to show me how to do this, can  
> you assist me one more time?
>
> Thanks,
>
> Roger
>
> REGISTER ./pigTest.jar;
>
> A = load 'images' using BinaryStorage() split by 'file';
> B = edu.sdsc.pig.test.myUDF(A);
>
> dump B;
>
>
> -------------------
>
>
> unwin@hadoop-n:~/pig-0.2.0$ java -cp pig.jar:/home/unwin/hadoop-0.18.3/conf org.apache.pig.Main  myScript5
> 2009-04-24 12:46:58,733 [main] INFO   
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
> Connecting to hadoop file system at: hdfs://hadoop-n:54310
> 2009-04-24 12:46:59,033 [main] INFO   
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
> Connecting to map-reduce job tracker at: hadoop-t:54311
> 2009-04-24 12:46:59,270 [main] ERROR  
> org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing.  
> Encountered " "." ". "" at line 5, column 8.
> Was expecting one of:
>    "as" ...
>    ";" ...
>
> Details at logfile: /home/unwin/pig-0.2.0/pig_1240577218499.log


Re: Question about Pig BinaryStorage()

Posted by Alan Gates <ga...@yahoo-inc.com>.
No, not in a way that's currently exposed to the UDF developer.  Pig does
know the file name, but it doesn't expose that information in the
interface.  You could hard code it in your script.

Alan.

On Apr 24, 2009, at 1:05 PM, Roger Unwin wrote:

> Santhosh,
>
> I forgot to add: is there a way to tell the name of the file that the
> binary data comes from?
>
> Thanks,
>
> Roger
>
>
>
> On Apr 24, 2009, at 12:47 PM, Roger Unwin <un...@sdsc.edu> wrote:
>
>>
>> Santhosh,
>>
>> You were spot on. That got the files loading, and they streamed out  
>> in binary from the dump.
>>
>> Now I am trying to work out how to pass the file contents  
>> (bytearray?) from the load to a UDF.
>>
>> I can't find an example on the web to show me how to do this, can  
>> you assist me one more time?
>>
>> Thanks,
>>
>> Roger
>>
>> REGISTER ./pigTest.jar;
>>
>> A = load 'images' using BinaryStorage() split by 'file';
>> B = edu.sdsc.pig.test.myUDF(A);
>>
>> dump B;
>>
>>
>> -------------------
>>
>>
>> unwin@hadoop-n:~/pig-0.2.0$ java -cp pig.jar:/home/unwin/hadoop-0.18.3/conf org.apache.pig.Main  myScript5
>> 2009-04-24 12:46:58,733 [main] INFO   
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
>> Connecting to hadoop file system at: hdfs://hadoop-n:54310
>> 2009-04-24 12:46:59,033 [main] INFO   
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
>> Connecting to map-reduce job tracker at: hadoop-t:54311
>> 2009-04-24 12:46:59,270 [main] ERROR  
>> org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during  
>> parsing. Encountered " "." ". "" at line 5, column 8.
>> Was expecting one of:
>>   "as" ...
>>   ";" ...
>>
>> Details at logfile: /home/unwin/pig-0.2.0/pig_1240577218499.log


Re: Question about Pig BinaryStorage()

Posted by Roger Unwin <un...@sdsc.edu>.
Santhosh,

I forgot to add: is there a way to tell the name of the file that the
binary data comes from?

Thanks,

Roger



On Apr 24, 2009, at 12:47 PM, Roger Unwin <un...@sdsc.edu> wrote:

>
> Santhosh,
>
> You were spot on. That got the files loading, and they streamed out  
> in binary from the dump.
>
> Now I am trying to work out how to pass the file contents  
> (bytearray?) from the load to a UDF.
>
> I can't find an example on the web to show me how to do this, can  
> you assist me one more time?
>
> Thanks,
>
> Roger
>
> REGISTER ./pigTest.jar;
>
> A = load 'images' using BinaryStorage() split by 'file';
> B = edu.sdsc.pig.test.myUDF(A);
>
> dump B;
>
>
> -------------------
>
>
> unwin@hadoop-n:~/pig-0.2.0$ java -cp pig.jar:/home/unwin/hadoop-0.18.3/conf org.apache.pig.Main  myScript5
> 2009-04-24 12:46:58,733 [main] INFO   
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
> Connecting to hadoop file system at: hdfs://hadoop-n:54310
> 2009-04-24 12:46:59,033 [main] INFO   
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
> Connecting to map-reduce job tracker at: hadoop-t:54311
> 2009-04-24 12:46:59,270 [main] ERROR  
> org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing.  
> Encountered " "." ". "" at line 5, column 8.
> Was expecting one of:
>    "as" ...
>    ";" ...
>
> Details at logfile: /home/unwin/pig-0.2.0/pig_1240577218499.log

Re: Question about Pig BinaryStorage()

Posted by Roger Unwin <un...@sdsc.edu>.
Santhosh,

You were spot on. That got the files loading, and they streamed out in  
binary from the dump.

Now I am trying to work out how to pass the file contents (bytearray?)  
from the load to a UDF.

I can't find an example on the web to show me how to do this, can you  
assist me one more time?

Thanks,

Roger

REGISTER ./pigTest.jar;

A = load 'images' using BinaryStorage() split by 'file';
B = edu.sdsc.pig.test.myUDF(A);

dump B;


-------------------


unwin@hadoop-n:~/pig-0.2.0$ java -cp pig.jar:/home/unwin/hadoop-0.18.3/conf org.apache.pig.Main  myScript5
2009-04-24 12:46:58,733 [main] INFO   
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
Connecting to hadoop file system at: hdfs://hadoop-n:54310
2009-04-24 12:46:59,033 [main] INFO   
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  
Connecting to map-reduce job tracker at: hadoop-t:54311
2009-04-24 12:46:59,270 [main] ERROR org.apache.pig.tools.grunt.Grunt  
- ERROR 1000: Error during parsing. Encountered " "." ". "" at line 5,  
column 8.
Was expecting one of:
     "as" ...
     ";" ...

Details at logfile: /home/unwin/pig-0.2.0/pig_1240577218499.log

RE: Question about Pig BinaryStorage()

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
Drop the -x local.

java -cp pig.jar:/home/unwin/hadoop-0.19.1/conf org.apache.pig.Main myScript5

-----Original Message-----
From: Roger Unwin [mailto:unwin@sdsc.edu] 
Sent: Thursday, April 23, 2009 2:30 PM
To: Santhosh Srinivasan
Cc: pig-user@hadoop.apache.org
Subject: Question about Pig BinaryStorage()

Santhosh,

I am trying to iterate through a group of binary files.  I would like  
the reduce job to get 1 binary file each.  Below is the first part of  
it, trying to read the data in.

I have the following script:

images = load 'images' using BinaryStorage() split by 'file';

dump images;

Here is my invocation:
java -cp pig.jar:/home/unwin/hadoop-0.19.1/conf org.apache.pig.Main -x local myScript5
2009-04-23 14:22:38,669 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore -
Received error from storer function:
org.apache.pig.backend.executionengine.ExecException: ERROR 2081:
Unable to setup the load function.
2009-04-23 14:22:38,673 [main] INFO   
org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed  
jobs!!
2009-04-23 14:22:38,674 [main] INFO   
org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out  
of 1 failed!
2009-04-23 14:22:38,678 [main] ERROR org.apache.pig.tools.grunt.Grunt  
- ERROR 1066: Unable to open iterator for alias images

Here is where the files are in hadoop:
unwin@hadoop-n:~/pig-0.2.0$ ../hadoop-0.18.3/bin/hadoop dfs -ls 'images'
Found 10 items
-rw-r--r--   2 unwin supergroup     272449 2009-04-22 11:04 /user/unwin/images/IMG_0010.JPG
-rw-r--r--   2 unwin supergroup     267580 2009-04-22 11:04 /user/unwin/images/IMG_0011.JPG
-rw-r--r--   2 unwin supergroup     378000 2009-04-22 11:04 /user/unwin/images/IMG_0012.JPG
-rw-r--r--   2 unwin supergroup     327829 2009-04-22 11:04 /user/unwin/images/IMG_0013.JPG
-rw-r--r--   2 unwin supergroup     476088 2009-04-22 11:04 /user/unwin/images/IMG_0014.JPG
-rw-r--r--   2 unwin supergroup     357258 2009-04-22 11:04 /user/unwin/images/IMG_0015.JPG
-rw-r--r--   2 unwin supergroup     401496 2009-04-22 11:04 /user/unwin/images/IMG_0016.JPG
-rw-r--r--   2 unwin supergroup     377798 2009-04-22 11:04 /user/unwin/images/IMG_0017.JPG
-rw-r--r--   2 unwin supergroup     466437 2009-04-22 11:04 /user/unwin/images/IMG_0018.JPG
-rw-r--r--   2 unwin supergroup     351952 2009-04-22 11:04 /user/unwin/images/IMG_0019.JPG

Do you see anything obviously wrong, or a better way of iterating?

Thanks,

Roger

Question about Pig BinaryStorage()

Posted by Roger Unwin <un...@sdsc.edu>.
Santhosh,

I am trying to iterate through a group of binary files.  I would like  
the reduce job to get 1 binary file each.  Below is the first part of  
it, trying to read the data in.

I have the following script:

images = load 'images' using BinaryStorage() split by 'file';

dump images;

Here is my invocation:
java -cp pig.jar:/home/unwin/hadoop-0.19.1/conf org.apache.pig.Main -x local myScript5
2009-04-23 14:22:38,669 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore -
Received error from storer function:
org.apache.pig.backend.executionengine.ExecException: ERROR 2081:
Unable to setup the load function.
2009-04-23 14:22:38,673 [main] INFO   
org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed  
jobs!!
2009-04-23 14:22:38,674 [main] INFO   
org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out  
of 1 failed!
2009-04-23 14:22:38,678 [main] ERROR org.apache.pig.tools.grunt.Grunt  
- ERROR 1066: Unable to open iterator for alias images

Here is where the files are in hadoop:
unwin@hadoop-n:~/pig-0.2.0$ ../hadoop-0.18.3/bin/hadoop dfs -ls 'images'
Found 10 items
-rw-r--r--   2 unwin supergroup     272449 2009-04-22 11:04 /user/unwin/images/IMG_0010.JPG
-rw-r--r--   2 unwin supergroup     267580 2009-04-22 11:04 /user/unwin/images/IMG_0011.JPG
-rw-r--r--   2 unwin supergroup     378000 2009-04-22 11:04 /user/unwin/images/IMG_0012.JPG
-rw-r--r--   2 unwin supergroup     327829 2009-04-22 11:04 /user/unwin/images/IMG_0013.JPG
-rw-r--r--   2 unwin supergroup     476088 2009-04-22 11:04 /user/unwin/images/IMG_0014.JPG
-rw-r--r--   2 unwin supergroup     357258 2009-04-22 11:04 /user/unwin/images/IMG_0015.JPG
-rw-r--r--   2 unwin supergroup     401496 2009-04-22 11:04 /user/unwin/images/IMG_0016.JPG
-rw-r--r--   2 unwin supergroup     377798 2009-04-22 11:04 /user/unwin/images/IMG_0017.JPG
-rw-r--r--   2 unwin supergroup     466437 2009-04-22 11:04 /user/unwin/images/IMG_0018.JPG
-rw-r--r--   2 unwin supergroup     351952 2009-04-22 11:04 /user/unwin/images/IMG_0019.JPG

Do you see anything obviously wrong, or a better way of iterating?

Thanks,

Roger

Re: Question about myImageStorageFunc and PigStore

Posted by Sameer Tilak <sa...@gmail.com>.
Hi Santhosh,

Thanks for the info. I'll let you know if I've any further questions.

On Tue, Apr 21, 2009 at 11:47 AM, Santhosh Srinivasan <sm...@yahoo-inc.com> wrote:

>  Sameer,
>
> You can find the documentation for writing User Defined Functions (UDFs) at
> the following location: http://hadoop.apache.org/pig/docs/r0.2.0/udf.html
> In particular, the load and store function documentation is at:
>
> http://hadoop.apache.org/pig/docs/r0.2.0/udf.html#Load%2FStore+Functions
>
>
> Let us know if you have more questions.
>
> Thanks,
> Santhosh
>
>  ------------------------------
> *From:* Sameer Tilak [mailto:sameer.ucsd@gmail.com]
> *Sent:* Tuesday, April 21, 2009 11:43 AM
> *To:* Santhosh Srinivasan
> *Cc:* pig-user@hadoop.apache.org
> *Subject:* Re: Question about myImageStorageFunc and PigStore
>
> Santhosh,
>
> Thanks for your reply.  I was planning to use the following code to read
> an individual image.
>
> BufferedImage img = null;
> try {
>     img = ImageIO.read(new File(filename));
> } catch (IOException e) {
> }
>
> However, if I remember correctly, the load statement doesn't actually load
> the data; it just sets up a handle. So in the first statement:
>
> imagein = load '/myimages' using myImageStorageFunc();
>
> myImageStorageFunc can't load the files, correct? Or does myImageStorageFunc
> need to load all the files in the myimages directory, so that imagein is
> basically an array of file handles?
>
> If this is correct, will myImageFilter take the file handles one by one and
> then work directly on those open files?
>
> imageop = foreach imagein generate myImageFilter(*);
>
>
> Many thanks.
>
>
> On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan <sm...@yahoo-inc.com> wrote:
>
>> Sameer,
>>
>> You need to write your own UDF to read and write image files to the file
>> system.
>> Have a look at the following built-in load and store functions supported
>> by Pig:
>>
>> PigStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
>> BinStorage:
>> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log
>>
>> PigStorage handles read/write of UTF-8 text data.
>> BinStorage handles read/write of binary data. The binary data format is
>> internal to Pig.
>>
>> Thanks,
>> Santhosh
>>
>> -----Original Message-----
>> From: Sameer Tilak [mailto:sameer.ucsd@gmail.com]
>> Sent: Tuesday, April 21, 2009 10:53 AM
>> To: pig-user@hadoop.apache.org
>> Subject: Question about myImageStorageFunc and PigStore
>>
>> Hi everyone,
>>
>> We're working on an image analysis project using Pig. I wrote my UDF,
>> myImageFilter. However, can someone please point me to information about
>> the UDF myImageStorageFunc? My images will be in a directory in HDFS. Does
>> this function need to implement reading image files from and writing them
>> to HDFS? Is there any existing functionality within Pig to do this, or do
>> I need to write my own? If the latter, is there any example code for this?
>>
>> imagein = load '/myimages' using myImageStorageFunc();
>> imageop = foreach imagein generate myImageFilter(*);
>> store imageop into '/mythumbnails' using myImageStorageFunc();
>>
>> I have a similar question about the PigStore function.
>>
>> Regards,
>> --ST.
>>
>
>

RE: Question about myImageStorageFunc and PigStore

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
Sameer,
 
You can find the documentation for writing User Defined Functions (UDFs)
at the following location:
http://hadoop.apache.org/pig/docs/r0.2.0/udf.html
In particular, the load and store function documentation is at:
 
http://hadoop.apache.org/pig/docs/r0.2.0/udf.html#Load%2FStore+Functions
 
 
Let us know if you have more questions.
 
Thanks,
Santhosh
 

________________________________

From: Sameer Tilak [mailto:sameer.ucsd@gmail.com] 
Sent: Tuesday, April 21, 2009 11:43 AM
To: Santhosh Srinivasan
Cc: pig-user@hadoop.apache.org
Subject: Re: Question about myImageStorageFunc and PigStore


Santhosh,

Thanks for your reply.  I was planning to use the following code to read
an individual image.

BufferedImage img = null;
try {
    img = ImageIO.read(new File(filename));
} catch (IOException e) {
}

However, if I remember correctly, the load statement doesn't actually load
the data; it just sets up a handle. So in the first statement:

imagein = load '/myimages' using myImageStorageFunc();

myImageStorageFunc can't load the files, correct? Or does
myImageStorageFunc need to load all the files in the myimages directory, so
that imagein is basically an array of file handles?

If this is correct, will myImageFilter take the file handles one by one and
then work directly on those open files?

imageop = foreach imagein generate myImageFilter(*);


Many thanks.



On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan
<sm...@yahoo-inc.com> wrote:


	Sameer,
	
	You need to write your own UDF to read and write image files to
	the file system.
	Have a look at the following built-in load and store functions
	supported by Pig:
	
	PigStorage:
	http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
	BinStorage:
	http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log
	
	PigStorage handles read/write of UTF-8 text data.
	BinStorage handles read/write of binary data. The binary data
	format is internal to Pig.
	
	Thanks,
	Santhosh
	

	-----Original Message-----
	From: Sameer Tilak [mailto:sameer.ucsd@gmail.com]
	Sent: Tuesday, April 21, 2009 10:53 AM
	To: pig-user@hadoop.apache.org
	Subject: Question about myImageStorageFunc and PigStore
	
	Hi everyone,
	
	We're working on an image analysis project using Pig. I wrote my
	UDF, myImageFilter. However, can someone please point me to
	information about the UDF myImageStorageFunc? My images will be in
	a directory in HDFS. Does this function need to implement reading
	image files from and writing them to HDFS? Is there any existing
	functionality within Pig to do this, or do I need to write my own?
	If the latter, is there any example code for this?
	
	imagein = load '/myimages' using myImageStorageFunc();
	imageop = foreach imagein generate myImageFilter(*);
	store imageop into '/mythumbnails' using myImageStorageFunc();
	
	I have a similar question about the PigStore function.
	
	Regards,
	--ST.
	



Re: Question about myImageStorageFunc and PigStore

Posted by Sameer Tilak <sa...@gmail.com>.
Santhosh,

Thanks for your reply.  I was planning to use the following code to read
an individual image.

BufferedImage img = null;
try {
    img = ImageIO.read(new File(filename));
} catch (IOException e) {
}
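
One point about the snippet above: new File(filename) is resolved against the local file system of whichever machine runs the code, so it may not find files that live in HDFS, and the empty catch block hides any failure. A minimal sketch of an alternative, assuming the storage function is handed an InputStream over each image (how Pig supplies that stream is not covered in this thread, and the names ImageStreamReader, readImage, and label are hypothetical):

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

/** Hypothetical helper for a storage function that is given a stream, not a local path. */
final class ImageStreamReader {

    /**
     * Decodes a single image from the stream. The label (for example the
     * source file name) is only used to make the error message useful.
     */
    static BufferedImage readImage(InputStream in, String label) throws IOException {
        BufferedImage img = ImageIO.read(in);
        if (img == null) {
            // ImageIO.read returns null when no registered reader understands
            // the data, so report that instead of passing null downstream.
            throw new IOException("not a decodable image: " + label);
        }
        return img;
    }
}
```

Letting the IOException propagate, rather than catching and ignoring it, means a corrupt or unreadable file shows up as a failed record instead of a null image.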

However, if I remember correctly, the load statement doesn't actually load
the data; it just sets up a handle. So in the first statement:

imagein = load '/myimages' using myImageStorageFunc();

myImageStorageFunc can't load the files, correct? Or does myImageStorageFunc
need to load all the files in the myimages directory, so that imagein is
basically an array of file handles?

If this is correct, will myImageFilter take the file handles one by one and
then work directly on those open files?

imageop = foreach imagein generate myImageFilter(*);


Many thanks.


On Tue, Apr 21, 2009 at 11:28 AM, Santhosh Srinivasan <sm...@yahoo-inc.com> wrote:

> Sameer,
>
> You need to write your own UDF to read and write image files to the file
> system.
> Have a look at the following built-in load and store functions supported
> by Pig:
>
> PigStorage:
> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
> BinStorage:
> http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log
>
> PigStorage handles read/write of UTF-8 text data.
> BinStorage handles read/write of binary data. The binary data format is
> internal to Pig.
>
> Thanks,
> Santhosh
>
> -----Original Message-----
> From: Sameer Tilak [mailto:sameer.ucsd@gmail.com]
> Sent: Tuesday, April 21, 2009 10:53 AM
> To: pig-user@hadoop.apache.org
> Subject: Question about myImageStorageFunc and PigStore
>
> Hi everyone,
>
> We're working on an image analysis project using Pig. I wrote my UDF,
> myImageFilter. However, can someone please point me to information about
> the UDF myImageStorageFunc? My images will be in a directory in HDFS. Does
> this function need to implement reading image files from and writing them
> to HDFS? Is there any existing functionality within Pig to do this, or do
> I need to write my own? If the latter, is there any example code for this?
>
> imagein = load '/myimages' using myImageStorageFunc();
> imageop = foreach imagein generate myImageFilter(*);
> store imageop into '/mythumbnails' using myImageStorageFunc();
>
> I have a similar question about the PigStore function.
>
> Regards,
> --ST.
>

RE: Question about myImageStorageFunc and PigStore

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
Sameer,

You need to write your own UDF to read and write image files to the file
system.
Have a look at the following built-in load and store functions supported
by Pig:

PigStorage:
http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/PigStorage.java?view=log
BinStorage:
http://svn.apache.org/viewvc/hadoop/pig/trunk/src/org/apache/pig/builtin/BinStorage.java?view=log

PigStorage handles read/write of UTF-8 text data.
BinStorage handles read/write of binary data. The binary data format is
internal to Pig.

Thanks,
Santhosh 

-----Original Message-----
From: Sameer Tilak [mailto:sameer.ucsd@gmail.com] 
Sent: Tuesday, April 21, 2009 10:53 AM
To: pig-user@hadoop.apache.org
Subject: Question about myImageStorageFunc and PigStore

Hi everyone,

We're working on an image analysis project using Pig. I wrote my UDF,
myImageFilter. However, can someone please point me to information about the
UDF myImageStorageFunc? My images will be in a directory in HDFS. Does this
function need to implement reading image files from and writing them to HDFS?
Is there any existing functionality within Pig to do this, or do I need to
write my own? If the latter, is there any example code for this?

imagein = load '/myimages' using myImageStorageFunc();
imageop = foreach imagein generate myImageFilter(*);
store imageop into '/mythumbnails' using myImageStorageFunc();

I have a similar question about the PigStore function.

Regards,
--ST.