Posted to general@hadoop.apache.org by Cornelio Iñigo <co...@gmail.com> on 2010/11/15 21:52:22 UTC

hadoop application to pig?

Hi

My name is Cornelio Iñigo and I'm a developer just getting started with
Hadoop and Pig.
I have a question about developing an application in Pig. I already have my
program running on Hadoop. It reads a single column from a dataset (a CSV file)
and processes that data with some functions (such as language analysis and
content analysis).
Note that when processing the file I don't use FILTER, COUNT, or any other
built-in function of Pig, so I think all the functions would have to be User
Defined Functions.

So, is it a good idea (does it make sense) to develop this program in Pig?

Thanks in advance

-- 
*Cornelio*

Re: hadoop application to pig?

Posted by Ian Holsman <ha...@holsman.net>.

I think it comes down to whether you can translate your existing jobs into UDFs and custom steps that can run inside a Pig script, and how much effort that will take.

Writing Pig scripts is much easier than writing MapReduce ones, in my experience.
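
How much effort the translation takes depends largely on whether the per-record analysis is already separated from the job plumbing. The Python sketch below is only an illustration under assumptions, not the original program: analyze_text and process_column are hypothetical names, and the word and character count merely stands in for the real language and content analysis.

```python
import csv
import io


def analyze_text(text):
    """Hypothetical stand-in for the real language/content analysis."""
    words = text.split()
    return {"words": len(words), "chars": len(text)}


def process_column(lines, column_index, func):
    """Apply func to one column of every CSV row.

    Rows that are too short to contain the column are skipped.
    """
    results = []
    for row in csv.reader(lines):
        if len(row) > column_index:
            results.append(func(row[column_index]))
    return results


if __name__ == "__main__":
    # Made-up sample data: id, text, language code.
    sample = io.StringIO('1,hello world,en\n2,"hola, mundo",es\n')
    for result in process_column(sample, 1, analyze_text):
        print(result)
```

If the analysis already lives in a self-contained function like analyze_text, wrapping it as a UDF should mostly be a matter of adapting the call signature; if it is entangled with mapper or reducer code, expect the port to take more work.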

---
Ian Holsman - 703 879-3128

I saw the angel in the marble and carved until I set him free -- Michelangelo

On 16/11/2010, at 10:46 AM, David Gruzman <dg...@gmail.com> wrote:

> I am developing a similar application using vanilla MapReduce jobs. I
> think that any specific processing
> can be developed using MapReduce and will probably be more efficient.
> Higher-level services like Pig or Hive should be used if you need to issue
> "ad hoc" queries against the data.
> With best regards,
> David

Re: hadoop application to pig?

Posted by David Gruzman <dg...@gmail.com>.

I am developing a similar application using vanilla MapReduce jobs. I
think that any specific processing
can be developed using MapReduce and will probably be more efficient.
Higher-level services like Pig or Hive should be used if you need to issue
"ad hoc" queries against the data.
With best regards,
David
