Posted to hdfs-user@hadoop.apache.org by Mohammad Tariq <do...@gmail.com> on 2012/12/03 13:05:48 UTC

Calling C inside MR

Hello list,

          I have a tool (written in C) that performs several different types
of operations and can be used as a command-line utility. I had to write a
similar tool, as we have moved to the Hadoop platform for most
things.

Until now I have taken this tool as a reference and written MR jobs
corresponding to some of the modules of this tool, and they are working fine.
But I am spending a lot of time on this. So, I just wanted to ask whether it is
possible to call this tool from an MR job, somewhat like a JNI kind of
thing. (I hope it is; otherwise I have to write the rest from scratch,
and we are running out of time.)

Many thanks.

Regards,
    Mohammad Tariq

Re: Calling C inside MR

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you for the help.

Regards,
    Mohammad Tariq



On Mon, Dec 3, 2012 at 7:31 PM, Simone Leo <si...@crs4.it> wrote:

> On 12/03/2012 02:52 PM, Mohammad Tariq wrote:
>
>> I did not get why did you point out to the API page
>>
>
> Because the package description contains an introduction to what it is and
> how it is used. I added the wiki page as the next step since it shows a
> sample program.
>
> Simone
>
> --
> Simone Leo
> Data Fusion - Distributed Computing
> CRS4
> POLARIS - Building #1
> Piscina Manna
> I-09010 Pula (CA) - Italy
> e-mail: simone.leo@crs4.it
> http://www.crs4.it
>

Re: Calling C inside MR

Posted by Simone Leo <si...@crs4.it>.
On 12/03/2012 02:52 PM, Mohammad Tariq wrote:
> I did not get why did you point out to the API page

Because the package description contains an introduction to what it is 
and how it is used. I added the wiki page as the next step since it 
shows a sample program.

Simone
-- 
Simone Leo
Data Fusion - Distributed Computing
CRS4
POLARIS - Building #1
Piscina Manna
I-09010 Pula (CA) - Italy
e-mail: simone.leo@crs4.it
http://www.crs4.it

Re: Calling C inside MR

Posted by Brock Noland <br...@cloudera.com>.
Hi,

Here is an example of how to call native methods from Java in an MR context:

https://github.com/brockn/hadoop-thumbnail

The most important item, IMHO, is that you have a clear separation of
concerns, meaning that you can test the C code without Java and test
the C+Java code without MapReduce.
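
One way to picture that separation, as a minimal sketch rather than the
layout of the linked repository: keep the native call behind a small Java
interface, so the map logic can be tested with a pure-Java stand-in and
without Hadoop on the classpath. Every name here (RecordProcessor,
NativeRecordProcessor, MapLogic, the "legacytool" library and processNative)
is hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Narrow seam between the job logic and whatever does the real work. */
interface RecordProcessor {
    String process(String record);
}

/** JNI-backed implementation; "legacytool" and processNative are hypothetical names. */
class NativeRecordProcessor implements RecordProcessor {
    static {
        // Runs only when this class is first used; looks up the native
        // library on java.library.path.
        System.loadLibrary("legacytool");
    }

    // Implemented in C on the other side of the JNI boundary.
    private native String processNative(String record);

    @Override
    public String process(String record) {
        return processNative(record);
    }
}

/** The logic a Mapper's map() could delegate to; no Hadoop types are needed to test it. */
class MapLogic {
    private final RecordProcessor processor;

    MapLogic(RecordProcessor processor) {
        this.processor = processor;
    }

    void map(String key, String value, BiConsumer<String, String> emit) {
        emit.accept(key, processor.process(value));
    }
}

class SeparationDemo {
    public static void main(String[] args) {
        // A pure-Java stand-in exercises MapLogic without the native library.
        MapLogic logic = new MapLogic(record -> record.trim().toUpperCase());
        List<String> emitted = new ArrayList<>();
        logic.map("k1", "  some record ", (k, v) -> emitted.add(k + "\t" + v));
        System.out.println(emitted);
    }
}
```

With this shape, the C code can be tested on its own, NativeRecordProcessor
can be tested from plain Java once the library is built, and a real Mapper
only needs to forward its arguments to MapLogic.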

Brock

On Mon, Dec 3, 2012 at 7:52 AM, Mohammad Tariq <do...@gmail.com> wrote:
> Hello Simone,
>
>      Thank you so much for the pointers. I am actually looking for some way
> using which this can be achieved without Streaming or Pipes (If possible at
> all).
>
> And I did not get why did you point out to the API page. Please let me know
> if you know about something that I could relate. Apologies for my ignorance.
>
> Many thanks.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Dec 3, 2012 at 7:11 PM, Simone Leo <si...@crs4.it> wrote:
>>
>> You can use Hadoop Pipes:
>>
>> http://hadoop.apache.org/docs/stable/api/index.html
>> http://wiki.apache.org/hadoop/C%2B%2BWordCount
>>
>> Simone
>>
>>
>> On 12/03/2012 01:05 PM, Mohammad Tariq wrote:
>>>
>>> Hello list,
>>>
>>>            I have a tool (written in C) that performs some different
>>> types of operations and can be used as a command line utility. I had to
>>> write a similar tool, as we have moved towards Hadoop platform for most
>>> of the things.
>>>
>>> Till now I have taken this tool as reference  and written MR jobs
>>> corresponding to some the modules of this tool and they are working
>>> fine. But I am wasting a lot of time in this. So, I just wanted to ask
>>> if it is possible to call this tool through a MR job?? Somewhat like JNI
>>> kinda thing. (I hope it is, otherwise I have to write rest of things
>>> from scratch and we are running out of time).
>>>
>>> Many thanks.
>>>
>>> Regards,
>>>      Mohammad Tariq
>>>
>>
>> --
>> Simone Leo
>> Data Fusion - Distributed Computing
>> CRS4
>> POLARIS - Building #1
>> Piscina Manna
>> I-09010 Pula (CA) - Italy
>> e-mail: simone.leo@crs4.it
>> http://www.crs4.it
>
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/

Re: Calling C inside MR

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Simone,

     Thank you so much for the pointers. I am actually looking for a way
to achieve this without Streaming or Pipes (if that is possible at
all).

And I did not understand why you pointed to the API page. Please let me know
if you know of something that I could relate to. Apologies for my ignorance.

Many thanks.

Regards,
    Mohammad Tariq



On Mon, Dec 3, 2012 at 7:11 PM, Simone Leo <si...@crs4.it> wrote:

> You can use Hadoop Pipes:
>
> http://hadoop.apache.org/docs/stable/api/index.html
> http://wiki.apache.org/hadoop/C%2B%2BWordCount
>
> Simone
>
>
> On 12/03/2012 01:05 PM, Mohammad Tariq wrote:
>
>> Hello list,
>>
>>            I have a tool (written in C) that performs some different
>> types of operations and can be used as a command line utility. I had to
>> write a similar tool, as we have moved towards Hadoop platform for most
>> of the things.
>>
>> Till now I have taken this tool as reference  and written MR jobs
>> corresponding to some the modules of this tool and they are working
>> fine. But I am wasting a lot of time in this. So, I just wanted to ask
>> if it is possible to call this tool through a MR job?? Somewhat like JNI
>> kinda thing. (I hope it is, otherwise I have to write rest of things
>> from scratch and we are running out of time).
>>
>> Many thanks.
>>
>> Regards,
>>      Mohammad Tariq
>>
>>
> --
> Simone Leo
> Data Fusion - Distributed Computing
> CRS4
> POLARIS - Building #1
> Piscina Manna
> I-09010 Pula (CA) - Italy
> e-mail: simone.leo@crs4.it
> http://www.crs4.it
>

Re: Calling C inside MR

Posted by Simone Leo <si...@crs4.it>.
You can use Hadoop Pipes:

http://hadoop.apache.org/docs/stable/api/index.html
http://wiki.apache.org/hadoop/C%2B%2BWordCount

Simone

On 12/03/2012 01:05 PM, Mohammad Tariq wrote:
> Hello list,
>
>            I have a tool (written in C) that performs some different
> types of operations and can be used as a command line utility. I had to
> write a similar tool, as we have moved towards Hadoop platform for most
> of the things.
>
> Till now I have taken this tool as reference  and written MR jobs
> corresponding to some the modules of this tool and they are working
> fine. But I am wasting a lot of time in this. So, I just wanted to ask
> if it is possible to call this tool through a MR job?? Somewhat like JNI
> kinda thing. (I hope it is, otherwise I have to write rest of things
> from scratch and we are running out of time).
>
> Many thanks.
>
> Regards,
>      Mohammad Tariq
>

-- 
Simone Leo
Data Fusion - Distributed Computing
CRS4
POLARIS - Building #1
Piscina Manna
I-09010 Pula (CA) - Italy
e-mail: simone.leo@crs4.it
http://www.crs4.it

Re: Calling C inside MR

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so much Bertrand for the quick response.

One quick question: would it affect MR performance? I mean, if I write
one MR job with the entire processing logic in it, and another MR job for the
same task that instead calls the corresponding C module, will there be a lot of
difference between the two jobs (in performance or otherwise)?

Thanks again.

Regards,
    Mohammad Tariq



On Mon, Dec 3, 2012 at 6:24 PM, Bertrand Dechoux <de...@gmail.com> wrote:

> You provided the answer, JNI is a solution. Another one would be to use
> hadoop streaming if your program can read stdin and write into stdout with
> a good enough format.
> A MR job is, in the end, plain java and does not impact how java can call
> external process.
>
> Bertrand
>
>
> On Mon, Dec 3, 2012 at 1:05 PM, Mohammad Tariq <do...@gmail.com> wrote:
>
>> Hello list,
>>
>>           I have a tool (written in C) that performs some different types
>> of operations and can be used as a command line utility. I had to write a
>> similar tool, as we have moved towards Hadoop platform for most of the
>> things.
>>
>> Till now I have taken this tool as reference  and written MR jobs
>> corresponding to some the modules of this tool and they are working fine.
>> But I am wasting a lot of time in this. So, I just wanted to ask if it is
>> possible to call this tool through a MR job?? Somewhat like JNI kinda
>> thing. (I hope it is, otherwise I have to write rest of things from scratch
>> and we are running out of time).
>>
>> Many thanks.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>
>
> --
> Bertrand Dechoux
>

Re: Calling C inside MR

Posted by Bertrand Dechoux <de...@gmail.com>.
You provided the answer yourself: JNI is a solution. Another one would be to use
Hadoop Streaming, if your program can read from stdin and write to stdout in
a good enough format.
An MR job is, in the end, plain Java, and it does not affect how Java can call
an external process.
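
As a rough sketch of the external-process route, the plain-Java helper below
pipes lines through a command's stdin and collects its stdout using the
standard ProcessBuilder API. The class and method names (ExternalToolRunner,
pipeThrough) are made up for illustration, and the POSIX `tr` command stands
in for the real C tool; inside a job, something like this could be called
from a Mapper, assuming the executable is available on every task node.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExternalToolRunner {

    /** Sends inputLines to the command's stdin and returns the lines it prints to stdout. */
    static List<String> pipeThrough(List<String> command, List<String> inputLines)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the tool's stderr show up in the caller's stderr (the task log, in a job).
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        // Write stdin on a separate thread so a full pipe buffer cannot deadlock
        // the reader loop below.
        Thread writer = new Thread(() -> {
            try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                    process.getOutputStream(), StandardCharsets.UTF_8))) {
                for (String line : inputLines) {
                    out.write(line);
                    out.newLine();
                }
            } catch (IOException e) {
                // The tool may exit early and close its stdin; the exit code
                // check below reports real failures.
            }
        });
        writer.start();

        List<String> outputLines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                outputLines.add(line);
            }
        }
        writer.join();

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command " + command + " exited with code " + exitCode);
        }
        return outputLines;
    }

    public static void main(String[] args) throws Exception {
        // `tr` is only a placeholder for the real command-line tool.
        List<String> result = pipeThrough(
                Arrays.asList("tr", "a-z", "A-Z"),
                Arrays.asList("hello", "hadoop"));
        result.forEach(System.out::println);
    }
}
```

Starting one process per record would be expensive, so a design along these
lines would more plausibly start the tool once per task and stream many
records through it, which is essentially what Hadoop Streaming does.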

Bertrand

On Mon, Dec 3, 2012 at 1:05 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello list,
>
>           I have a tool (written in C) that performs some different types
> of operations and can be used as a command line utility. I had to write a
> similar tool, as we have moved towards Hadoop platform for most of the
> things.
>
> Till now I have taken this tool as reference  and written MR jobs
> corresponding to some the modules of this tool and they are working fine.
> But I am wasting a lot of time in this. So, I just wanted to ask if it is
> possible to call this tool through a MR job?? Somewhat like JNI kinda
> thing. (I hope it is, otherwise I have to write rest of things from scratch
> and we are running out of time).
>
> Many thanks.
>
> Regards,
>     Mohammad Tariq
>
>


-- 
Bertrand Dechoux

Re: Calling C inside MR

Posted by Simone Leo <si...@crs4.it>.
You can use Hadoop Pipes:

http://hadoop.apache.org/docs/stable/api/index.html
http://wiki.apache.org/hadoop/C%2B%2BWordCount

Simone

On 12/03/2012 01:05 PM, Mohammad Tariq wrote:
> Hello list,
>
>            I have a tool (written in C) that performs some different
> types of operations and can be used as a command line utility. I had to
> write a similar tool, as we have moved towards Hadoop platform for most
> of the things.
>
> Till now I have taken this tool as reference  and written MR jobs
> corresponding to some the modules of this tool and they are working
> fine. But I am wasting a lot of time in this. So, I just wanted to ask
> if it is possible to call this tool through a MR job?? Somewhat like JNI
> kinda thing. (I hope it is, otherwise I have to write rest of things
> from scratch and we are running out of time).
>
> Many thanks.
>
> Regards,
>      Mohammad Tariq
>

-- 
Simone Leo
Data Fusion - Distributed Computing
CRS4
POLARIS - Building #1
Piscina Manna
I-09010 Pula (CA) - Italy
e-mail: simone.leo@crs4.it
http://www.crs4.it
