Posted to mapreduce-user@hadoop.apache.org by Shuja Rehman <sh...@gmail.com> on 2010/11/07 20:49:58 UTC

Job without Output files

Hi

I have a job that does not need any reducers; I am using only mappers. At
the moment, the output of the job is written to files. But I want to use only
the Java API to do some calculations, and I want the mappers to produce no
output. So is it possible to create a job that does not produce any kind of
output?

Thanks
-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Job without Output files

Posted by Shuja Rehman <sh...@gmail.com>.

hi all
What does speculative execution of tasks (if it is turned on) mean? How do I
turn it off, and what are its advantages and disadvantages?

I am not using TableOutputFormat because I need to issue put statements
millions of times in a single job. With TableOutputFormat the same job takes
6-7 hours to complete, whereas it completes in 1:30-2:00 hours using the
Java API.

On Mon, Nov 8, 2010 at 5:47 PM, Jeff Zhang <zj...@gmail.com> wrote:

> My guess is that HBase has version on cells, so inserting
> multiple-times is OK, not sure my guessing is correct
>
>
> On Mon, Nov 8, 2010 at 8:32 PM, Harsh J <qw...@gmail.com> wrote:
> > Hi Jeff,
> >
> > On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang <zj...@gmail.com> wrote:
> >> Hi Harsh,
> >>
> >> you point is interesting, then how hbase (TableOutputFormat) handle
> >> speculative execution ? which part of code doing this ?
> >>
> >
> > I was under the impression that they do something to avoid speculative
> > execution induced issues. Guess they don't, and probably suggest
> > turning off speculative features off. I might be wrong, but I don't
> > see them utilizing the ${mapred.output.dir} directory for any writes,
> > so they don't at all handle this?
> >
> > --
> > Harsh J
> > www.harshj.com
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>

-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Job without Output files

Posted by Rajappa Iyer <rs...@mayin.org>.

Inserting multiple times is indeed OK.  Each new version of the cell will
get a new timestamp.
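
To make the versioning point concrete, here is a toy in-memory model of a
single cell. It is only a sketch and does not use the HBase client API; the
class and method names (VersionedCell, put, latest, versionCount) are made up
for illustration. It assumes that versions are keyed by timestamp and that a
plain read returns the newest one.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of one versioned cell; this is NOT the HBase client API.
class VersionedCell {
    // Versions keyed by timestamp, newest first.
    private final NavigableMap<Long, String> versions =
            new TreeMap<Long, String>(Collections.<Long>reverseOrder());

    // Store a value under a timestamp; reusing a timestamp overwrites that version.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // A plain read returns the newest version, or null if the cell is empty.
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    // Number of versions currently stored.
    int versionCount() {
        return versions.size();
    }
}

class VersionedCellDemo {
    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.put(1000L, "42"); // write from the first task attempt
        cell.put(1005L, "42"); // same write repeated by a duplicate attempt
        System.out.println(cell.latest() + " / " + cell.versionCount() + " versions");
    }
}
```

In this model the duplicate write adds a second version, but a read still
returns the same value, which is why repeated inserts of identical data are
harmless for readers.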

On Mon, Nov 8, 2010 at 4:47 AM, Jeff Zhang <zj...@gmail.com> wrote:

> My guess is that HBase has version on cells, so inserting
> multiple-times is OK, not sure my guessing is correct
>
>
> On Mon, Nov 8, 2010 at 8:32 PM, Harsh J <qw...@gmail.com> wrote:
> > Hi Jeff,
> >
> > On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang <zj...@gmail.com> wrote:
> >> Hi Harsh,
> >>
> >> you point is interesting, then how hbase (TableOutputFormat) handle
> >> speculative execution ? which part of code doing this ?
> >>
> >
> > I was under the impression that they do something to avoid speculative
> > execution induced issues. Guess they don't, and probably suggest
> > turning off speculative features off. I might be wrong, but I don't
> > see them utilizing the ${mapred.output.dir} directory for any writes,
> > so they don't at all handle this?
> >
> > --
> > Harsh J
> > www.harshj.com
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>

Re: Job without Output files

Posted by Jeff Zhang <zj...@gmail.com>.

My guess is that HBase keeps versions on cells, so inserting multiple
times is OK, though I am not sure my guess is correct.


On Mon, Nov 8, 2010 at 8:32 PM, Harsh J <qw...@gmail.com> wrote:
> Hi Jeff,
>
> On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang <zj...@gmail.com> wrote:
>> Hi Harsh,
>>
>> you point is interesting, then how hbase (TableOutputFormat) handle
>> speculative execution ? which part of code doing this ?
>>
>
> I was under the impression that they do something to avoid speculative
> execution induced issues. Guess they don't, and probably suggest
> turning off speculative features off. I might be wrong, but I don't
> see them utilizing the ${mapred.output.dir} directory for any writes,
> so they don't at all handle this?
>
> --
> Harsh J
> www.harshj.com
>



-- 
Best Regards

Jeff Zhang

Re: Job without Output files

Posted by Harsh J <qw...@gmail.com>.

Hi Jeff,

On Mon, Nov 8, 2010 at 3:17 PM, Jeff Zhang <zj...@gmail.com> wrote:
> Hi Harsh,
>
> you point is interesting, then how hbase (TableOutputFormat) handle
> speculative execution ? which part of code doing this ?
>

I was under the impression that they do something to avoid issues
induced by speculative execution. I guess they don't, and probably
suggest turning speculative execution off. I might be wrong, but I
don't see them using the ${mapred.output.dir} directory for any
writes, so perhaps they don't handle this at all?
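
For context: with speculative execution on, the framework may launch a
duplicate attempt of a slow task and keep whichever attempt finishes first,
so a mapper that writes directly to an external store can issue the same
writes twice. A minimal sketch of turning it off from the driver follows. It
assumes the 0.20-era property names (newer releases call them
mapreduce.map.speculative and mapreduce.reduce.speculative), and the class
and method names are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper; only the two property names come from Hadoop itself.
class SpeculationSettings {
    // Disable speculative attempts for both map and reduce tasks.
    static Configuration withoutSpeculation(Configuration conf) {
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        return conf;
    }
}
```

The trade-off is that a straggling task is no longer raced by a backup
attempt, so one slow node can hold up the whole job.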

-- 
Harsh J
www.harshj.com

Re: Job without Output files

Posted by Jeff Zhang <zj...@gmail.com>.

Hi Harsh,

Your point is interesting. Then how does HBase (TableOutputFormat) handle
speculative execution? Which part of the code does this?



On Mon, Nov 8, 2010 at 5:41 PM, Harsh J <qw...@gmail.com> wrote:
> Hi again,
>
> On Mon, Nov 8, 2010 at 2:19 PM, Shuja Rehman <sh...@gmail.com> wrote:
>> Jeff,
>>
>> I am using java api to dump the data into hbase and thats why i did not
>> require any output.
>
> I might be out-dated in this regard, but doesn't HBase provide proper
> Input/OutputFormat classes for using table data via Hadoop MapReduce?
> With your current way, how would you handle speculative execution of
> tasks (if it is turned on)?
>
> --
> Harsh J
> www.harshj.com
>



-- 
Best Regards

Jeff Zhang

Re: Job without Output files

Posted by Harsh J <qw...@gmail.com>.

Hi again,

On Mon, Nov 8, 2010 at 2:19 PM, Shuja Rehman <sh...@gmail.com> wrote:
> Jeff,
>
> I am using java api to dump the data into hbase and thats why i did not
> require any output.

I might be out of date in this regard, but doesn't HBase provide proper
Input/OutputFormat classes for using table data via Hadoop MapReduce?
With your current approach, how would you handle speculative execution
of tasks (if it is turned on)?

-- 
Harsh J
www.harshj.com

Re: Job without Output files

Posted by Shuja Rehman <sh...@gmail.com>.

Jeff,

I am using the Java API to dump the data into HBase, and that is why I do
not require any output.

Thanks

On Mon, Nov 8, 2010 at 6:21 AM, Jeff Zhang <zj...@gmail.com> wrote:

> You can have no output files by creating a customized OutputFormat,
> but without output files, how do you get the output of result and
> what's the meaning of this job ?
>
>
> On Mon, Nov 8, 2010 at 3:49 AM, Shuja Rehman <sh...@gmail.com> wrote:
> > Hi
> >
> > I have a job where i did not need any reducers. I am using only mappers. At
> > the moment, the output of job is generated in files. But i want to use only
> > java api to do some calculation and i want that there should be no output
> > from the mappers. So is it possible to make a job which did not produce any
> > kind of output?
> >
> > Thanks
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> >
> >
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>



-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Job without Output files

Posted by Jeff Zhang <zj...@gmail.com>.

You can have no output files by creating a customized OutputFormat,
but without output files, how do you get the result, and what is the
purpose of this job?


On Mon, Nov 8, 2010 at 3:49 AM, Shuja Rehman <sh...@gmail.com> wrote:
> Hi
>
> I have a job where i did not need any reducers. I am using only mappers. At
> the moment, the output of job is generated in files. But i want to use only
> java api to do some calculation and i want that there should be no output
> from the mappers. So is it possible to make a job which did not produce any
> kind of output?
>
> Thanks
> --
> Regards
> Shuja-ur-Rehman Baig
>
>
>



-- 
Best Regards

Jeff Zhang

Re: Job without Output files

Posted by Shuja Rehman <sh...@gmail.com>.

Thanks, NullOutputFormat works.


On Mon, Nov 8, 2010 at 1:46 PM, Harsh J <qw...@gmail.com> wrote:

> Hi,
>
> On Mon, Nov 8, 2010 at 1:19 AM, Shuja Rehman <sh...@gmail.com> wrote:
> > Hi
> >
> > I have a job where i did not need any reducers. I am using only mappers. At
> > the moment, the output of job is generated in files. But i want to use only
> > java api to do some calculation and i want that there should be no output
> > from the mappers. So is it possible to make a job which did not produce any
> > kind of output?
>
> Use org.apache.hadoop.mapreduce.lib.output.NullOutputFormat to have
> dummy output collectors.
>
> >
> > Thanks
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> >
> >
> >
>
>
>
> --
> Harsh J
> www.harshj.com
>



-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Job without Output files

Posted by Harsh J <qw...@gmail.com>.

Hi,

On Mon, Nov 8, 2010 at 1:19 AM, Shuja Rehman <sh...@gmail.com> wrote:
> Hi
>
> I have a job where i did not need any reducers. I am using only mappers. At
> the moment, the output of job is generated in files. But i want to use only
> java api to do some calculation and i want that there should be no output
> from the mappers. So is it possible to make a job which did not produce any
> kind of output?

Use org.apache.hadoop.mapreduce.lib.output.NullOutputFormat to have
dummy output collectors.
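
A map-only driver along these lines might look as follows. This is a rough
sketch rather than a tested setup: the class names NoOutputJob and
SideEffectMapper, the job name, and the input path argument are all
hypothetical, and the mapper body is left empty where the real work would go.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

class NoOutputJob {
    // Hypothetical mapper: works through side effects and never calls context.write().
    static class SideEffectMapper
            extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) {
            // Do the per-record calculation or external write here.
        }
    }

    // Build a map-only job whose output format discards everything.
    static Job configure(Configuration conf, String inputPath) throws Exception {
        // Newer releases prefer Job.getInstance(conf, name) over this constructor.
        Job job = new Job(conf, "no-output-job");
        job.setJarByClass(NoOutputJob.class);
        job.setMapperClass(SideEffectMapper.class);
        job.setNumReduceTasks(0);                         // map-only: no reducers
        job.setOutputFormatClass(NullOutputFormat.class); // no output files are written
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(inputPath));
        return job;
    }

    public static void main(String[] args) throws Exception {
        Job job = configure(new Configuration(), args[0]);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

No output path is set here because NullOutputFormat does not write anything
that would need one.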

>
> Thanks
> --
> Regards
> Shuja-ur-Rehman Baig
>
>
>



-- 
Harsh J
www.harshj.com