Posted to common-dev@hadoop.apache.org by kenyh <ke...@gmail.com> on 2012/07/26 07:47:18 UTC

MultithreadedMapper

Multithreaded MapReduce introduces multithreaded execution within a map task. In
Hadoop 1.0.2, MultithreadedMapper implements multithreaded execution of the map
function, but I found that synchronization is needed for record reading (reading
the input key and value) and for result output. This contention adds heavy
overhead: it increased a 50 MB wordcount task's execution time from 40 seconds
to 1 minute. Are there any optimizations for the multithreaded mapper that
reduce the contention on input reading and output?
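
For reference, here is a minimal sketch of how MultithreadedMapper is wired into a
wordcount-style job with the new (org.apache.hadoop.mapreduce) API; the mapper class
and the thread count of 8 are illustrative and not taken from this thread.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultithreadedWordCount {

  // Ordinary wordcount mapper; MultithreadedMapper runs several instances of it
  // concurrently inside a single map task.
  public static class TokenMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tokens = new StringTokenizer(value.toString());
      while (tokens.hasMoreTokens()) {
        word.set(tokens.nextToken());
        context.write(word, ONE); // output goes through a shared, synchronized collector
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "multithreaded wordcount");
    job.setJarByClass(MultithreadedWordCount.class);

    // MultithreadedMapper wraps the real mapper and runs it in a thread pool;
    // all threads share one RecordReader and one output collector, which is the
    // synchronization point discussed above.
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, TokenMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 8); // thread count chosen arbitrarily

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Reducer omitted for brevity; the default identity reducer is used.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}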


Re: MultithreadedMapper

Posted by kenyh <ke...@gmail.com>.
A multithreaded mapper gets more chances to combine the map output, and the
locality of some shared global data is also better. But the implementation in
Hadoop 1.0.2 uses heavy synchronization, which brings a lot of overhead. Are
there any optimizations for the multithreaded mapper?


syscokid wrote:
> 
> Why multithread the mapper? Just create more mappers. That way you spread
> the data load as well as the mapping load potentially across multiple
> nodes.
> 
> 
> kenyh wrote:
>> 
>> Are there any optimizations for the multithreaded mapper that reduce the
>> contention on input reading and output?
>> 
> 
> 


Re: MultithreadedMapper

Posted by syscokid <ro...@rovicorp.com>.
Why multithread the mapper? Just create more mappers. That way you spread the
data load as well as the mapping load potentially across multiple nodes.


kenyh wrote:
> 
> Are there any optimizations for the multithreaded mapper that reduce the
> contention on input reading and output?
> 



Re: MultithreadedMapper

Posted by Radim Kolar <hs...@filez.com>.
> But I found that synchronization is needed for record reading (reading the
> input key and value) and for result output.

I use Spring Batch for that. It has I/O buffering built in, and it is very easy to use and well documented.
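
For illustration, a minimal sketch of reading a file through Spring Batch's buffered
FlatFileItemReader, assuming Spring Batch 2.x on the classpath; the input path is made
up and this is not code from the thread.

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.core.io.FileSystemResource;

public class BufferedReadSketch {
  public static void main(String[] args) throws Exception {
    // FlatFileItemReader does buffered, line-by-line reading behind the simple
    // ItemReader contract.
    FlatFileItemReader<String> reader = new FlatFileItemReader<String>();
    reader.setResource(new FileSystemResource("/tmp/input.txt")); // hypothetical path
    reader.setLineMapper(new PassThroughLineMapper());            // each line as a String
    reader.open(new ExecutionContext());
    try {
      String line;
      while ((line = reader.read()) != null) {
        System.out.println(line);
      }
    } finally {
      reader.close();
    }
  }
}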


Re: Re: regarding _HOST token replacement in security hadoop

Posted by "Aaron T. Myers" <at...@cloudera.com>.
What do you have set as fs.defaultFS in your configuration? Make sure that
it is a fully-qualified domain name.
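
For illustration, a core-site.xml fragment with a hypothetical fully-qualified host
name (the property name is standard; the value here is made up):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode01.example.com:8020</value>
</property>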

--
Aaron T. Myers
Software Engineer, Cloudera



On Fri, Jul 27, 2012 at 1:57 PM, Arpit Gupta <ar...@hortonworks.com> wrote:

> That does seem to be a valid issue. Could you file a JIRA for it?
>
> Thanks
>
>
> On Thu, Jul 26, 2012 at 7:32 PM, Wangwenli <wa...@huawei.com> wrote:
>
> > Could you spend a minute to check whether the code below will cause an
> > issue or not?
> >
> > In org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(),
> > it uses socAddr.getHostName() to resolve _HOST,
> > but in org.apache.hadoop.security.SecurityUtil.replacePattern(),
> > getLocalHostName() uses getCanonicalHostName() to resolve _HOST.
> >
> > Meanwhile I will check what you said. Thank you~
> >
> >
> > -----Original Message-----
> > From: Arpit Gupta [mailto:arpit@hortonworks.com]
> > Sent: July 27, 2012 10:03
> > To: common-dev@hadoop.apache.org
> > Subject: Re: regarding _HOST token replacement in security hadoop
> >
> > You need to use HTTP/_HOST@site.com, as that is the principal needed by
> > SPNEGO. So you would need to create the HTTP/_HOST principal and add it
> > to the same keytab (/home/hdfs/keytab/nn.service.keytab).
> >
> > --
> > Arpit Gupta
> > Hortonworks Inc.
> > http://hortonworks.com/
> >
> > On Jul 26, 2012, at 6:54 PM, Wangwenli <wa...@huawei.com> wrote:
> >
> > > Thanks for your response.
> > > I am using hadoop-2.0.0-alpha from the Apache site. In which version
> > > should it be configured with HTTP/_HOST@site.com? I don't think it is
> > > required in hadoop-2.0.0-alpha, because I can log in successfully with
> > > a different principal; please see the log below:
> > >
> > > 2012-07-23 22:48:17,303 INFO
> >
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler:
> > Login using keytab /home/hdfs/keytab/nn.service.keytab, for principal
> > nn/167-52-0-56.site@site
> > > 2012-07-23 22:48:17,310 INFO
> >
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler:
> > Initialized, principal [nn/167-52-0-56.site@site] from keytab
> > [/home/hdfs/keytab/nn.service.keytab]
> > >
> > >
> > > -----Original Message-----
> > > From: Arpit Gupta [mailto:arpit@hortonworks.com]
> > > Sent: July 27, 2012 9:22
> > > To: common-dev@hadoop.apache.org
> > > Subject: Re: regarding _HOST token replacement in security hadoop
> > >
> > > what version of hadoop are you using?
> > >
> > > also
> > >
> > > dfs.web.authentication.kerberos.principal should be set to
> > > HTTP/_HOST@site.com
> > >
> > > --
> > > Arpit Gupta
> > > Hortonworks Inc.
> > > http://hortonworks.com/
> > >
> > > On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >>  I configured like below in hdfs-site.xml:
> > >>
> > >> <property>
> > >> <name>dfs.namenode.kerberos.principal</name>
> > >> <value>nn/_HOST@site</value>
> > >> </property>
> > >>
> > >>
> > >> <property>
> > >>   <name>dfs.web.authentication.kerberos.principal</name>
> > >>   <value>nn/_HOST@site</value>
> > >> </property>
> > >>
> > >>
> > >> When starting up the namenode, I found that the namenode uses the
> > >> principal nn/167-52-0-56@site to log in, but the HTTP server uses
> > >> nn/167-52-0-56.site@site, so startup failed.
> > >>
> > >> I checked the code,
> > >>
> > >> The NameNode uses socAddr.getHostName() to get the hostname in
> > >> org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
> > >>
> > >>
> > >> But the HTTP server's default hostname is 0.0.0.0, so in
> > >> org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is
> > >> obtained by invoking getLocalHostName(), which uses
> > >> getCanonicalHostName().
> > >>
> > >> I think this inconsistency is a bug. Can someone confirm? Should I
> > >> raise one?
> > >>
> > >> Thanks
> > >>
> > >
> >
> >
>

Re: Re: regarding _HOST token replacement in security hadoop

Posted by Arpit Gupta <ar...@hortonworks.com>.
That does seem to be a valid issue. Could you file a JIRA for it?

Thanks


On Thu, Jul 26, 2012 at 7:32 PM, Wangwenli <wa...@huawei.com> wrote:

> Could you spend a minute to check whether the code below will cause an
> issue or not?
>
> In org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(),
> it uses socAddr.getHostName() to resolve _HOST,
> but in org.apache.hadoop.security.SecurityUtil.replacePattern(),
> getLocalHostName() uses getCanonicalHostName() to resolve _HOST.
>
> Meanwhile I will check what you said. Thank you~
>
>
> -----Original Message-----
> From: Arpit Gupta [mailto:arpit@hortonworks.com]
> Sent: July 27, 2012 10:03
> To: common-dev@hadoop.apache.org
> Subject: Re: regarding _HOST token replacement in security hadoop
>
> You need to use HTTP/_HOST@site.com, as that is the principal needed by
> SPNEGO. So you would need to create the HTTP/_HOST principal and add it to
> the same keytab (/home/hdfs/keytab/nn.service.keytab).
>
> --
> Arpit Gupta
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Jul 26, 2012, at 6:54 PM, Wangwenli <wa...@huawei.com> wrote:
>
> > Thanks for your response.
> > I am using hadoop-2.0.0-alpha from the Apache site. In which version
> > should it be configured with HTTP/_HOST@site.com? I don't think it is
> > required in hadoop-2.0.0-alpha, because I can log in successfully with a
> > different principal; please see the log below:
> >
> > 2012-07-23 22:48:17,303 INFO
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler:
> Login using keytab /home/hdfs/keytab/nn.service.keytab, for principal
> nn/167-52-0-56.site@site
> > 2012-07-23 22:48:17,310 INFO
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler:
> Initialized, principal [nn/167-52-0-56.site@site] from keytab
> [/home/hdfs/keytab/nn.service.keytab]
> >
> >
> > -----Original Message-----
> > From: Arpit Gupta [mailto:arpit@hortonworks.com]
> > Sent: July 27, 2012 9:22
> > To: common-dev@hadoop.apache.org
> > Subject: Re: regarding _HOST token replacement in security hadoop
> >
> > what version of hadoop are you using?
> >
> > also
> >
> > dfs.web.authentication.kerberos.principal should be set to
> > HTTP/_HOST@site.com
> >
> > --
> > Arpit Gupta
> > Hortonworks Inc.
> > http://hortonworks.com/
> >
> > On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:
> >
> >> Hi all,
> >>
> >>  I configured like below in hdfs-site.xml:
> >>
> >> <property>
> >> <name>dfs.namenode.kerberos.principal</name>
> >> <value>nn/_HOST@site</value>
> >> </property>
> >>
> >>
> >> <property>
> >>   <name>dfs.web.authentication.kerberos.principal</name>
> >>   <value>nn/_HOST@site</value>
> >> </property>
> >>
> >>
> >> When starting up the namenode, I found that the namenode uses the
> >> principal nn/167-52-0-56@site to log in, but the HTTP server uses
> >> nn/167-52-0-56.site@site, so startup failed.
> >>
> >> I checked the code,
> >>
> >> The NameNode uses socAddr.getHostName() to get the hostname in
> >> org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
> >>
> >>
> >> But the HTTP server's default hostname is 0.0.0.0, so in
> >> org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is
> >> obtained by invoking getLocalHostName(), which uses getCanonicalHostName().
> >>
> >> I think this inconsistency is a bug. Can someone confirm? Should I raise
> >> one?
> >>
> >> Thanks
> >>
> >
>
>

Re: regarding _HOST token replacement in security hadoop

Posted by Wangwenli <wa...@huawei.com>.
Could you spend a minute to check whether the code below will cause an issue or not?

In org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(), it uses socAddr.getHostName() to resolve _HOST,
but in org.apache.hadoop.security.SecurityUtil.replacePattern(), getLocalHostName() uses getCanonicalHostName() to resolve _HOST.

Meanwhile I will check what you said. Thank you~
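
To see the difference between the two calls, here is a small standalone illustration
(not from the thread; the short host name is the one from the log above, and the
output naturally depends on the local DNS/hosts configuration):

import java.net.InetAddress;
import java.net.InetSocketAddress;

public class HostNameCheck {
  public static void main(String[] args) throws Exception {
    // Roughly what loginAsNameNodeUser() relies on: the host part of the RPC
    // bind address, which can be a short, non-canonical name.
    InetSocketAddress rpcAddr = new InetSocketAddress("167-52-0-56", 8020);
    System.out.println("getHostName():          " + rpcAddr.getHostName());

    // Roughly what SecurityUtil.replacePattern()/getLocalHostName() relies on:
    // the canonical (fully-qualified) name of the local host.
    System.out.println("getCanonicalHostName(): "
        + InetAddress.getLocalHost().getCanonicalHostName());
  }
}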


-----Original Message-----
From: Arpit Gupta [mailto:arpit@hortonworks.com]
Sent: July 27, 2012 10:03
To: common-dev@hadoop.apache.org
Subject: Re: regarding _HOST token replacement in security hadoop

You need to use HTTP/_HOST@site.com, as that is the principal needed by SPNEGO. So you would need to create the HTTP/_HOST principal and add it to the same keytab (/home/hdfs/keytab/nn.service.keytab).

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jul 26, 2012, at 6:54 PM, Wangwenli <wa...@huawei.com> wrote:

> Thanks for your response.
> I am using hadoop-2.0.0-alpha from the Apache site. In which version should it be configured with HTTP/_HOST@site.com? I don't think it is required in hadoop-2.0.0-alpha, because I can log in successfully with a different principal; please see the log below:
> 
> 2012-07-23 22:48:17,303 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Login using keytab /home/hdfs/keytab/nn.service.keytab, for principal nn/167-52-0-56.site@site
> 2012-07-23 22:48:17,310 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Initialized, principal [nn/167-52-0-56.site@site] from keytab [/home/hdfs/keytab/nn.service.keytab]
> 
> 
> -----Original Message-----
> From: Arpit Gupta [mailto:arpit@hortonworks.com]
> Sent: July 27, 2012 9:22
> To: common-dev@hadoop.apache.org
> Subject: Re: regarding _HOST token replacement in security hadoop
> 
> what version of hadoop are you using?
> 
> also
> 
> dfs.web.authentication.kerberos.principal should be set to HTTP/_HOST@site.com
> 
> --
> Arpit Gupta
> Hortonworks Inc.
> http://hortonworks.com/
> 
> On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:
> 
>> Hi all,
>> 
>>  I configured like below in hdfs-site.xml:
>> 
>> <property>
>> <name>dfs.namenode.kerberos.principal</name>
>> <value>nn/_HOST@site</value>
>> </property>
>> 
>> 
>> <property>
>>   <name>dfs.web.authentication.kerberos.principal</name>
>>   <value>nn/_HOST@site</value>
>> </property>
>> 
>> 
>> When starting up the namenode, I found that the namenode uses the principal nn/167-52-0-56@site to log in, but the HTTP server uses nn/167-52-0-56.site@site, so startup failed.
>> 
>> I checked the code,
>> 
>> The NameNode uses socAddr.getHostName() to get the hostname in org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
>> 
>> 
>> But the HTTP server's default hostname is 0.0.0.0, so in org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is obtained by invoking getLocalHostName(), which uses getCanonicalHostName().
>> 
>> I think this inconsistency is a bug. Can someone confirm? Should I raise one?
>> 
>> Thanks
>> 
> 


Re: regarding _HOST token replacement in security hadoop

Posted by Arpit Gupta <ar...@hortonworks.com>.
You need to use HTTP/_HOST@site.com, as that is the principal needed by SPNEGO. So you would need to create the HTTP/_HOST principal and add it to the same keytab (/home/hdfs/keytab/nn.service.keytab).
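
Put concretely, a corrected hdfs-site.xml fragment along the lines of this advice
(the realm site.com and the keytab path are the examples used in this thread; the
separate keytab property for the web principal is my addition, not part of the
original advice):

<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@site.com</value>
</property>

<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/home/hdfs/keytab/nn.service.keytab</value>
</property>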

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jul 26, 2012, at 6:54 PM, Wangwenli <wa...@huawei.com> wrote:

> Thanks for your response.
> I am using hadoop-2.0.0-alpha from the Apache site. In which version should it be configured with HTTP/_HOST@site.com? I don't think it is required in hadoop-2.0.0-alpha, because I can log in successfully with a different principal; please see the log below:
> 
> 2012-07-23 22:48:17,303 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Login using keytab /home/hdfs/keytab/nn.service.keytab, for principal nn/167-52-0-56.site@site
> 2012-07-23 22:48:17,310 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Initialized, principal [nn/167-52-0-56.site@site] from keytab [/home/hdfs/keytab/nn.service.keytab]
> 
> 
> -----Original Message-----
> From: Arpit Gupta [mailto:arpit@hortonworks.com]
> Sent: July 27, 2012 9:22
> To: common-dev@hadoop.apache.org
> Subject: Re: regarding _HOST token replacement in security hadoop
> 
> what version of hadoop are you using?
> 
> also
> 
> dfs.web.authentication.kerberos.principal should be set to HTTP/_HOST@site.com
> 
> --
> Arpit Gupta
> Hortonworks Inc.
> http://hortonworks.com/
> 
> On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:
> 
>> Hi all,
>> 
>>  I configured like below in hdfs-site.xml:
>> 
>> <property>
>> <name>dfs.namenode.kerberos.principal</name>
>> <value>nn/_HOST@site</value>
>> </property>
>> 
>> 
>> <property>
>>   <name>dfs.web.authentication.kerberos.principal</name>
>>   <value>nn/_HOST@site</value>
>> </property>
>> 
>> 
>> When starting up the namenode, I found that the namenode uses the principal nn/167-52-0-56@site to log in, but the HTTP server uses nn/167-52-0-56.site@site, so startup failed.
>> 
>> I checked the code,
>> 
>> The NameNode uses socAddr.getHostName() to get the hostname in org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
>> 
>> 
>> But the HTTP server's default hostname is 0.0.0.0, so in org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is obtained by invoking getLocalHostName(), which uses getCanonicalHostName().
>> 
>> I think this inconsistency is a bug. Can someone confirm? Should I raise one?
>> 
>> Thanks
>> 
> 


Re: regarding _HOST token replacement in security hadoop

Posted by Wangwenli <wa...@huawei.com>.
Thanks for your response.
I am using hadoop-2.0.0-alpha from the Apache site. In which version should it be configured with HTTP/_HOST@site.com? I don't think it is required in hadoop-2.0.0-alpha, because I can log in successfully with a different principal; please see the log below:

2012-07-23 22:48:17,303 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Login using keytab /home/hdfs/keytab/nn.service.keytab, for principal nn/167-52-0-56.site@site
2012-07-23 22:48:17,310 INFO org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: Initialized, principal [nn/167-52-0-56.site@site] from keytab [/home/hdfs/keytab/nn.service.keytab]


-----Original Message-----
From: Arpit Gupta [mailto:arpit@hortonworks.com]
Sent: July 27, 2012 9:22
To: common-dev@hadoop.apache.org
Subject: Re: regarding _HOST token replacement in security hadoop

what version of hadoop are you using?

also

dfs.web.authentication.kerberos.principal should be set to HTTP/_HOST@site.com

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:

> Hi all,
> 
>   I configured like below in hdfs-site.xml:
> 
> <property>
>  <name>dfs.namenode.kerberos.principal</name>
>  <value>nn/_HOST@site</value>
> </property>
> 
> 
> <property>
>    <name>dfs.web.authentication.kerberos.principal</name>
>    <value>nn/_HOST@site</value>
> </property>
> 
> 
>   When starting up the namenode, I found that the namenode uses the principal nn/167-52-0-56@site to log in, but the HTTP server uses nn/167-52-0-56.site@site, so startup failed.
> 
> I checked the code,
> 
> The NameNode uses socAddr.getHostName() to get the hostname in org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
> 
> 
> But the HTTP server's default hostname is 0.0.0.0, so in org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is obtained by invoking getLocalHostName(), which uses getCanonicalHostName().
> 
> I think this inconsistency is a bug. Can someone confirm? Should I raise one?
> 
> Thanks
> 


Re: regarding _HOST token replacement in security hadoop

Posted by Arpit Gupta <ar...@hortonworks.com>.
what version of hadoop are you using?

also

dfs.web.authentication.kerberos.principal should be set to HTTP/_HOST@site.com

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jul 26, 2012, at 6:11 PM, Wangwenli <wa...@huawei.com> wrote:

> Hi all,
> 
>   I configured like below in hdfs-site.xml:
> 
> <property>
>  <name>dfs.namenode.kerberos.principal</name>
>  <value>nn/_HOST@site</value>
> </property>
> 
> 
> <property>
>    <name>dfs.web.authentication.kerberos.principal</name>
>    <value>nn/_HOST@site</value>
> </property>
> 
> 
>   When starting up the namenode, I found that the namenode uses the principal nn/167-52-0-56@site to log in, but the HTTP server uses nn/167-52-0-56.site@site, so startup failed.
> 
> I checked the code,
> 
> The NameNode uses socAddr.getHostName() to get the hostname in org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.
> 
> 
> But the HTTP server's default hostname is 0.0.0.0, so in org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is obtained by invoking getLocalHostName(), which uses getCanonicalHostName().
> 
> I think this inconsistency is a bug. Can someone confirm? Should I raise one?
> 
> Thanks
> 


regarding _HOST token replacement in security hadoop

Posted by Wangwenli <wa...@huawei.com>.
Hi all,

   I have configured the following in hdfs-site.xml:

<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@site</value>
</property>


<property>
    <name>dfs.web.authentication.kerberos.principal</name>
    <value>nn/_HOST@site</value>
</property>


   When starting up the namenode, I found that the namenode uses the principal nn/167-52-0-56@site to log in, but the HTTP server uses nn/167-52-0-56.site@site, so startup failed.

I checked the code:

The NameNode uses socAddr.getHostName() to get the hostname in org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser.


But the HTTP server's default hostname is 0.0.0.0, so in org.apache.hadoop.security.SecurityUtil.replacePattern the hostname is obtained by invoking getLocalHostName(), which uses getCanonicalHostName().

I think this inconsistency is a bug. Can someone confirm? Should I raise one?

Thanks


Re: MultithreadedMapper

Posted by Doug Cutting <cu...@apache.org>.
On Thu, Jul 26, 2012 at 7:42 AM, Robert Evans <ev...@yahoo-inc.com> wrote:
> About the only time the multithreaded mapper makes a lot of sense is when
> there is a lot of computation associated with each key/value pair.

Or if the mapper does a lot of i/o to some external resource, e.g., a
web crawler.

Doug

Re: MultithreadedMapper

Posted by Robert Evans <ev...@yahoo-inc.com>.
In general, multithreading does not get you much in traditional MapReduce.
If you want the mappers to run faster you can drop the split size and get a
similar result, because you get more parallelism. That is the use case we
have typically concentrated on. About the only time the multithreaded mapper
makes a lot of sense is when there is a lot of computation associated with
each key/value pair, i.e. when your process is compute bound rather than I/O
bound. Wordcount is typically going to be I/O bound. I am not aware of any
work being done to reduce lock contention in these cases. If you want to
file a generic JIRA for the lock contention, that would be great.

My gut feeling is that the reason the lock is so coarse is that the
InputFormats themselves are not thread safe. Perhaps the simplest thing you
could do is change it so that each thread gets its own "split" of the actual
split, with some logic to share a "split" among a limited number of threads
if one finishes early. But as with anything in performance, never trust your
gut: please profile it before making any code changes.
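
As a sketch of the "drop the split size" alternative (the 16 MB cap is arbitrary and
the class is illustrative, not from this thread):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SmallSplitsSketch {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "small splits");
    // Cap each input split at 16 MB; smaller splits mean more map tasks and
    // therefore more parallelism, without any multithreading inside a task.
    FileInputFormat.setMaxInputSplitSize(job, 16L * 1024 * 1024);
    // ... set mapper, reducer, input and output paths as usual, then submit.
  }
}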

--Bobby Evans

On 7/26/12 12:47 AM, "kenyh" <ke...@gmail.com> wrote:

>
>Multithreaded MapReduce introduces multithreaded execution within a map task.
>In Hadoop 1.0.2, MultithreadedMapper implements multithreaded execution of the
>map function, but I found that synchronization is needed for record reading
>(reading the input key and value) and for result output. This contention adds
>heavy overhead: it increased a 50 MB wordcount task's execution time from 40
>seconds to 1 minute. Are there any optimizations for the multithreaded mapper
>that reduce the contention on input reading and output?