Posted to user@hadoop.apache.org by Jay Vyas <ja...@gmail.com> on 2012/11/13 00:58:37 UTC

Can dynamic Classes be accessible to Mappers/Reducers?

Hi guys:

I'm trying to dynamically create a Java class at runtime and submit it as a
Hadoop job.

How does the Mapper (or for that matter, Reducer) use the data in the Job
object?  That is, how does it load a class?  Is the job object serialized,
along with all the info necessary to load a class?

The reason I'm wondering is that the class I'm creating will not be on the
classpath of the JVMs in a distributed environment, although it will exist
when the Job is created. So I'm wondering whether simply creating a dynamic
class inside the job executor will cause it to be serialized and sent over
the wire in such a way that it can be instantiated in a different JVM.

-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: Can dynamic Classes be accessible to Mappers/Reducers?

Posted by Jay Vyas <ja...@gmail.com>.
Wow, that's an awesome trick! Okay, thanks.

Jay Vyas 
MMSB
UCHC

On Nov 13, 2012, at 3:56 AM, Bertrand Dechoux <de...@gmail.com> wrote:

> You should look at the job conf file.
> You will see that the mapper and reducer classes are explicitly named there.
> So if you generate the class only on the client, the other machines won't be able to load it.
> 
> You should also look at Cascading, which does a bit of what you are trying to do.
> The trick it uses is that the mapper and reducer are only deserializing wrapper classes.
> They read the serialized logic (which could be any graph of serialized objects) from the job conf file.
> 
> Regards
> 
> Bertrand
> 
> On Tue, Nov 13, 2012 at 2:54 AM, Zizon Qiu <zz...@gmail.com> wrote:
>> When submitting a job, the ToolRunner or JobClient just distributes your jars to HDFS,
>> so that the TaskTrackers can launch/"re-run" it.
>> 
>> In your case, you should have your dynamic class regenerated in the mapper/reducer's setup method,
>> or the runtime classloader will not find it.
>> 
>> On Tue, Nov 13, 2012 at 7:58 AM, Jay Vyas <ja...@gmail.com> wrote:
>>> Hi guys: 
>>> 
>>> I'm trying to dynamically create a Java class at runtime and submit it as a Hadoop job.
>>> 
>>> How does the Mapper (or for that matter, Reducer) use the data in the Job object?  That is, how does it load a class?  Is the job object serialized, along with all the info necessary to load a class?  
>>> 
>>> The reason I'm wondering is that the class I'm creating will not be on the classpath of the JVMs in a distributed environment, although it will exist when the Job is created. So I'm wondering whether simply creating a dynamic class inside the job executor will cause it to be serialized and sent over the wire in such a way that it can be instantiated in a different JVM.
>>> 
>>> -- 
>>> Jay Vyas
>>> http://jayunit100.blogspot.com
> 
> 
> 
> -- 
> Bertrand Dechoux

Re: Can dynamic Classes be accessible to Mappers/Reducers?

Posted by Bertrand Dechoux <de...@gmail.com>.
You should look at the job conf file.
You will see that the mapper and reducer classes are explicitly named
there.
So if you generate the class only on the client, the other machines won't
be able to load it.

You should also look at Cascading, which does a bit of what you are trying
to do.
The trick it uses is that the mapper and reducer are only deserializing
wrapper classes.
They read the serialized logic (which could be any graph of serialized
objects) from the job conf file.

Regards

Bertrand

On Tue, Nov 13, 2012 at 2:54 AM, Zizon Qiu <zz...@gmail.com> wrote:

> When submitting a job, the ToolRunner or JobClient just distributes your
> jars to HDFS,
> so that the TaskTrackers can launch/"re-run" it.
>
> In your case, you should have your dynamic class regenerated in the
> mapper/reducer's setup method,
> or the runtime classloader will not find it.
>
> On Tue, Nov 13, 2012 at 7:58 AM, Jay Vyas <ja...@gmail.com> wrote:
>
>> Hi guys:
>>
>> I'm trying to dynamically create a Java class at runtime and submit it as
>> a Hadoop job.
>>
>> How does the Mapper (or for that matter, Reducer) use the data in the Job
>> object?  That is, how does it load a class?  Is the job object serialized,
>> along with all the info necessary to load a class?
>>
>> The reason I'm wondering is that the class I'm creating will not be on
>> the classpath of the JVMs in a distributed environment, although it will
>> exist when the Job is created. So I'm wondering whether simply creating a
>> dynamic class inside the job executor will cause it to be serialized and
>> sent over the wire in such a way that it can be instantiated in a
>> different JVM.
>>
>> --
>> Jay Vyas
>> http://jayunit100.blogspot.com
>>
>
>


-- 
Bertrand Dechoux
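
The wrapper idea described above can be sketched in plain Java. This is only
an illustration under stated assumptions, not Cascading's actual code: a
Map<String, String> stands in for the Hadoop job conf, and the names
LOGIC_KEY, Step, Upper, Append and WrapperMapper are all hypothetical. The
step classes themselves would still have to be in the job jar; only the
object graph built from them travels through the conf.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class SerializedLogicSketch {
    // Hypothetical conf key; not a real Hadoop or Cascading property.
    static final String LOGIC_KEY = "sketch.serialized.logic";

    /** One unit of user logic; it must be Serializable to travel in the conf. */
    interface Step extends Serializable {
        String apply(String in);
    }

    static final class Upper implements Step {
        public String apply(String in) { return in.toUpperCase(Locale.ROOT); }
    }

    static final class Append implements Step {
        private final String suffix;
        Append(String suffix) { this.suffix = suffix; }
        public String apply(String in) { return in + suffix; }
    }

    /** Client side: Java-serialize the object graph and Base64-encode it. */
    static String encode(Serializable logic) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(logic);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Task side: decode the string and rebuild the object graph. */
    static Object decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for a mapper that is only a deserializing wrapper. */
    static final class WrapperMapper {
        private List<Step> steps;

        @SuppressWarnings("unchecked")
        void setup(Map<String, String> conf) {
            steps = (List<Step>) decode(conf.get(LOGIC_KEY));
        }

        String map(String value) {
            String out = value;
            for (Step step : steps) {
                out = step.apply(out);
            }
            return out;
        }
    }

    static String runDemo() {
        // "Client": build the logic at runtime and store it in the conf.
        Map<String, String> conf = new HashMap<>();
        ArrayList<Step> logic = new ArrayList<>(Arrays.asList(new Upper(), new Append("!")));
        conf.put(LOGIC_KEY, encode(logic));

        // "Task": the wrapper only sees the conf, never the client objects.
        WrapperMapper mapper = new WrapperMapper();
        mapper.setup(conf);
        return mapper.map("hadoop");
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In a real job the same pattern would presumably store the encoded string in
the job configuration on the client and read it back in the mapper's setup
method.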

Re: Can dynamic Classes be accessible to Mappers/Reducers?

Posted by Zizon Qiu <zz...@gmail.com>.
When submitting a job, the ToolRunner or JobClient just distributes your jars
to HDFS,
so that the TaskTrackers can launch/"re-run" it.

In your case, you should have your dynamic class regenerated in the
mapper/reducer's setup method,
or the runtime classloader will not find it.

On Tue, Nov 13, 2012 at 7:58 AM, Jay Vyas <ja...@gmail.com> wrote:

> Hi guys:
>
> I'm trying to dynamically create a Java class at runtime and submit it as a
> Hadoop job.
>
> How does the Mapper (or for that matter, Reducer) use the data in the Job
> object?  That is, how does it load a class?  Is the job object serialized,
> along with all the info necessary to load a class?
>
> The reason I'm wondering is that the class I'm creating will not be on the
> classpath of the JVMs in a distributed environment, although it will exist
> when the Job is created. So I'm wondering whether simply creating a dynamic
> class inside the job executor will cause it to be serialized and sent over
> the wire in such a way that it can be instantiated in a different JVM.
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>
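
The suggestion to regenerate the class in the setup method could look
roughly like the following sketch. It makes several assumptions that are
not from this thread: the class is generated from Java source with the JDK's
javax.tools compiler (a real job might use a bytecode library instead, and
task JVMs may not ship a compiler at all), a Map<String, String> stands in
for the job conf, and SUFFIX_KEY, generateSource and RegeneratingMapper are
hypothetical names. The point is only that the conf carries the inputs to
the generator, and each task rebuilds and loads the class locally.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class RegenerateInSetupSketch {
    // Hypothetical conf key holding the parameter the generator needs.
    static final String SUFFIX_KEY = "sketch.generated.suffix";

    /** Hypothetical generator: source for a class that appends a suffix.
     *  The suffix is pasted in unescaped, so it must not contain quotes. */
    static String generateSource(String suffix) {
        return "public class Generated implements java.util.function.Function<String, String> {\n"
             + "    public String apply(String s) { return s + \"" + suffix + "\"; }\n"
             + "}\n";
    }

    /** Compiles the source into a temp directory and loads it from there. */
    @SuppressWarnings("unchecked")
    static Function<String, String> compileAndLoad(String source) {
        try {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("no system Java compiler (a JDK is required)");
            }
            Path dir = Files.createTempDirectory("generated");
            Path file = dir.resolve("Generated.java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed with status " + status);
            }
            // The loader is left open on purpose: the class stays in use afterwards.
            URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
            Class<?> generated = loader.loadClass("Generated");
            return (Function<String, String>) generated.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for a mapper that rebuilds its dynamic class in setup. */
    static final class RegeneratingMapper {
        private Function<String, String> logic;

        void setup(Map<String, String> conf) {
            logic = compileAndLoad(generateSource(conf.get(SUFFIX_KEY)));
        }

        String map(String value) {
            return logic.apply(value);
        }
    }

    static String runDemo() {
        // "Client": only the generator input goes into the conf.
        Map<String, String> conf = new HashMap<>();
        conf.put(SUFFIX_KEY, "-mapped");

        // "Task": the class is generated and loaded inside setup.
        RegeneratingMapper mapper = new RegeneratingMapper();
        mapper.setup(conf);
        return mapper.map("record");
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the class is created inside each task's JVM, it never needs to be on
the classpath that was shipped with the job jar.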
