Posted to common-dev@hadoop.apache.org by Jonathan Bernwieser <be...@gmail.com> on 2013/10/17 14:08:57 UTC

generated code in hadoop

Hi there,

I am currently doing my Bachelor thesis at TU Munich, at the Software
Engineering chair of Prof. Broy.



The goal of this thesis is to create a tool that automatically categorizes
source code in open source software. The categories will be “test code”,
“generated code” and “productive code”, so that the results of quality-check
techniques can be evaluated and used more effectively. (Static analyses might
report quality problems that are not relevant for a particular code category.
One example is the number of clones found in a project: the category of the
evaluated code has to be checked, because clones do not cause quality issues
if they occur in “generated code”.)



In order to create and test heuristics that identify code categories, I first
need to manually build a collection of projects (or, more specifically,
classes) whose category I already know.

While manually going through the Hadoop project, I found generated files in
the following directories:

· hadoop\hadoop-1.0.0\src\hadoop-1.0.0\src\contrib\thriftfs\gen-java\org\apache\hadoop\thriftfs\api\

· hadoop\hadoop-1.0.0\src\hadoop-1.0.0\src\core\org\apache\hadoop\record\compiler\generated\
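
A path-based check along these lines could serve as a first heuristic. The
following is only a minimal sketch in Python: the two marker names are taken
from the directories above, while the function name and the example file names
are hypothetical and not part of Hadoop. Treating these directory names as
reliable markers is an assumption that would need to be validated.

```python
from pathlib import PurePosixPath

# Directory names taken from the two paths listed above. Treating them as
# markers of generated code is an assumption made for illustration only.
GENERATED_DIR_NAMES = {"gen-java", "generated"}


def looks_generated(path: str) -> bool:
    """Return True if any component of `path` matches a marker name."""
    # Normalise Windows-style separators so the check is platform independent.
    parts = PurePosixPath(path.replace("\\", "/")).parts
    return any(part in GENERATED_DIR_NAMES for part in parts)


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the check.
    for candidate in (
        r"src\contrib\thriftfs\gen-java\org\apache\hadoop\thriftfs\api\Example.java",
        "src/core/org/apache/hadoop/record/compiler/generated/Example.java",
        "src/core/org/apache/hadoop/fs/Example.java",
    ):
        print(candidate, looks_generated(candidate))
```

A check like this only catches generated code that is kept in tell-tale
directories; files generated into ordinary packages would need a different
signal.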

Are there any other generated classes I didn’t recognize?

Thank you for your help.

Looking forward to hearing from you,

Regards,

Jonathan

Re: generated code in hadoop

Posted by Ted Yu <yu...@gmail.com>.
You can check out the trunk code.
See the SVN Access section in:
http://wiki.apache.org/hadoop/HowToContribute

After building Hadoop, you will find the generated code.

Cheers
