Posted to common-dev@hadoop.apache.org by "Gangumalla, Uma" <um...@intel.com> on 2016/02/16 22:36:32 UTC

Chimera as new component in Apache Commons

Hi Devs,



Recently we worked with the Spark community to implement shuffle encryption. While implementing it, we realized that some/most of the Apache Hadoop encryption code would have to be duplicated in that implementation. This led to the idea of creating a separate reusable library, which we named Chimera (https://github.com/intel-hadoop/chimera). It is an optimized cryptographic library. It provides Java APIs at both the cipher level and the Java stream level to help developers implement high-performance AES encryption/decryption with minimal code and effort.



We know that Java has Cipher implementations, so why do we need this optimized cryptographic library?

1. Performance is critical for encryption and decryption. The JDK Cipher implementations of AES are not yet optimized for modern hardware. For example, the optimized implementation is 17x+ faster than the JDK6 implementations for some modes such as CBC decryption, CTR and GCM. Even though some optimizations have been included in JDK7 and JDK8, there is still a 5x to 6x gap with the most optimized implementations.

2. A Java stream-based API for cryptographic data streams. The Cipher API is powerful, but a lot of code needs to be written for layered stream processing applications. This design pattern is very common in modern applications such as Hadoop or Spark.
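
To make point 2 concrete, here is a minimal sketch of the layered stream pattern written against the standard JDK classes only (javax.crypto.Cipher, CipherOutputStream and CipherInputStream) in AES/CTR mode. It is not Chimera's API: the class name LayeredCryptoStreams and its helper methods are hypothetical and exist only for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical illustration class; plain JDK crypto, not the Chimera API.
public class LayeredCryptoStreams {

    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    // Builds a JDK Cipher for the given mode, raw AES key and 16-byte IV.
    static Cipher newCipher(int mode, byte[] key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Encrypts by layering a CipherOutputStream over an in-memory sink.
    static byte[] encrypt(byte[] plain, byte[] key, byte[] iv) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out =
                 new CipherOutputStream(sink, newCipher(Cipher.ENCRYPT_MODE, key, iv))) {
            out.write(plain);
        }
        return sink.toByteArray();
    }

    // Decrypts by layering a CipherInputStream over the encrypted bytes.
    static byte[] decrypt(byte[] encrypted, byte[] key, byte[] iv) throws Exception {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new CipherInputStream(
                 new ByteArrayInputStream(encrypted),
                 newCipher(Cipher.DECRYPT_MODE, key, iv))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16]; // 128-bit AES key
        byte[] iv = new byte[16];  // initial counter block for CTR mode
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] message = "shuffle block".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = encrypt(message, key, iv);
        byte[] decrypted = decrypt(encrypted, key, iv);
        System.out.println("round trip ok: " + Arrays.equals(message, decrypted));
    }
}
```

Even this small round trip needs explicit cipher setup, stream wrapping and a copy loop on each side, which is the kind of repeated code a shared stream-level library is meant to absorb. A key/IV pair must never be reused across streams in CTR mode.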



Chimera was originally based on the Hadoop crypto code, but it was improved and generalized considerably to support a wider scope of data encryption needs for more components in the community. The encryption-related code in Hadoop was developed a year ago and has been running well so far, so we feel that this part of the code is already stable enough.



So we propose to contribute the Chimera (optimized encryption library) code to Apache Commons, and we would like it to have an independent release cycle like any other module in Apache Commons. This module basically provides Java-based interfaces for encryption-based IO, and it will include native AES-NI encryption integration code.



We have already discussed this proposal on the Apache Hadoop dev lists, and the conclusion of the discussion was in favor of contributing this module to Apache Commons.



We need your help and support in adopting this code as an Apache Commons sub-module and in establishing its own development community (of course, we can discuss these factors further in this thread). Hadoop and Spark will be the two visible projects using it. We do expect more projects to use it.



Once the Apache Commons PMC agrees to place this module under Commons, I will work on gathering interested developers to establish the Chimera development community as part of the next steps. Please help with the process.



Regards,

Uma (An Apache Hadoop PMC member)

Re: Chimera as new component in Apache Commons

Posted by "Gangumalla, Uma" <um...@intel.com>.
Hi Benedikt,

Thanks for offering the great help.

Benedikt Wrote:

I'm no crypto expert but I can help with the Apache Commons related tasks,
like moving the code over to Apache Commons, setting up the maven build,
publishing the project website etc.
[UMA] Thank you. We would love to work with you on the further steps,
based on your guidance on these aspects.

Benedikt Wrote:

I'd love to see you moving Chimera here.

[UMA] Thanks for the acceptance. :-)

Benedikt Wrote:

1. There are no Apache <Component> sub communities. There is only the
Apache Commons community. This means there won't be a separate mailing list
for the new component. It is important to understand that we are a
community maintaining a number of components, not a group of sub
communities.
[UMA] Got it. Thanks for the information.


Benedikt Wrote:

2. The Apache Commons versioning guidelines are very restrictive [1]. We
put great effort into binary compatibility. This is because we expect our
components to be reused by a lot of other projects and we try our best to
avoid jar hell. Often this means that greater refactorings simply cannot
be implemented since they would break BC. This is usually not a problem for
the major components. But it may be a problem for a young component.
[UMA] Right. 


Benedikt Wrote:

3. Apache Commons components usually have a (boring) descriptive name,
rather than a fancy one. This is the reason why we renamed Apache Commons
Sanselan to Apache Commons Imaging. People should be able to tell just by
looking at the name of a component what that component is about. IMHO
Chimera falls into the fancy name category, so maybe we will discuss that
name.
[UMA] OK. The name can be self-descriptive. No issues with this.

Regards,
Uma (An Apache Hadoop PMC member)





On 2/16/16, 11:56 PM, "Benedikt Ritter" <br...@apache.org> wrote:

>Hello Uma,
>
>welcome to the Apache Commons dev list. It's great to see that two
>projects get together to share code via Apache Commons.
>
>2016-02-16 22:36 GMT+01:00 Gangumalla, Uma <um...@intel.com>:
>
>> Hi Devs,
>>
>>
>>
>> Recently we worked with Spark community to implement the shuffle
>> encryption. While implementing that, we realized some/most of the code
>> in Apache Hadoop encryption code and this implementation code have to
>> be duplicated. This leads to an idea to create separate reusable
>> library, named it as Chimera (https://github.com/intel-hadoop/chimera).
>> It is an optimized cryptographic library. It provides Java API for both
>> cipher level and Java stream level to help developers implement high
>> performance AES encryption/decryption with the minimum code and effort.
>>
>>
>>
>> We know that Java has Cipher implementations, but why we need this
>> optimized cryptographic library:
>>
>> 1. Performance is very critical for encryption and decryption. The JDK
>> Cipher implementation of AES are not yet optimized with the modern
>> hardware. For example, the optimized implementation is 17x+ faster than
>> JDK6 implementations for some modes such as CBC decryption, CTR and GCM.
>> Even some optimizations has included JDK7 or JDK8, there are still 5x
>> to 6x gap with the most optimized implementations.
>>
>
>That sounds pretty useful! :-)
>
>
>>
>> 2. Java Stream based API on cryptographic data stream. Cipher API is
>> powerful but a lot of code needs to be written for layered stream
>> processing applications. The design pattern is very common in modern
>> applications such as Hadoop or Spark.
>>
>>
>>
>> Chimera was originally based Hadoop crypto code but was improved and
>> generalized a lot for supporting wider scope of data encryption needs
>> for more components in the community. The encryption related code in
>> Hadoop was developed a year and so far it is running well. So we feel
>> that code part of stable enough already.
>>
>>
>>
>> So, we propose to contribute this Chimera (optimized encryption library)
>> code to Apache Commons and we wanted to have independent release cycles
>> for this module like any other modules in Apache Commons. This module is
>> basically provides Java based interfaces for encryption based IO and It
>> will have native based AES-NI encryption integration code.
>>
>>
>>
>> We already discussed about this proposal in Apache Hadoop dev lists and
>> the discussion conclusion was positive to contribute this module to
>> Apache Commons.
>>
>>
>>
>> We need your help and support in adopting this code to make as Apache
>> Commons sub module and establish for making it to have its own
>> development community (of course we can discuss more about this factors
>> in this thread). And Hadoop and Spark will be the two visible projects
>> to use it. We do expect there will be more projects using it.
>>
>>
>I'm no crypto expert but I can help with the Apache Commons related tasks,
>like moving the code over to Apache Commons, setting up the maven build,
>publishing the project website etc.
>
>
>>
>>
>> Once Apache Commons PMC agreed to place this module under Commons, I
>> will work on getting the interested developers etc for establishing
>> Chimera development community as part of next steps. Please help on the
>> process.
>>
>
>I'd love to see you moving Chimera here. However, there are a few things
>I'd like to make you aware of:
>
>1. There are no Apache <Component> sub communities. There is only the
>Apache Commons community. This means there won't be a separate mailing
>list for the new component. It is important to understand that we are a
>community maintaining a number of components, not a group of sub
>communities.
>2. The Apache Commons versioning guidelines are very restrictive [1]. We
>put great effort into binary compatibility. This is because we expect our
>components to be reused by a lot of other projects and we try our best to
>avoid jar hell. Often this means that greater refactorings simply cannot
>be implemented since they would break BC. This is usually not a problem
>for the major components. But it may be a problem for a young component.
>3. Apache Commons components usually have a (boring) descriptive name,
>rather than a fancy one. This is the reason why we renamed Apache Commons
>Sanselan to Apache Commons Imaging. People should be able to tell just by
>looking at the name of a component what that component is about. IMHO
>Chimera falls into the fancy name category, so maybe we will discuss that
>name.
>
>I hope these are no blockers for you.
>
>Thank you for your interest and your effort in bringing new code to Apache
>Commons!
>
>Best regards,
>Benedikt
>
>[1] https://commons.apache.org/releases/versioning.html
>
>
>>
>>
>>
>> Regards,
>>
>> Uma (An Apache Hadoop PMC member)
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
>-- 
>http://home.apache.org/~britter/
>http://twitter.com/BenediktRitter
>http://github.com/britter




RE: Chimera as new component in Apache Commons

Posted by "Chen, Haifeng" <ha...@intel.com>.
Thanks Benedikt for your support!

>> I'm no crypto expert but I can help with the Apache Commons related tasks, like moving the code over to Apache Commons, setting up the maven build, publishing the project website etc.
This is really great help.

>> 1. There are no Apache <Component> sub communities. There is only the Apache Commons community. This means there won't be a separate mailing list for the new component. It is important to understand that we are a community maintaining a number of components, not a group of sub communities.
Makes sense. Sub-modules share the same Commons community. This doesn't conflict with the fact that each sub-module has its own release cycle and versioning.

>> 2. The Apache Commons versioning guidelines are very restrictive [1]. We put great effort into binary compatibility. This is because we expect our components to be reused by a lot of other projects and we try our best to avoid jar hell. Often this means that greater refactorings simply cannot be implemented since they would break BC. This is usually not a problem for the major components. But it may be a problem for a young component.
Yes, keeping a stable API is very important for shared libraries. 

>> 3. Apache Commons components usually have a (boring) descriptive name, rather than a fancy one. This is the reason why we renamed Apache Commons Sanselan to Apache Commons Imaging. People should be able to tell just by looking at the name of a component what that component is about. IMHO Chimera falls into the fancy name category, so maybe we will discuss that name.
There is no problem with renaming Chimera to something like "crypto".

Thanks,
Haifeng


-----Original Message-----
From: Benedikt Ritter [mailto:britter@apache.org] 
Sent: Wednesday, February 17, 2016 3:57 PM
To: Commons Developers List <de...@commons.apache.org>
Cc: common-dev@hadoop.apache.org
Subject: Re: Chimera as new component in Apache Commons

Hello Uma,

welcome to the Apache Commons dev list. It's great to see that two projects get together to share code via Apache Commons.

2016-02-16 22:36 GMT+01:00 Gangumalla, Uma <um...@intel.com>:

> Hi Devs,
>
>
>
> Recently we worked with Spark community to implement the shuffle 
> encryption. While implementing that, we realized some/most of the code 
> in Apache Hadoop encryption code and this implementation code have to 
> be duplicated. This leads to an idea to create separate reusable 
> library, named it as Chimera 
> (https://github.com/intel-hadoop/chimera). It is an optimized 
> cryptographic library. It provides Java API for both cipher level and 
> Java stream level to help developers implement high performance AES encryption/decryption with the minimum code and effort.
>
>
>
> We know that Java has Cipher implementations, but why we need this 
> optimized cryptographic library:
>
> 1. Performance is very critical for encryption and decryption. The JDK 
> Cipher implementation of AES are not yet optimized with the modern 
> hardware. For example, the optimized implementation is 17x+ faster 
> than
> JDK6 implementations for some modes such as CBC decryption, CTR and GCM.
> Even some optimizations has included JDK7 or JDK8, there are still 5x 
> to 6x gap with the most optimized implementations.
>

That sounds pretty useful! :-)


>
> 2. Java Stream based API on cryptographic data stream. Cipher API is 
> powerful but a lot of code needs to be written for layered stream 
> processing applications. The design pattern is very common in modern 
> applications such as Hadoop or Spark.
>
>
>
> Chimera was originally based Hadoop crypto code but was improved and 
> generalized a lot for supporting wider scope of data encryption needs 
> for more components in the community. The encryption related code in 
> Hadoop was developed a year and so far it is running well. So we feel 
> that code part of stable enough already.
>
>
>
> So, we propose to contribute this Chimera (optimized encryption 
> library) code to Apache Commons and we wanted to have independent 
> release cycles for this module like any other modules in Apache 
> Commons. This module is basically provides Java based interfaces for 
> encryption based IO and It will have native based AES-NI encryption integration code.
>
>
>
> We already discussed about this proposal in Apache Hadoop dev lists 
> and the discussion conclusion was positive to contribute this module 
> to Apache Commons.
>
>
>
> We need your help and support in adopting this code to make as Apache 
> Commons sub module and establish for making it to have its own 
> development community (of course we can discuss more about this 
> factors in this thread). And Hadoop and Spark will be the two visible projects to use it.
> We do expect there will be more projects using it.
>
>
I'm no crypto expert but I can help with the Apache Commons related tasks, like moving the code over to Apache Commons, setting up the maven build, publishing the project website etc.


>
>
> Once Apache Commons PMC agreed to place this module under Commons, I 
> will work on getting the interested developers etc for establishing 
> Chimera development community as part of next steps. Please help on the process.
>

I'd love to see you moving Chimera here. However, there are a few things I'd like to make you aware of:

1. There are no Apache <Component> sub communities. There is only the Apache Commons community. This means there won't be a separate mailing list for the new component. It is important to understand that we are a community maintaining a number of components, not a group of sub communities.
2. The Apache Commons versioning guidelines are very restrictive [1]. We put great effort into binary compatibility. This is because we expect our components to be reused by a lot of other projects and we try our best to avoid jar hell. Often this means that greater refactorings simply cannot be implemented since they would break BC. This is usually not a problem for the major components. But it may be a problem for a young component.
3. Apache Commons components usually have a (boring) descriptive name, rather than a fancy one. This is the reason why we renamed Apache Commons Sanselan to Apache Commons Imaging. People should be able to tell just by looking at the name of a component what that component is about. IMHO Chimera falls into the fancy name category, so maybe we will discuss that name.

I hope these are no blockers for you.

Thank you for your interest and your effort in bringing new code to Apache Commons!

Best regards,
Benedikt

[1] https://commons.apache.org/releases/versioning.html


>
>
>
> Regards,
>
> Uma (An Apache Hadoop PMC member)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


--
http://home.apache.org/~britter/
http://twitter.com/BenediktRitter
http://github.com/britter

Re: Chimera as new component in Apache Commons

Posted by Benedikt Ritter <br...@apache.org>.
Hello Uma,

welcome to the Apache Commons dev list. It's great to see that two projects
get together to share code via Apache Commons.

2016-02-16 22:36 GMT+01:00 Gangumalla, Uma <um...@intel.com>:

> Hi Devs,
>
>
>
> Recently we worked with Spark community to implement the shuffle
> encryption. While implementing that, we realized some/most of the code in
> Apache Hadoop encryption code and this implementation code have to be
> duplicated. This leads to an idea to create separate reusable library,
> named it as Chimera (https://github.com/intel-hadoop/chimera). It is an
> optimized cryptographic library. It provides Java API for both cipher level
> and Java stream level to help developers implement high performance AES
> encryption/decryption with the minimum code and effort.
>
>
>
> We know that Java has Cipher implementations, but why we need this
> optimized cryptographic library:
>
> 1. Performance is very critical for encryption and decryption. The JDK
> Cipher implementation of AES are not yet optimized with the modern
> hardware. For example, the optimized implementation is 17x+ faster than
> JDK6 implementations for some modes such as CBC decryption, CTR and GCM.
> Even some optimizations has included JDK7 or JDK8, there are still 5x to 6x
> gap with the most optimized implementations.
>

That sounds pretty useful! :-)


>
> 2. Java Stream based API on cryptographic data stream. Cipher API is
> powerful but a lot of code needs to be written for layered stream
> processing applications. The design pattern is very common in modern
> applications such as Hadoop or Spark.
>
>
>
> Chimera was originally based Hadoop crypto code but was improved and
> generalized a lot for supporting wider scope of data encryption needs for
> more components in the community. The encryption related code in Hadoop was
> developed a year and so far it is running well. So we feel that code part
> of stable enough already.
>
>
>
> So, we propose to contribute this Chimera (optimized encryption library)
> code to Apache Commons and we wanted to have independent release cycles for
> this module like any other modules in Apache Commons. This module is
> basically provides Java based interfaces for encryption based IO and It
> will have native based AES-NI encryption integration code.
>
>
>
> We already discussed about this proposal in Apache Hadoop dev lists and
> the discussion conclusion was positive to contribute this module to Apache
> Commons.
>
>
>
> We need your help and support in adopting this code to make as Apache
> Commons sub module and establish for making it to have its own development
> community (of course we can discuss more about this factors in this
> thread). And Hadoop and Spark will be the two visible projects to use it.
> We do expect there will be more projects using it.
>
>
I'm no crypto expert, but I can help with the Apache Commons related tasks,
like moving the code over to Apache Commons, setting up the Maven build,
publishing the project website, etc.


>
>
> Once Apache Commons PMC agreed to place this module under Commons, I will
> work on getting the interested developers etc for establishing Chimera
> development community as part of next steps. Please help on the process.
>

I'd love to see you move Chimera here. However, there are a few things I'd
like to make you aware of:

1. There are no Apache <Component> sub-communities. There is only the
Apache Commons community. This means there won't be a separate mailing list
for the new component. It is important to understand that we are one
community maintaining a number of components, not a group of
sub-communities.
2. The Apache Commons versioning guidelines are very restrictive [1]. We
put great effort into binary compatibility (BC), because we expect our
components to be reused by a lot of other projects and we try our best to
avoid jar hell. Often this means that larger refactorings simply cannot
be implemented, since they would break BC. This is usually not a problem
for the major components, but it may be a problem for a young component.
3. Apache Commons components usually have a (boring) descriptive name
rather than a fancy one. This is the reason why we renamed Apache Commons
Sanselan to Apache Commons Imaging. People should be able to tell what a
component is about just by looking at its name. IMHO Chimera falls into
the fancy name category, so we may need to discuss that name.

I hope none of these are blockers for you.

Thank you for your interest and your effort in bringing new code to Apache
Commons!

Best regards,
Benedikt

[1] https://commons.apache.org/releases/versioning.html


>
>
>
> Regards,
>
> Uma (An Apache Hadoop PMC member)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
http://home.apache.org/~britter/
http://twitter.com/BenediktRitter
http://github.com/britter
