Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/07/20 02:36:09 UTC

[GitHub] [orc] belugabehr opened a new pull request #755: Orc 853

belugabehr opened a new pull request #755:
URL: https://github.com/apache/orc/pull/755


   
   
   ### What changes were proposed in this pull request?
   Use the optimized Apache Avro implementation for writing a `double` to a stream.  Remove the extra helper method that supported this functionality and condense the logic into a single method.
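   
   For readers unfamiliar with the technique, here is a minimal sketch of what "writing a `double` to a stream" in this style can look like. It is an illustration under stated assumptions, not the code from this pull request: the class name `DoubleWriterSketch`, the `encode` helper, the little-endian 8-byte layout, and the interleaved store order are all assumptions made for the example, and the real Avro and ORC sources may differ in detail.
   
   ```java
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.OutputStream;
   import java.io.UncheckedIOException;
   
   // Hypothetical illustration; not the actual ORC or Avro class.
   final class DoubleWriterSketch {
       // Scratch buffer reused across calls so each double costs one bulk write.
       // Reusing it makes an instance unsafe to share between threads.
       private final byte[] writeBuffer = new byte[8];
   
       // Writes the IEEE 754 bits of value as 8 little-endian bytes (assumed layout).
       void writeDouble(OutputStream output, double value) throws IOException {
           final long bits = Double.doubleToLongBits(value);
           final int first = (int) bits;            // low 32 bits
           final int second = (int) (bits >>> 32);  // high 32 bits
           // The store order below is interleaved on purpose to mirror the
           // "strange" shape discussed in this thread; any order yields the same bytes.
           writeBuffer[0] = (byte) first;
           writeBuffer[4] = (byte) second;
           writeBuffer[5] = (byte) (second >>> 8);
           writeBuffer[1] = (byte) (first >>> 8);
           writeBuffer[2] = (byte) (first >>> 16);
           writeBuffer[6] = (byte) (second >>> 16);
           writeBuffer[7] = (byte) (second >>> 24);
           writeBuffer[3] = (byte) (first >>> 24);
           output.write(writeBuffer, 0, 8);
       }
   
       // Convenience for experiments: encode one value into a fresh 8-byte array.
       static byte[] encode(double value) {
           ByteArrayOutputStream out = new ByteArrayOutputStream(8);
           try {
               new DoubleWriterSketch().writeDouble(out, value);
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }
   }
   ```
   
   Under these assumptions, `DoubleWriterSketch.encode(1.0)` returns eight bytes whose last two are `0xF0` and `0x3F`, since the bit pattern of `1.0` is `0x3FF0000000000000` and the low byte is written first.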
   
   
   ### Why are the changes needed?
   Performance. The change had a measurable impact on the Driver performance harness when writing `double` values.
   
   
   ### How was this patch tested?
   No change in functionality. Use existing unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] belugabehr edited a comment on pull request #755: ORC-853: Optimize writeDouble Implementation

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #755:
URL: https://github.com/apache/orc/pull/755#issuecomment-885638318


   @pgaref Thanks for the review.  I'll add additional comments.
   
   Yes. I only posted the results of one run, but it was representative of the larger series of tests.  Anyway, the reason I even thought to do this is that I have worked a bit with Avro in the past and this strange implementation really stuck with me.  Some time ago, I found it super interesting and tried different ways to outperform it, but I was never able to.  Now, TBH there is some amount of YMMV here, as this may be sensitive to CPU architecture, but the person who implemented this in the first place found it was optimal, and I could only confirm that on my own setup, so there is some merit to it.
   
   I also assume that condensing this code into a single method avoids a jump, which may help too when processing millions of rows.
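   
   To make the "single method" point concrete, the sketch below contrasts the two source shapes being discussed. It is a hypothetical illustration: the class `WriteDoubleShapes` and its method names are invented for this example, and the bodies are not copied from ORC. Whether the extra call costs anything at run time is also an assumption to verify, since a JIT compiler may inline a small helper on its own; a benchmark tool such as JMH is the usual way to check.
   
   ```java
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.OutputStream;
   import java.io.UncheckedIOException;
   import java.util.Arrays;
   
   // Hypothetical illustration of the two code shapes; not the ORC source.
   final class WriteDoubleShapes {
       private final byte[] buf = new byte[8];
   
       // Two-method shape: convert, then delegate to a generic little-endian writer.
       void viaHelper(OutputStream out, double v) throws IOException {
           writeLongLE(out, Double.doubleToLongBits(v));
       }
   
       private void writeLongLE(OutputStream out, long bits) throws IOException {
           for (int i = 0; i < 8; i++) {
               buf[i] = (byte) (bits >>> (8 * i));
           }
           out.write(buf, 0, 8);
       }
   
       // Single-method shape: same bytes, with no intermediate call in the source.
       void condensed(OutputStream out, double v) throws IOException {
           final long bits = Double.doubleToLongBits(v);
           buf[0] = (byte) bits;
           buf[1] = (byte) (bits >>> 8);
           buf[2] = (byte) (bits >>> 16);
           buf[3] = (byte) (bits >>> 24);
           buf[4] = (byte) (bits >>> 32);
           buf[5] = (byte) (bits >>> 40);
           buf[6] = (byte) (bits >>> 48);
           buf[7] = (byte) (bits >>> 56);
           out.write(buf, 0, 8);
       }
   
       // Returns true when both shapes emit identical bytes for every sample.
       static boolean sameBytes(double... samples) {
           WriteDoubleShapes w = new WriteDoubleShapes();
           ByteArrayOutputStream a = new ByteArrayOutputStream();
           ByteArrayOutputStream b = new ByteArrayOutputStream();
           try {
               for (double d : samples) {
                   w.viaHelper(a, d);
                   w.condensed(b, d);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return Arrays.equals(a.toByteArray(), b.toByteArray());
       }
   }
   ```
   
   A check like `WriteDoubleShapes.sameBytes(0.0, 1.0, -2.5, Double.NaN)` confirms the refactoring is behavior-preserving before any timing comparison is attempted.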



[GitHub] [orc] pgaref closed pull request #755: ORC-853: Optimize writeDouble Implementation

Posted by GitBox <gi...@apache.org>.
pgaref closed pull request #755:
URL: https://github.com/apache/orc/pull/755


   



[GitHub] [orc] pgaref commented on pull request #755: ORC-853: Optimize writeDouble Implementation

Posted by GitBox <gi...@apache.org>.
pgaref commented on pull request #755:
URL: https://github.com/apache/orc/pull/755#issuecomment-885652359


   Merged, Thanks @belugabehr !

