You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/20 09:23:44 UTC

[GitHub] [spark] beliefer opened a new pull request #24647: [SPARK-27776][SQL]Avoid duplicate Java reflection in DataSource.

beliefer opened a new pull request #24647: [SPARK-27776][SQL]Avoid duplicate Java reflection in DataSource.
URL: https://github.com/apache/spark/pull/24647
 
 
   ## What changes were proposed in this pull request?
   
   I checked the code of 
   `org.apache.spark.sql.execution.datasources.DataSource`
   , there exists duplicate Java reflection.
   `sourceSchema`,`createSource`,`createSink`,`resolveRelation`,`writeAndRead`, all the methods call the `providingClass.getConstructor().newInstance()`.
   The instance of `providingClass` is stateless, such as:
   `KafkaSourceProvider`
   `RateSourceProvider`
   `TextSocketSourceProvider`
   `JdbcRelationProvider`
   `ConsoleSinkProvider`
   
   AFAIK, Java reflection will result in significant performance issue.
   The oracle website [https://docs.oracle.com/javase/tutorial/reflect/index.html](https://docs.oracle.com/javase/tutorial/reflect/index.html) contains some performance description about Java reflection:
   
   ```
   Performance Overhead
   Because reflection involves types that are dynamically resolved, certain Java virtual machine optimizations can not be performed. Consequently, reflective operations have slower performance than their non-reflective counterparts, and should be avoided in sections of code which are called frequently in performance-sensitive applications.
   ```
   
   I have found some performance cost test of Java reflection as follows:
   [https://blog.frankel.ch/performance-cost-of-reflection/](https://blog.frankel.ch/performance-cost-of-reflection/) contains performance cost test.
   [https://stackoverflow.com/questions/435553/java-reflection-performance](https://stackoverflow.com/questions/435553/java-reflection-performance) has a discussion of java reflection.
   
   So I think should avoid it.
   
   ## How was this patch tested?
   
   Exists UT.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org