Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/21 05:43:09 UTC

[GitHub] [spark] JoshRosen opened a new pull request #24655: [SPARK-27786] Fix Sha1, Md5, and Base64 codegen when commons-codec is shaded

URL: https://github.com/apache/spark/pull/24655
 
 
   ## What changes were proposed in this pull request?
   
   When running a custom build of Spark which shades `commons-codec`, the `Sha1` expression generates code which fails to compile:
   
   ```
   org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 47, Column 93: A method named "sha1Hex" is not declared in any enclosing class nor any supertype, nor through a static import
   ```
   
   This is caused by an interaction between Spark's code generator and the shading: the current codegen template includes the string `org.apache.commons.codec.digest.DigestUtils.sha1Hex` as part of a larger string literal, which prevents JarJarLinks from replacing the class name with the shaded class's name. As a result, the generated code still references the original unshaded class name, which triggers an error when the original unshaded dependency isn't on the classpath.
   
   This problem impacts the `Sha1`, `Md5`, and `Base64` expressions.
   
   To fix this problem and allow for proper shading, this PR updates the codegen templates to replace the hardcoded class names with `${classOf[<name>].getName}` calls.
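   
   As a rough illustration of the difference (not the actual Spark code), the sketch below builds the same generated-code string in two ways. `java.util.Base64` stands in for the commons-codec classes so that the snippet runs without extra dependencies, and `ShadingSafeTemplate`, `hardcodedTemplate`, and `classOfTemplate` are hypothetical names used only for this example.
   
   ```scala
   // Sketch only: java.util.Base64 stands in for the commons-codec classes,
   // and both template helpers are hypothetical, not Spark's code.
   object ShadingSafeTemplate {
     // Before: the class name is buried inside a larger string literal.
     // Per the description above, the shading tool does not rewrite it,
     // so the generated source keeps the original, unshaded name.
     def hardcodedTemplate(arg: String): String =
       s"java.util.Base64.getEncoder().encodeToString($arg)"
   
     // After: classOf[...] is a real class reference that shading can
     // relocate; getName then reports whatever name the class has at runtime.
     def classOfTemplate(arg: String): String =
       s"${classOf[java.util.Base64].getName}.getEncoder().encodeToString($arg)"
   
     def main(args: Array[String]): Unit = {
       println(hardcodedTemplate("bytes"))
       println(classOfTemplate("bytes"))
     }
   }
   ```
   
   In an unshaded build the two helpers produce identical strings; they would only be expected to differ once the referenced class has been relocated.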
   
   ## How was this patch tested?
   
   Existing tests.
   
   To ensure that I found all occurrences of this problem, I used IntelliJ's "Find in Path" to search for lines matching the regex `^(?!import|package).*(org|com|net|io)\.(?!apache\.spark)` and then filtered matches to inspect only non-test "Usage in string constants" cases.
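   
   For readers who want to see what that regex does and does not flag, here is a minimal sketch that applies it to a few invented sample lines. `HardcodedClassNameSearch` and `flags` are hypothetical names, and this is not the IntelliJ search itself, which also filters by usage type.
   
   ```scala
   // Sketch only: the sample lines are invented for illustration.
   object HardcodedClassNameSearch {
     // The regex quoted above, unchanged.
     private val pattern =
       """^(?!import|package).*(org|com|net|io)\.(?!apache\.spark)""".r
   
     // True when the line contains a match, i.e. the search would list it.
     def flags(line: String): Boolean = pattern.findFirstIn(line).isDefined
   
     def main(args: Array[String]): Unit = {
       val samples = Seq(
         "import org.apache.commons.codec.binary.Base64",
         "s\"org.apache.commons.codec.digest.DigestUtils.sha1Hex(c)\"",
         "val name = \"org.apache.spark.sql.Row\""
       )
       samples.foreach(line => println(s"${flags(line)}  $line"))
     }
   }
   ```
   
   Only the second sample is flagged: the first begins with `import`, and the third names a class under `org.apache.spark`, which the negative lookahead excludes.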
