Posted to dev@ranger.apache.org by "Lei Yao (Jira)" <ji...@apache.org> on 2023/04/13 10:39:00 UTC

[jira] [Updated] (RANGER-4166) Ranger2.x may build failed

     [ https://issues.apache.org/jira/browse/RANGER-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Yao updated RANGER-4166:
----------------------------
    Summary: Ranger2.x may build failed  (was: ranger2  build failed)

> Ranger2.x may build failed
> --------------------------
>
>                 Key: RANGER-4166
>                 URL: https://issues.apache.org/jira/browse/RANGER-4166
>             Project: Ranger
>          Issue Type: Bug
>          Components: Ranger
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: caijialiang
>            Assignee: caijialiang
>            Priority: Major
>             Fix For: 3.0.0, 2.4.1
>
>         Attachments: 0001-RANGER-4166-fix-the-old-version-of-the-assembly-plug.patch, image-2023-04-01-18-31-58-091.png, image-2023-04-01-18-33-29-756.png, image-2023-04-04-10-28-23-029.png, image-2023-04-04-10-29-20-811.png, image-2023-04-04-10-29-26-077.png, image-2023-04-04-10-29-41-802.png, image-2023-04-04-10-29-56-998.png, image-2023-04-04-10-30-06-393.png, image-2023-04-04-10-30-48-140.png, image-2023-04-04-10-31-13-216.png, image-2023-04-04-10-31-37-426.png, image-2023-04-04-10-31-59-019.png, image-2023-04-04-10-32-09-582.png, image-2023-04-04-10-32-19-271.png, image-2023-04-04-11-05-05-371.png
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> summary: Here we mainly discuss how to diagnose this build error and how to reproduce it reliably.
> environment:
> [root@gs-server-12223 ~]# locale
> LANG=zh_CN.UTF-8
> LC_CTYPE="zh_CN.UTF-8"
> LC_NUMERIC="zh_CN.UTF-8"
> LC_TIME="zh_CN.UTF-8"
> LC_COLLATE="zh_CN.UTF-8"
> LC_MONETARY="zh_CN.UTF-8"
> LC_MESSAGES="zh_CN.UTF-8"
> LC_PAPER="zh_CN.UTF-8"
> LC_NAME="zh_CN.UTF-8"
> LC_ADDRESS="zh_CN.UTF-8"
> LC_TELEPHONE="zh_CN.UTF-8"
> LC_MEASUREMENT="zh_CN.UTF-8"
> LC_IDENTIFICATION="zh_CN.UTF-8"
> LC_ALL=zh_CN.UTF-8
> lsb_release -a
> LSB Version: :core-4.1-amd64:core-4.1-noarch
> Distributor ID: CentOS
> Description: CentOS Linux release 7.4.1708 (Core)
> Release: 7.4.1708
> Codename: Core
> uname -a
> Linux gs-server-12223 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> Maven version: 3.6.3
>  
> description:
> There are compilation errors when building Ranger 2.3 and Ranger 2.4 in a Linux environment.
> Compilation command:
> mvn -Pall clean compile package install -Dmaven.test.skip=true -DskipTests=true -Dfindbugs.skip=true -Dcheckstyle.skip=true -Djacoco.skip=true -Dpmd.skip=true -Drat.skip=true -Dspotbugs.skip=true -Dhadoop.version=3.3.4 -Dhbase.version=2.4.13 -Dhive.version=3.1.3 -Dkafka.version=2.8.1 -Dsolr.version=8.11.2 -Dzookeeper.version=3.6.4
> To get ranger2.3 to compile, the following two patches were applied first:
> git apply ../ranger/patch1-RANGER-3818.diff
> git apply ../patch0-RANGER-3373.diff
> *The compilation of ranger 2.3 fails with the following error:*
> {code:java}
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.6:single (default) on project ranger-distro: Failed to create assembly: Error creating assembly archive schema-registry-plugin: Problem creating jar: jar:file:/home/jialiang/prjs/ranger/distro/target/ranger-distro-2.3.0.jar!/META-INF/maven/org.apache.ranger/ranger-distro/pom.xml: JAR entry META-INF/maven/org.apache.ranger/ranger-distro/pom.xml not found in /home/jialiang/prjs/ranger/distro/target/ranger-distro-2.3.0.jar -> [Help 1] {code}
>  
> *No patches were applied to ranger2.4; its build fails with the following error:*
> {code:java}
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.6:single (default) on project ranger-distro: Failed to create assembly: Error creating assembly archive schema-registry-plugin: IOException when zipping rMETA-INF/maven/org.apache.ranger/ranger-distro/pom.properties: invalid code lengths set -> [Help 1]{code}
> The ranger2.4 error message suggests that the issue is related to encoding. Checking the file named in the error shows that it is ASCII, while this Linux environment defaults to UTF-8 (see the locale output above).
>  
> file ./distro/target/maven-archiver/pom.properties
> ./distro/target/maven-archiver/pom.properties: ASCII text
>  
> So this may well be an encoding problem. The error message also mentions "Error creating assembly archive". The Maven Assembly Plugin runs during Maven's package phase, after compilation, testing, and the other earlier steps have completed, and bundles the build artifacts into archive files for distribution.
> The error occurs while the Assembly Plugin is creating such a distributable archive (for example zip or tar.gz) from the build artifacts, so it comes down to how the archive tool used by the Maven Assembly Plugin handles encoding.
> Both ranger2.3 and ranger2.4 set <assembly.plugin.version>2.6</assembly.plugin.version>, so the code of that Assembly Plugin version is what needs to be investigated:
> [https://github.com/apache/maven-assembly-plugin]
> [https://github.com/apache/maven-assembly-plugin/blob/maven-assembly-plugin-2.6/pom.xml]
> From the pom file and the compression logic in the code, it can be concluded that the compression tool used is plexus-archiver, version 3.0.1.
> !image-2023-04-04-10-29-41-802.png!
> The release notes for plexus-archiver are here:
> [https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md]
> Searching the release notes for the keyword 'encod' shows that many encoding-related issues have been fixed since version 3.0, including:
>  * [Issue #37|https://github.com/codehaus-plexus/plexus-archiver/issues/37] - Deprecate Manifest(Reader) and update all related Implementation does not properly map characters to map and makes assumptions about character encoding which might lead to failures. Deprecate and rely on Java Manifest reader to do the right thing.
>  * [Issue #39|https://github.com/codehaus-plexus/plexus-archiver/issues/39] - Updated to stop falling back to the unicode path extra field policy NOT_ENCODEABLE. If a name is not encodeable in UTF-8, it also is not encodeable in the extra field. Updated to always add the Info-ZIP Unicode Path Extra Field when creating an archive using an encoding different from UTF-8 instead of only when a name is not encodeable. Additionally support that extra field when unarchiving.
>  * [Pull Request #73|https://github.com/codehaus-plexus/plexus-archiver/pull/73] - Symbolic links not properly encoded in ZIP archives
> Next, download the plexus-archiver source code and search it for the error message 'IOException when zipping':
> !image-2023-04-04-10-29-26-077.png!
> !image-2023-04-04-10-29-20-811.png!
> Reading the plexus-archiver code shows that an encoding must be set when it creates a jar file: a jar contains text files such as the manifest, which may include non-ASCII characters and must be encoded correctly to avoid problems.
> When plexus-archiver creates a tar.gz file, however, the setEncoding() method is not needed, because tar.gz files have no text encoding format; they are binary files that contain compressed data.
> This explains why only the schema-registry assembly in the distro packaging fails. Its descriptor is specified as follows:
> <descriptor>src/main/assembly/plugin-schema-registry.xml</descriptor>, and the format that descriptor specifies is jar.
> !image-2023-04-04-10-29-56-998.png!
> All the other descriptors in the assembly specify tar.gz as their format:
> !image-2023-04-04-10-30-06-393.png!
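> The comparison of descriptor formats can be sketched in shell. This is only an illustration: the function name list_assembly_formats is hypothetical, the default directory src/main/assembly is taken from the descriptor path above (relative to the distro module), and the sed pattern assumes each <format> element sits on a single line.

```shell
# Hypothetical helper: print "<descriptor file>: <format>" for every
# <format> element found in the assembly descriptors of a directory.
list_assembly_formats() {
  dir="${1:-src/main/assembly}"
  for f in "$dir"/*.xml; do
    [ -f "$f" ] || continue   # skip the unexpanded glob when nothing matches
    # Extract the text between <format> and </format> on each matching line.
    sed -n 's|.*<format>\([^<]*\)</format>.*|\1|p' "$f" |
      while IFS= read -r fmt; do
        printf '%s: %s\n' "$(basename "$f")" "$fmt"
      done
  done
}

# Example (run from the distro module):
#   list_assembly_formats src/main/assembly
```

> Any descriptor reported with the format jar goes through the jar code path described above.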
> We can use the file command to check the encoding format of all files generated during the compilation of all modules:
> file ./xxx/target/maven-archiver/pom.properties
> All of them are encoded in ASCII. So every module produces ASCII files, yet only the assembly packaging of schema-registry, the one that uses the jar format, results in an error.
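> The per-module check can be automated. The sketch below is a stand-in for running file by hand, not the exact procedure used here: instead of calling file, it flags any pom.properties that contains bytes outside printable ASCII, using grep in the C locale. The function name check_pom_properties is hypothetical.

```shell
# Hypothetical helper: classify every target/maven-archiver/pom.properties
# under a source tree as ASCII or non-ASCII.
check_pom_properties() {
  root="${1:-.}"
  find "$root" -type f -path '*/target/maven-archiver/pom.properties' |
    while IFS= read -r f; do
      # In the C locale, any byte that is neither printable nor whitespace
      # (including every byte >= 0x80) matches this bracket expression.
      if LC_ALL=C grep -q '[^[:print:][:space:]]' "$f"; then
        echo "non-ASCII: $f"
      else
        echo "ASCII: $f"
      fi
    done
}

# Example (run from the Ranger source root after a build):
#   check_pom_properties .
```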
> Based on this reasoning, I changed the 'format' in plugin-schema-registry.xml from 'jar' to 'tar', and the build passed. Adding the line '<encoding>UTF-8</encoding>' to the distro's pom file also let the build pass.
> !image-2023-04-04-10-30-48-140.png!
> However, these are workarounds rather than fundamental fixes. The root cause is a bug in plexus-archiver that re-encodes content when packaging jars. This bug has been fixed in the latest version of plexus-archiver; our assembly plugin was using an older version, which caused the issue. Upgrading to the latest version therefore solves the problem.
> The pom file of the assembly plugin shows that maven-assembly-plugin 3.4.2 uses plexus-archiver 4.4. I therefore changed Ranger's <assembly.plugin.version>2.6</assembly.plugin.version> to <assembly.plugin.version>3.4.2</assembly.plugin.version>, and the build problem was solved as well.
> !image-2023-04-04-10-31-13-216.png!
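> The property change can also be scripted. A minimal sketch, assuming the property is declared on a single line exactly as quoted above; the function name set_assembly_plugin_version is hypothetical, and the pom path is whichever file declares the property in your checkout.

```shell
# Hypothetical helper: rewrite the assembly.plugin.version property in a pom.
# Usage: set_assembly_plugin_version <pom file> [version]   (default 3.4.2)
set_assembly_plugin_version() {
  pom="$1"
  version="${2:-3.4.2}"
  # Write to a temporary file and move it into place, which avoids relying
  # on the non-portable "sed -i".
  sed "s|<assembly.plugin.version>[^<]*</assembly.plugin.version>|<assembly.plugin.version>${version}</assembly.plugin.version>|" \
    "$pom" > "$pom.tmp" && mv "$pom.tmp" "$pom"
}
```

> Passing 2.6 as the second argument restores the original value, which is convenient when switching back and forth between the failing and the passing configuration.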
> I have tested both ranger 2.3 and ranger 2.4: either upgrading the assembly plugin or setting the encoding solves the build issue on Linux.
> https://issues.apache.org/jira/browse/RANGER-2721
> Therefore, that issue (RANGER-2721) does not fix the build error. It only avoids invoking the assembly command, so that the error is no longer triggered 100% of the time. In reality, even with assembly removed, many environments still hit the build error in the final step.
> How to reproduce and test reliably: we use ranger2.4 for testing because it does not require any patch. Before testing, delete the ranger directory installed in the local Maven M2 repository.
> ranger2.4
> 1. To reproduce the error, build with the following command without making any modifications.
> {code:java}
> [root@gs-server-12223 ranger]# git branch -vv
>   master               460a176 [origin/master] RANGER-4085: Search filter hint is not available where you search for policy
> * ranger-2.4           50ad9c1 [origin/ranger-2.4] RANGER-4155 : Structure of resource(UI) hierarchy in policy form not proper formatted for multiple values.
>   release-ranger-2.3.0 ce3339c RANGER-3730: use reload4j to replace log4j-1.2
> [root@gs-server-12223 ranger]# git diff
> [root@gs-server-12223 ranger]# rm -rf /home/jzhou/m2/org/apache/ranger
> [root@gs-server-12223 ranger]# /usr/local/src/apache-maven-3.6.3/bin/mvn -Pall clean compile package install assembly:single -Dmaven.test.skip=true -DskipTests=true -Dfindbugs.skip=true -Dcheckstyle.skip=true -Djacoco.skip=true -Dpmd.skip=true -Drat.skip=true -Dspotbugs.skip=true -Dhadoop.version=3.3.4 -Dhbase.version=2.4.13 -Dhive.version=3.1.3 -Dkafka.version=2.8.1 -Dsolr.version=8.11.2 -Dzookeeper.version=3.6.4
> {code}
>  
> !image-2023-04-04-10-31-37-426.png!
> 2. Upgrade assembly.plugin.version in the ranger project to 3.4.2 and build again with the above command. The error disappears and the build completes.
> !image-2023-04-04-10-31-59-019.png!
> !image-2023-04-04-10-32-09-582.png!
> 3. After reverting the change, the build fails again.
> !image-2023-04-04-10-32-19-271.png!
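> When repeating these three steps, pass or fail can be decided from the build log instead of by eye. A minimal sketch, assuming the mvn output was saved to a file (for example with tee); the function name has_assembly_error and the log file name are hypothetical, and the pattern is taken from the two error messages quoted earlier.

```shell
# Hypothetical helper: succeed (exit 0) if a saved Maven log contains the
# assembly failure reported in this issue, fail (exit 1) otherwise.
has_assembly_error() {
  grep -q 'maven-assembly-plugin:.*:single.*Failed to create assembly' "$1"
}

# Example:
#   mvn <same arguments as in step 1> 2>&1 | tee build.log
#   has_assembly_error build.log && echo "reproduced"
```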
> Regrettably, it is still unclear which line of code causes the build failure and under what circumstances, and why the issue cannot be reproduced reliably without adding assembly:single. Anyone interested is welcome to dig deeper; the answer may lie in the maven-assembly-plugin, plexus-archiver, and commons-compress libraries.
> [https://github.com/apache/maven-assembly-plugin]
> [https://github.com/codehaus-plexus/plexus-archiver/|https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md]
> [https://github.com/apache/commons-compress]
> [^0001-RANGER-4166-fix-the-old-version-of-the-assembly-plug.patch]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)