You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oozie.apache.org by "Dénes Bodó (Jira)" <ji...@apache.org> on 2022/02/07 13:40:00 UTC
[jira] [Commented] (OOZIE-3652) Oozie launcher should retry directory listing when NoSuchFileException occurs
[ https://issues.apache.org/jira/browse/OOZIE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488119#comment-17488119 ]
Dénes Bodó commented on OOZIE-3652:
-----------------------------------
Hey [~aajisaka]
Thanks for the patch.
One minor comment: unfortunately Oozie Yarn applications use stdout and stderr instead of a logger to print. Could you please follow this pattern in your change?
Otherwise looks good to me.
Thanks,
Denes
> Oozie launcher should retry directory listing when NoSuchFileException occurs
> -----------------------------------------------------------------------------
>
> Key: OOZIE-3652
> URL: https://issues.apache.org/jira/browse/OOZIE-3652
> Project: Oozie
> Issue Type: Bug
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Major
> Attachments: OOZIE-3652-001.patch, OOZIE-3652-002.patch
>
>
> Oozie launcher lists the CWD and prints the file name and directory name. When there is a java agent attach to Oozie launcher JVM, a temporary attach_pid<pid> file is created in the CWD by default. If the temporary file is deleted between directory listing and printing, NoSuchFileException occurs.
> {code}
> 2022-01-13 06:11:05,569 WARN Hive2ActionExecutor:523 - SERVER[<host>] USER[admin] GROUP[-] TOKEN[] APP[Hive Sample] JOB[0000061-220113033638267-oozie-oozi-W] ACTION[0000061-220113033638267-oozie-oozi-W@hive-dbea] Launcher exception: ./.attach_pid1517
> java.nio.file.NoSuchFileException: ./.attach_pid1517
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
> at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
> at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
> at java.nio.file.Files.readAttributes(Files.java:1737)
> at java.nio.file.FileTreeWalker.getAttributes(FileTreeWalker.java:225)
> at java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:276)
> at java.nio.file.FileTreeWalker.next(FileTreeWalker.java:372)
> at java.nio.file.Files.walkFileTree(Files.java:2706)
> at org.apache.oozie.action.hadoop.LocalFsOperations.printContentsOfDir(LocalFsOperations.java:59)
> at org.apache.oozie.action.hadoop.LauncherAM.printDebugInfo(LauncherAM.java:293)
> at org.apache.oozie.action.hadoop.LauncherAM.access$100(LauncherAM.java:55)
> at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:221)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
> at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
> {code}
> After the [log4j hotpatch provided by Correto|https://github.com/corretto/hotpatch-for-apache-log4j2] enabled, we faced this issue intermittently. Disabling the hotpatch will fix the issue, however, there maybe another agent and it would use the CWD by default, so I suppose it can be fixed in Oozie side.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)