You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by ekovacs <gi...@git.apache.org> on 2018/08/28 15:31:09 UTC

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

GitHub user ekovacs opened a pull request:

    https://github.com/apache/nifi/pull/2971

    NIFI-5557: handling expired ticket by rollback and penalization

    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
         in the commit message?
    
    - [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    - [x] Is your initial contribution a single, squashed commit?
    
    ### For code changes:
    - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [ ] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ekovacs/nifi NIFI-5557

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2971
    
----
commit 32c42bdd971937cb6904939452420c8754f64287
Author: Endre Zoltan Kovacs <ek...@...>
Date:   2018-08-28T08:47:59Z

    NIFI-5557: handling expired ticket by rollback and penalization

----


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by ekovacs <gi...@git.apache.org>.
Github user ekovacs commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r220236629
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -389,16 +380,24 @@ public void process(InputStream in) throws IOException {
                         session.transfer(putFlowFile, REL_SUCCESS);
     
                     } catch (final Throwable t) {
    -                    if (tempDotCopyFile != null) {
    -                        try {
    -                            hdfs.delete(tempDotCopyFile, false);
    -                        } catch (Exception e) {
    -                            getLogger().error("Unable to remove temporary file {} due to {}", new Object[]{tempDotCopyFile, e});
    -                        }
    +                   Optional<GSSException> causeOptional = findCause(t, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    --- End diff --
    
    yes. it makes sense.


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r214136451
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +268,13 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    +                        if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    --- End diff --
    
    @ekovacs I think we should be more selective in this check.  I don't think there's a better way to detect this error scenario than string matching at this point, but the exception stack should be inspected to see if you can find the GSSException as the root cause:
    `Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) `
    If you iterate through the causes when PutHDFS encounters an IOException, and see that GSSException, we can do a penalize with a session rollback.
    Otherwise, we'd want to pass the flowfile to the failure relationship.


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2971


---

[GitHub] nifi issue #2971: NIFI-5557: handling expired ticket by rollback and penaliz...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on the issue:

    https://github.com/apache/nifi/pull/2971
  
    +1, merged to master.  Had to resolve some conflicts in my commit for PutHDFSTest after rebasing to master.
    
    Thanks for your contribution, @ekovacs!


---

[GitHub] nifi issue #2971: NIFI-5557: handling expired ticket by rollback and penaliz...

Posted by joewitt <gi...@git.apache.org>.
Github user joewitt commented on the issue:

    https://github.com/apache/nifi/pull/2971
  
    @ekovacs thanks for your contribution.  Are we sure there is no other course for us than to parse the exception cause for this case?  If not, that is unfortunate but ok.  The other question then is if this is a particular case we need the HDFS client to tell us have we asked the Hadoop community to improve this or filed a JIRA to that effect?  If not we definitely should so that we can check this in the future in a way that is safer/more reliable.  Thanks!


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r219377632
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +271,16 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    --- End diff --
    
    Thanks for changing this to use GSSException.getMajor().   I haven't tested a ticket expiration occurring during the execution of a call to ugi.doAs (as opposed to the ticket having expired before ugi.doAs is invoked), but I think it would be a good idea to move this catch block to the top level try/catch block of the PrivelegedExceptionAction passed to ugi.doAs().


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r219380028
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +271,16 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    +                      boolean tgtExpired = hasCause(e, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    +                      if (tgtExpired) {
    +                        getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flow file %s",
    --- End diff --
    
    The exception be logged here, in addition to the flowfile UUID.  It might be useful to have the stack trace and exception class available in the log, and we shouldn't suppress/omit the actual GSSException from the logging.
    
    It might also be a good idea to log this at the "warn" level, so that the user can choose to not have these show as bulletins on the processor in the UI.  Since the flowfile is being rolled back, and hadoop-client will implicitly acquire a new ticket, I don't think this should show as an error.  @mcgilman, @bbende, do either of you have a preference here?


---

[GitHub] nifi issue #2971: NIFI-5557: handling expired ticket by rollback and penaliz...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on the issue:

    https://github.com/apache/nifi/pull/2971
  
    Reviewing...


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r220204503
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -389,16 +380,24 @@ public void process(InputStream in) throws IOException {
                         session.transfer(putFlowFile, REL_SUCCESS);
     
                     } catch (final Throwable t) {
    -                    if (tempDotCopyFile != null) {
    -                        try {
    -                            hdfs.delete(tempDotCopyFile, false);
    -                        } catch (Exception e) {
    -                            getLogger().error("Unable to remove temporary file {} due to {}", new Object[]{tempDotCopyFile, e});
    -                        }
    +                   Optional<GSSException> causeOptional = findCause(t, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    --- End diff --
    
    My previous comment was a bit ambiguous, I apologize.  Having this logic in this catch for all Throwables is fine, but you could move this bit into a separate catch(IOException e) block of this try/catch.


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r220206853
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -389,16 +380,24 @@ public void process(InputStream in) throws IOException {
                         session.transfer(putFlowFile, REL_SUCCESS);
     
                     } catch (final Throwable t) {
    -                    if (tempDotCopyFile != null) {
    -                        try {
    -                            hdfs.delete(tempDotCopyFile, false);
    -                        } catch (Exception e) {
    -                            getLogger().error("Unable to remove temporary file {} due to {}", new Object[]{tempDotCopyFile, e});
    -                        }
    +                   Optional<GSSException> causeOptional = findCause(t, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    +                    if (causeOptional.isPresent()) {
    +                      getLogger().warn(String.format("An error occured while connecting to HDFS. "
    --- End diff --
    
    This could be changed to:
    ```java
    getLogger().warn("An error occured while connecting to HDFS. Rolling back session, and penalizing flow file {}",
        new Object[] {putFlowFile.getAttribute(CoreAttributes.UUID.key()),
            causeOptional.get()});
    ```


---

[GitHub] nifi issue #2971: NIFI-5557: handling expired ticket by rollback and penaliz...

Posted by ekovacs <gi...@git.apache.org>.
Github user ekovacs commented on the issue:

    https://github.com/apache/nifi/pull/2971
  
    @jtstorck could you take a look?


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r216037639
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -269,13 +272,15 @@ public Object run() {
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
                         } catch (IOException e) {
    -                        if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    -                          getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flowfile %s",
    -                              flowFile.getAttribute(CoreAttributes.UUID.key())));
    -                          session.rollback(true);
    -                        } else {
    -                          throw e;
    -                        }
    +                      boolean tgtExpired = hasCause(e, GSSException.class, gsse -> "Failed to find any Kerberos tgt".equals(gsse.getMinorString()));
    --- End diff --
    
    @ekovacs After seeing the use of getMinorString here, I looked at GSSException, and it looks like there's some error codes that could be used to detect the actual cause, rather than string matching.  Are getMajor and getMinor returning ints when these exceptions happen?


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by ekovacs <gi...@git.apache.org>.
Github user ekovacs commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r220236670
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -389,16 +380,24 @@ public void process(InputStream in) throws IOException {
                         session.transfer(putFlowFile, REL_SUCCESS);
     
                     } catch (final Throwable t) {
    -                    if (tempDotCopyFile != null) {
    -                        try {
    -                            hdfs.delete(tempDotCopyFile, false);
    -                        } catch (Exception e) {
    -                            getLogger().error("Unable to remove temporary file {} due to {}", new Object[]{tempDotCopyFile, e});
    -                        }
    +                   Optional<GSSException> causeOptional = findCause(t, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    +                    if (causeOptional.isPresent()) {
    +                      getLogger().warn(String.format("An error occured while connecting to HDFS. "
    --- End diff --
    
    indeed.


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by jtstorck <gi...@git.apache.org>.
Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r216031690
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -269,13 +272,15 @@ public Object run() {
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
                         } catch (IOException e) {
    -                        if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    -                          getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flowfile %s",
    -                              flowFile.getAttribute(CoreAttributes.UUID.key())));
    -                          session.rollback(true);
    -                        } else {
    -                          throw e;
    -                        }
    +                      boolean tgtExpired = hasCause(e, GSSException.class, gsse -> "Failed to find any Kerberos tgt".equals(gsse.getMinorString()));
    +                      if (tgtExpired) {
    +                        getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flow file %s",
    +                            putFlowFile.getAttribute(CoreAttributes.UUID.key())));
    +                        session.rollback(true);
    +                      } else {
    +                        getLogger().error("Failed to access HDFS due to {}", new Object[]{e});
    +                        session.transfer(session.penalize(putFlowFile), REL_FAILURE);
    --- End diff --
    
    @ekovacs I don't think we need to penalize on the transfer to failure here.


---

[GitHub] nifi pull request #2971: NIFI-5557: handling expired ticket by rollback and ...

Posted by ekovacs <gi...@git.apache.org>.
Github user ekovacs commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r214677358
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +268,13 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    +                        if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    --- End diff --
    
    Thanks @jtstorck you are right.
    I pushed a commit to handle the way you suggested


---