You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by mattyb149 <gi...@git.apache.org> on 2017/09/26 02:26:31 UTC

[GitHub] nifi pull request #2175: NIFI-4410: Improved error handling/logging in PutEl...

GitHub user mattyb149 opened a pull request:

    https://github.com/apache/nifi/pull/2175

    NIFI-4410: Improved error handling/logging in PutElasticsearchHttp processors

    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
         in the commit message?
    
    - [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    - [x] Is your initial contribution a single, squashed commit?
    
    ### For code changes:
    - [x] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [x] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattyb149/nifi NIFI-4410

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2175
    
----
commit e21bf18d3683a37b78b3da4279043b1be6498ca9
Author: Matthew Burgess <ma...@apache.org>
Date:   2017-09-26T02:25:29Z

    NIFI-4410: Improved error handling/logging in PutElasticsearchHttp processors

----


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    This PR was about error handling, could you try 10k bad documents/operations?


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    > Are you seeing "not-found-updates" as failures, and "not-found-deletes" as successful? If so then are you ok with that behavior? I'm tending towards keeping it that way.
    
    Absolutely. I think it is natural behavior and makes validating flows easier. For example, if you run through a bulk delete once and then replay it to verify nothing was left, you don't want a success message in the first case and a failure message in the latter if both deletes cleanly executed.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    So it is Elasticsearch that is treating "not found while deleting" differently than "not found while updating", they return different status codes with the "error" message in different fields in the JSON. This PR checks for both. I'm waffling on whether to treat them as errors, I think I'd like to keep them as not errors (see my comment above).
    @MikeThomsen Are you seeing "not-found-updates" as failures, and "not-found-deletes" as successful? If so then are you ok with that behavior? I'm tending towards keeping it that way.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    Failure in the case of not-found-while-deleting is subjective IMO. You can look at it as "the processor did not perform the task I issued", or "the processor accomplished the end goal of the task (i.e. the document is not there after the delete)". It seems ES is choosing the latter, but if a general user would expect a "failed delete" along the failure relationship, we can treat them as such.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    I found one area that this doesn't handle properly at all, and that's deleting documents where the ID cannot be found. The reason your patch fails there is that the check around the declaration of `boolean errors` ~ line 353 doesn't expect ElasticSearch to treat this as an inconsequential problem instead of an error. See this JSON I got in the debugger when running a flow:
    
    {
    	"took": 346,
    	"errors": false,
    	"items": [{
    		"delete": {
    			"found": false,
    			"_index": "events",
    			"_type": "event",
    			"_id": "abcd_1509454409510",
    			"_version": 1,
    			"result": "not_found",
    			"_shards": {
    				"total": 2,
    				"successful": 1,
    				"failed": 0
    			},
    			"status": 404
    		}
    	}]
    }
    
    I try indexing some really bad (syntatically valid JSON that is not what Elastic expects) and see if that goes to the error queue.
    
    My guess is that an update on a missing document ID will return the same sort of JSON as the delete.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    @mattyb149 Do you want to keep reviewing this or close it out?


---

[GitHub] nifi pull request #2175: NIFI-4410: Improved error handling/logging in PutEl...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2175


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    So yeah, +1 on merge.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    Before the fix, you would probably see a log statement for each failed doc. Also for some "not found" errors, such as maybe deleting a doc that doesn't exist? you would see an empty reason. The fix includes only logging one error per batch, and filling in the empty reason for the "not found" errors.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    I'm using 5.5.2


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    Hmm, that's interesting, deleting with ID not found was my whole point of using "result" instead of "reason". What version of ES are you using to test against? I wonder if the behavior has changed at some point to have errors = false when it was true during my testing? Also now I'm rethinking whether we should treat "not-found deletes" as errors since ES doesn't seem to.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    Ok. I just went back and tried out PutElasticSearchHttp (don't have time to retry the record one right now), and the behavior was as expected. It treats any delete that doesn't result in a server-side exception as a success. 


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    I forgot to mention that when you run into that scenario, it actually sends the output to Success, which it shouldn't since most of the documents failed to be deleted.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    Going forward, I think < 5.X support should be marked as deprecated. By v7 they've claimed that "types" are going to be gone completely and it's going to resemble Solr in being a flat kv pair index.


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    If you need to raise the logging level, you can add that processor to logback.xml with a setting of INFO, otherwise you can change all processors to INFO with the following sed statement (works on Mac, don't need the parameter after -i if not on Mac):
    
    `sed -i '' "s/\"org.apache.nifi.processors\" level=\"WARN\"/\"org.apache.nifi.processors\" level=\"INFO\"/" conf/logback.xml`


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by mattyb149 <gi...@git.apache.org>.
Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    This is my PR. If you give it a +1 we can have a committer merge it, thanks!


---

[GitHub] nifi issue #2175: NIFI-4410: Improved error handling/logging in PutElasticse...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2175
  
    I actually had a bad batch go through, and it sent them out properly. What log settings need to be set in the configuration files to make sure I see your changes?


---