You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "egalpin (via GitHub)" <gi...@apache.org> on 2023/08/09 16:41:15 UTC

[GitHub] [beam] egalpin commented on a diff in pull request #26424: Handle response exception from Elasticsearch client

egalpin commented on code in PR #26424:
URL: https://github.com/apache/beam/pull/26424#discussion_r1288883681


##########
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java:
##########
@@ -1224,26 +1241,9 @@ static class DefaultRetryPredicate implements RetryPredicate {
         this(429);
       }
 
-      /** Returns true if the response has the error code for any mutation. */
-      private static boolean errorCodePresent(HttpEntity responseEntity, int errorCode) {

Review Comment:
   Thanks @mrkm4ntr for your patience, I have kept you waiting too many times.
   
   This change is definitely interesting.  I can see that in some cases, such as successfully _creating_ a document via _bulk, the http status code is `200` while the `status` field under `items` in the response body is `201` (updating the doc results in `status` field being `200`:
   
   ```json
   {
       "took": 383,
       "errors": false,
       "items": [
           {
               "update": {
                   "_index": "foo",
                   "_type": "_doc",
                   "_id": "2",
                   "_version": 1,
                   "result": "created",
                   "_shards": {
                       "total": 2,
                       "successful": 1,
                       "failed": 0
                   },
                   "_seq_no": 0,
                   "_primary_term": 1,
                   "status": 201
               }
           }
       ]
   }
   ```
   
   It's also the case, more importantly, that when one of the operations of the bulk payload fails, but another succeeds, there can be an http status code of 200 but one of the items can have `status` field with non-200 code:
   
   ```json
   {
       "took": 19,
       "errors": true,
       "items": [
           {
               "update": {
                   "_index": "foo",
                   "_type": "_doc",
                   "_id": "2",
                   "_version": 7,
                   "result": "updated",
                   "_shards": {
                       "total": 2,
                       "successful": 2,
                       "failed": 0
                   },
                   "_seq_no": 7,
                   "_primary_term": 1,
                   "status": 200
               }
           },
           {
               "update": {
                   "_index": "foo",
                   "_type": "_doc",
                   "_id": "2",
                   "status": 400,
                   "error": {
                       "type": "illegal_argument_exception",
                       "reason": "failed to execute script",
                       "caused_by": {
                           "type": "script_exception",
                           "reason": "compile error",
                           "script_stack": [
                               "if(ctx._source.group !=== null) { ctx._source.gro ...",
                               "                        ^---- HERE"
                           ],
                           "script": "if(ctx._source.group !=== null) { ctx._source.group = params.id % 2 } else { ctx._source.group = 0 }",
                           "lang": "painless",
                           "position": {
                               "offset": 24,
                               "start": 0,
                               "end": 49
                           },
                           "caused_by": {
                               "type": "illegal_argument_exception",
                               "reason": "invalid sequence of tokens near ['='].",
                               "caused_by": {
                                   "type": "no_viable_alt_exception",
                                   "reason": "no_viable_alt_exception: null"
                               }
                           }
                       }
                   }
               }
           }
       ]
   }
   ```
   
   So the status code alone cannot be "trusted" to inform of all failures, since there could be false negatives.  However, it seems that the http status code can be used as a first check: if the http status code is >=400, we _know_ a failure has occurred and we don't need to iterate through the `items` necessarily.  
   
   Checking the http status code first might be a viable path forward for properly handling the 429 case.
   
   Thoughts?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org