Posted to issues@nifi.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/04/09 00:16:00 UTC

[jira] [Commented] (NIFI-5953) GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted

    [ https://issues.apache.org/jira/browse/NIFI-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812890#comment-16812890 ] 

ASF subversion and git services commented on NIFI-5953:
-------------------------------------------------------

Commit cded30b3d2dc0497cf3d06051f55ca304c744250 in nifi's branch refs/heads/master from Kourge
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=cded30b ]

NIFI-5953 Manage GetTwitter connection retries on '420 Enhance Your Calm' exceptions

Update "Max Client Error Retries" parameter name.
Reintroduce client.reconnect() on HTTP_ERROR 420

This closes #3276.

Signed-off-by: Koji Kawamura <ij...@apache.org>


> GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
> ------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5953
>                 URL: https://issues.apache.org/jira/browse/NIFI-5953
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.7.0
>            Reporter: Kourge
>            Assignee: Kourge
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Hi,
> I am using the GetTwitter processor, with the Filter Endpoint.
>  The issue is that I often get a series of `*Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect*` exceptions.
>  These are followed by one `*Received error STOPPED_BY_ERROR: Retries exhausted due to null. Will not attempt to reconnect*` exception, after which the processor no longer receives any tweets from the Twitter endpoint.
> I am being rate limited by the Twitter API. I am running a NiFi cluster, so I run the GetTwitter processor on the Primary Node only, to avoid using the same credentials several times in parallel.
> I tried to apply the configuration recommendation from this mailing list thread:
>  [https://lists.apache.org/thread.html/ed397f42a26760280363e9cc1f64f6654c635110005e24ab9486bf19@%3Cdev.nifi.apache.org%3E]
> But raising the "run schedule" parameter to 60 seconds does not help in my case, since I aim to read between 100 and 200 tweets per minute. With "run schedule" set to 60 seconds, NiFi polls only 1 tweet per minute and cannot keep up with the queue of tweets from the Twitter API.
> +*Proposed solution*+
> I analyzed the `*GetTwitter.java*` implementation and noticed that the `*onTrigger()*` method reconnects (`*client.reconnect();*`) to the Twitter endpoint on `*HTTP_ERROR*`.
>  The issue here is that `*HTTP/1.1 420 Enhance Your Calm*` messages are `*HTTP_ERROR*` events, but the Twitter HBC library client (com.twitter.hbc) already manages reconnection.
>  The Twitter HBC library client retries on its own with an increasing wait delay, with 5 retries by default.
> Moreover, it seems that `*client.reconnect();*` does not work in my case, and it leads to being kicked off the Twitter API earlier because that method is called too often.
> My proposed solution is the following (tested on my local development)
> *1. Let the Twitter HBC library client handle connection retries on `HTTP/1.1 420 Enhance Your Calm` messages.*
> The `*onTrigger()*` method should be updated to not try to reconnect in case of `*HTTP_ERROR*` with message equal to `*HTTP/1.1 420 Enhance Your Calm*`:
> {code:java}
>  case HTTP_ERROR:
>      // The HBC client already retries 420 responses itself, so reconnect only on other HTTP errors.
>      if (!event.getMessage().equals("HTTP/1.1 420 Enhance Your Calm")) {
>          getLogger().error("Received error {}: {}. Will attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
>          client.reconnect();
>      } else {
>          getLogger().error("Received error {}: {}. Will not attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
>      }
>      break;
> {code}
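
The check proposed in the snippet above can be isolated into a small, dependency-free method so that it is easy to test. This is only a sketch: `ReconnectDecision` and `shouldReconnectOnHttpError` are hypothetical names that exist in neither NiFi nor HBC, and the constant-first `equals` (which tolerates a null message) is an assumption about the desired behaviour, not something the report specifies.

```java
class ReconnectDecision {
    // Message text taken from the log lines quoted in the report.
    static final String RATE_LIMIT_MESSAGE = "HTTP/1.1 420 Enhance Your Calm";

    // Hypothetical helper: returns true when the processor itself should
    // reconnect after an HTTP_ERROR event, false when the retry is left
    // to the client library (the 420 rate-limit case).
    static boolean shouldReconnectOnHttpError(String message) {
        // Constant-first equals avoids a NullPointerException on a null message.
        return !RATE_LIMIT_MESSAGE.equals(message);
    }

    public static void main(String[] args) {
        String[] samples = {
            "HTTP/1.1 420 Enhance Your Calm",
            "HTTP/1.1 503 Service Unavailable",
            null
        };
        for (String message : samples) {
            System.out.println(message + " -> reconnect: " + shouldReconnectOnHttpError(message));
        }
    }
}
```

In the processor, the `HTTP_ERROR` branch would then call `client.reconnect()` only when such a helper returns true.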
> *2. Parameterize the maximum number of connection retries*
> I also noticed that the default number of retries in the Twitter HBC library (5) is sometimes too low.
>  So it would be useful to add a GetTwitter processor property named `*Max Connection Retries*`. In my usage I found that `*10*` is a good value.
> Then update the `*onSchedule()*` method with this line (replacing `*10*` with the value of `*Max Connection Retries*`):
> {code:java}
> clientBuilder.retries(10); // default value is 5
> {code}
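
To illustrate why the retry limit matters, the following sketch models retries with an increasing wait delay, as the report describes for the HBC client. It is not HBC's actual implementation: the doubling rule, the 60-second initial delay, and the 960-second cap are assumptions chosen for illustration, and `BackoffSketch` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

class BackoffSketch {
    // Returns the wait (in seconds) before each retry, doubling up to a cap.
    // The doubling rule and the cap are illustrative assumptions.
    static List<Long> delaysSeconds(int maxRetries, long initialSeconds, long capSeconds) {
        if (maxRetries < 0 || initialSeconds <= 0 || capSeconds < initialSeconds) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        List<Long> delays = new ArrayList<>();
        long delay = initialSeconds;
        for (int i = 0; i < maxRetries; i++) {
            delays.add(delay);
            // Double the delay without overflowing, never exceeding the cap.
            delay = (delay > capSeconds / 2) ? capSeconds : delay * 2;
        }
        return delays;
    }

    // Total time spent waiting before retries are exhausted.
    static long totalSeconds(List<Long> delays) {
        long total = 0;
        for (long d : delays) {
            total += d;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int retries : new int[] {5, 10}) {
            List<Long> delays = delaysSeconds(retries, 60, 960);
            System.out.println(retries + " retries: " + delays + ", total " + totalSeconds(delays) + "s");
        }
    }
}
```

Under these assumed parameters, raising the limit from 5 to 10 retries lengthens the window in which the client keeps trying before it reports that retries are exhausted.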



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)