Posted to issues@nifi.apache.org by "Koji Kawamura (JIRA)" <ji...@apache.org> on 2019/04/09 00:17:00 UTC

[jira] [Resolved] (NIFI-5953) GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted

     [ https://issues.apache.org/jira/browse/NIFI-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Kawamura resolved NIFI-5953.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

> GetTwitter processor throws Enhance Your Calm exceptions then fails with Retries exhausted
> ------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5953
>                 URL: https://issues.apache.org/jira/browse/NIFI-5953
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.7.0
>            Reporter: Kourge
>            Assignee: Kourge
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Hi,
> I am using the GetTwitter processor, with the Filter Endpoint.
>  The issue is that I often get a series of `*Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect*` exceptions.
>  These are followed by one `*Received error STOPPED_BY_ERROR: Retries exhausted due to null. Will not attempt to reconnect*` exception, after which the processor no longer receives any tweets from the Twitter endpoint.
> I am being rate limited by the Twitter API. I am running a NiFi cluster, so I run the GetTwitter processor on the Primary Node only, to avoid using the same credentials several times in parallel.
> I tried to apply the configuration recommendation from this mailing list thread:
>  https://lists.apache.org/thread.html/ed397f42a26760280363e9cc1f64f6654c635110005e24ab9486bf19@%3Cdev.nifi.apache.org%3E
> But raising the "run schedule" parameter to 60 seconds does not help in my case, since I aim to read between 100 and 200 tweets per minute. With "run schedule" set to 60 seconds, NiFi polls only 1 tweet per minute and cannot keep up with the Twitter API tweet queue.
> +*Proposed solution*+
> I analyzed the `*GetTwitter.java*` implementation and noticed that the `*onTrigger()*` method reconnects (`*client.reconnect();*`) to the Twitter endpoint on `*HTTP_ERROR*`.
>  The issue here is that `*HTTP/1.1 420 Enhance Your Calm*` messages are `*HTTP_ERROR*` events, but the Twitter HBC library client (com.twitter.hbc) already manages reconnection.
>  The Twitter HBC library client retries on its own with an increasing wait delay, with 5 retries by default.
> Moreover, it seems that `*client.reconnect();*` does not work in my case, and because that method is called too often, it causes the processor to be kicked off the Twitter API sooner.
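To illustrate what "retries with an increasing wait delay" means in general, here is a minimal sketch of a capped, doubling delay. The class `BackoffSketch`, its method and its parameters are hypothetical names chosen for this example, and the doubling rule is an assumption; it does not reproduce the actual schedule used by the Twitter HBC library.

```java
/** Hypothetical illustration only; not the HBC library's reconnection logic. */
final class BackoffSketch {
    private BackoffSketch() {}

    /**
     * Returns the wait before the given retry attempt (1-based).
     * The delay starts at initialMillis, doubles on each further attempt,
     * and never exceeds maxMillis.
     */
    static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 1; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }
}
```

Calling `client.reconnect()` from the processor on every error would bypass this kind of growing pause, which is consistent with the earlier rate limiting described above.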
> My proposed solution is the following (tested in my local development environment):
> *1. Let the Twitter HBC library client handle connection retries on `HTTP/1.1 420 Enhance Your Calm` messages.*
> The `*onTrigger()*` method should be updated so that it does not try to reconnect on an `*HTTP_ERROR*` whose message equals `*HTTP/1.1 420 Enhance Your Calm*`:
> {code:java}
>  case HTTP_ERROR:
>      if (!event.getMessage().equals("HTTP/1.1 420 Enhance Your Calm")) {
>          // Any other HTTP error: reconnect from the processor as before.
>          getLogger().error("Received error {}: {}. Will attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
>          client.reconnect();
>      } else {
>          // Rate limited (420): leave the retries to the HBC client.
>          getLogger().error("Received error {}: {}. Will not attempt to reconnect", new Object[]{event.getEventType(), event.getMessage()});
>      }
>      break;
> {code}
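The decision in the snippet above can be isolated into a small helper, which makes it easy to check on its own. This is a minimal sketch: `ReconnectPolicy` and `shouldReconnect` are hypothetical names that exist in neither NiFi nor the HBC library, and treating a null message as an ordinary error is an assumption of this example.

```java
/** Hypothetical helper; not part of NiFi or the Twitter HBC library. */
final class ReconnectPolicy {
    // Message quoted in the issue for the rate-limit response.
    private static final String RATE_LIMITED = "HTTP/1.1 420 Enhance Your Calm";

    private ReconnectPolicy() {}

    /**
     * Returns true when the processor itself should reconnect,
     * and false when retrying is left to the HBC client (420 responses).
     */
    static boolean shouldReconnect(String eventMessage) {
        // Calling equals on the constant keeps this null-safe:
        // a null message is treated like any other HTTP error.
        return !RATE_LIMITED.equals(eventMessage);
    }
}
```

With such a helper, the `HTTP_ERROR` branch would call `client.reconnect()` only when `ReconnectPolicy.shouldReconnect(event.getMessage())` returns true.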
> *2. Parameterize the maximum number of connection retries*
> I also noticed that the default number of retries in the Twitter HBC library (5) is sometimes too low.
>  So it would be useful to add a GetTwitter processor property named `*Max Connection Retries*`. In my usage I found that `*10*` is a good value.
> Then update the `*onSchedule()*` method with this line (replacing `*10*` with the value of `*Max Connection Retries*`):
> {code:java}
> clientBuilder.retries(10); // default value is 5
> {code}
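As a sketch of how the raw value of a `Max Connection Retries` property might be turned into the integer passed to `clientBuilder.retries(...)`, the helper below falls back to the default of 5 mentioned in the issue when no value is set. `RetrySettings` and `parseMaxRetries` are hypothetical names, and rejecting negative values is an assumption of this example rather than documented behavior of NiFi or HBC.

```java
/** Hypothetical helper; not part of NiFi or the Twitter HBC library. */
final class RetrySettings {
    // Default number of retries reported in the issue for the HBC client.
    static final int DEFAULT_MAX_RETRIES = 5;

    private RetrySettings() {}

    /**
     * Parses the raw property value. Null or blank input yields the default;
     * a non-numeric value throws NumberFormatException and a negative one
     * throws IllegalArgumentException.
     */
    static int parseMaxRetries(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return DEFAULT_MAX_RETRIES;
        }
        int parsed = Integer.parseInt(rawValue.trim());
        if (parsed < 0) {
            throw new IllegalArgumentException("Max Connection Retries must not be negative: " + parsed);
        }
        return parsed;
    }
}
```

The result would then replace the literal in the line above, for example `clientBuilder.retries(RetrySettings.parseMaxRetries(value))`, where `value` stands for whatever string the processor reads from the property.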



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)