You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Vijay Balakrishnan <bv...@gmail.com> on 2018/12/12 22:52:53 UTC

Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming
from a Kinesis Stream.
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] -
Encountered an unexpected expired iterator

The error on the console ends up being a misleading one: "Caused by:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException:
1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at
'startingSequenceNumber' failed to satisfy constraint: Member must satisfy
regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis;
Status Code: 400; Error Code: ValidationException; Request ID: ..)
"

How do I increase the *ClientConfiguration.clientExecutiontimeout* to avoid
this issue or is this the right way to handle this issue ? I don't want the
FlinkKinesisConsumer streaming app to fail just because there might be no
records in the Kinesis Stream. I am using TRIM_HORIZON to read from the
start of the Kinesis Stream.

 TIA,

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Posted by Vijay Balakrishnan <bv...@gmail.com>.
I have 2 shards in the Kinesis Streams- need to figure out how to check
from the logs if records are being written to both shards .
Not sure if this is what you are looking for in terms of # of shards read-
seems like 1 from the logs below:

DEBUG org.apache.http.wire                                         [] -
http-outgoing-12 <<
"{"took":2,"errors":false,"items":[{"index":{"_index":"mlist-5sec-comp-idx","_type":"mlist_5sec_comp_schema","_id":"66zEqWcBmN9Gpk7gdac7","_version":1,"result":"created","_
*shard*
s":{"total":2,"successful":1,"failed":0},"_seq_no":34,"_primary_term":1,"status":201}}]}"

14:51:23,842 [I/O dispatcher 36] DEBUG org.apache.http.wire
                        [] - http-outgoing-9 <<
"{"took":1,"errors":false,"items":[{"index":{"_index":"mlist-5sec-inst-idx","_type":"mlist_5sec_inst_schema","_id":"7KzEqWcBmN9Gpk7gdadA","_version":1,"result":"created","_
*shard*
s":{"total":2,"successful":1,"failed":0},"_seq_no":43,"_primary_term":1,"status":201}}]}"


On Fri, Dec 14, 2018 at 12:28 AM Tzu-Li (Gordon) Tai <tz...@apache.org>
wrote:

> Hi,
>
> I’m suspecting that this is the issue:
> https://issues.apache.org/jira/browse/FLINK-11164.
>
> One more thing to clarify to be sure of this:
> Do you have multiple shards in the Kinesis stream, and if yes, are some of
> them actually empty?
> Meaning that, even though you mentioned some records were written to the
> Kinesis stream, some shards actually weren’t written any records.
>
> Cheers,
> Gordon
>
>
> On 14 December 2018 at 4:10:30 AM, Vijay Balakrishnan (bvijaykr@gmail.com)
> wrote:
>
> Hi Gordon,
>
> My use-case was slightly different.
>
> 1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup
> position.
> 2. Only a few Records were written to the Kinesis stream
> 3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then
> after a period of time of not reading anymore Kinesis Stream records, it received
> the “Encountered an unexpected expired iterator” warning in the logs, and
> the job failed with the misleading AmazonKinesisException?
>
> Also, in 1 with LATEST  as the startup position, I have not been able to
> read any records from the Kinesis Stream.Still trying to pinpoint what i am
> doing wrong. For sure, I am not using checkpoints and not sure if this
> causes any issues with LATEST option.
> TIA,
> Vijay
>
> On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <tz...@apache.org>
> wrote:
>
>> Hi!
>>
>> Thanks for reporting this.
>>
>> This looks like an overlooked corner case that the Kinesis connector
>> doesn’t handle properly.
>>
>> First, let me clarify the case and how it can be reproduced. Please let
>> me know if the following is correct:
>> 1. You started a Kinesis connector source, with TRIM_HORIZON as the
>> startup position.
>> 2. No records were written to the Kinesis stream at all.
>> 3. After a period of time, you received the “Encountered an unexpected
>> expired iterator” warning in the logs, and the job failed with the
>> misleading AmazonKinesisException?
>>
>> Cheers,
>> Gordon
>>
>> On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan (bvijaykr@gmail.com)
>> wrote:
>>
>> Hi,
>> Using FlinkKinesisConsumer in a long running Flink Streaming app
>> consuming from a Kinesis Stream.
>> Encountered the following Expired Iterator exception in getRecords():
>>  org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer []
>> - Encountered an unexpected expired iterator
>>
>> The error on the console ends up being a misleading one: "Caused by:
>> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException:
>> 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at
>> 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy
>> regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis;
>> Status Code: 400; Error Code: ValidationException; Request ID: ..)
>> "
>>
>> How do I increase the *ClientConfiguration.clientExecutiontimeout* to
>> avoid this issue or is this the right way to handle this issue ? I don't
>> want the FlinkKinesisConsumer streaming app to fail just because there
>> might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read
>> from the start of the Kinesis Stream.
>>
>>  TIA,
>>
>>

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi,

I’m suspecting that this is the issue: https://issues.apache.org/jira/browse/FLINK-11164.

One more thing to clarify to be sure of this:
Do you have multiple shards in the Kinesis stream, and if yes, are some of them actually empty?
Meaning that, even though you mentioned some records were written to the Kinesis stream, some shards actually weren’t written any records.

Cheers,
Gordon


On 14 December 2018 at 4:10:30 AM, Vijay Balakrishnan (bvijaykr@gmail.com) wrote:

Hi Gordon,

My use-case was slightly different.

1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. Only a few Records were written to the Kinesis stream
3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then after a period of time of not reading anymore Kinesis Stream records, it received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Also, in 1 with LATEST  as the startup position, I have not been able to read any records from the Kinesis Stream.Still trying to pinpoint what i am doing wrong. For sure, I am not using checkpoints and not sure if this causes any issues with LATEST option.
TIA,
Vijay

On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <tz...@apache.org> wrote:
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon

On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan (bvijaykr@gmail.com) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Posted by Vijay Balakrishnan <bv...@gmail.com>.
Hi Gordon,

My use-case was slightly different.

1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup
position.
2. Only a few Records were written to the Kinesis stream
3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then
after a period of time of not reading anymore Kinesis Stream records,
it received
the “Encountered an unexpected expired iterator” warning in the logs, and
the job failed with the misleading AmazonKinesisException?

Also, in 1 with LATEST  as the startup position, I have not been able to
read any records from the Kinesis Stream.Still trying to pinpoint what i am
doing wrong. For sure, I am not using checkpoints and not sure if this
causes any issues with LATEST option.
TIA,
Vijay

On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <tz...@apache.org>
wrote:

> Hi!
>
> Thanks for reporting this.
>
> This looks like an overlooked corner case that the Kinesis connector
> doesn’t handle properly.
>
> First, let me clarify the case and how it can be reproduced. Please let me
> know if the following is correct:
> 1. You started a Kinesis connector source, with TRIM_HORIZON as the
> startup position.
> 2. No records were written to the Kinesis stream at all.
> 3. After a period of time, you received the “Encountered an unexpected
> expired iterator” warning in the logs, and the job failed with the
> misleading AmazonKinesisException?
>
> Cheers,
> Gordon
>
> On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan (bvijaykr@gmail.com)
> wrote:
>
> Hi,
> Using FlinkKinesisConsumer in a long running Flink Streaming app consuming
> from a Kinesis Stream.
> Encountered the following Expired Iterator exception in getRecords():
>  org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer []
> - Encountered an unexpected expired iterator
>
> The error on the console ends up being a misleading one: "Caused by:
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException:
> 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at
> 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy
> regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis;
> Status Code: 400; Error Code: ValidationException; Request ID: ..)
> "
>
> How do I increase the *ClientConfiguration.clientExecutiontimeout* to
> avoid this issue or is this the right way to handle this issue ? I don't
> want the FlinkKinesisConsumer streaming app to fail just because there
> might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read
> from the start of the Kinesis Stream.
>
>  TIA,
>
>

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon
On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan (bvijaykr@gmail.com) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,