You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/24 14:20:25 UTC

[GitHub] [druid] abhishekagarwal87 opened a new pull request #10315: Handle internal kinesis sequence numbers when reporting lag

abhishekagarwal87 opened a new pull request #10315:
URL: https://github.com/apache/druid/pull/10315


   Handles a scenario in which the result from `getHighestCurrentOffsets` has expired shards. In such a case, the sequence for a shard can be "EXPIRED". The lag reporter calls the AWS API with this offset and runs into an exception since "EXPIRED" string is not a valid sequence number. 
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist above are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   <hr>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #10315:
URL: https://github.com/apache/druid/pull/10315#discussion_r476149085



##########
File path: extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisRecordSupplier.java
##########
@@ -718,9 +718,11 @@ public String getEarliestSequenceNumber(StreamPartition<String> partition)
   {
     Map<String, Long> partitionLag = Maps.newHashMapWithExpectedSize(currentOffsets.size());
     for (Map.Entry<String, String> partitionOffset : currentOffsets.entrySet()) {
-      StreamPartition<String> partition = new StreamPartition<>(stream, partitionOffset.getKey());
-      long currentLag = getPartitionTimeLag(partition, partitionOffset.getValue());
-      partitionLag.put(partitionOffset.getKey(), currentLag);
+      if (KinesisSequenceNumber.isValidAWSKinesisSequence(partitionOffset.getValue())) {

Review comment:
       Can we add some tests for this? I don't know if we can simulate this scenario in an integration test, but can we add a unit test for this

##########
File path: extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisSequenceNumber.java
##########
@@ -62,7 +62,9 @@
   private KinesisSequenceNumber(String sequenceNumber, boolean isExclusive)
   {
     super(sequenceNumber, isExclusive);
-    if (END_OF_SHARD_MARKER.equals(sequenceNumber) || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)) {
+    if (END_OF_SHARD_MARKER.equals(sequenceNumber)
+        || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)
+        || EXPIRED_MARKER.equals(sequenceNumber)) {

Review comment:
       nit: use `isValidAWSKinesisSequence` instead
   ```suggestion
       if (!isValidAWSKinesisSequence(sequenceNumber)) {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #10315:
URL: https://github.com/apache/druid/pull/10315#discussion_r476422387



##########
File path: extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisRecordSupplier.java
##########
@@ -718,9 +718,11 @@ public String getEarliestSequenceNumber(StreamPartition<String> partition)
   {
     Map<String, Long> partitionLag = Maps.newHashMapWithExpectedSize(currentOffsets.size());
     for (Map.Entry<String, String> partitionOffset : currentOffsets.entrySet()) {
-      StreamPartition<String> partition = new StreamPartition<>(stream, partitionOffset.getKey());
-      long currentLag = getPartitionTimeLag(partition, partitionOffset.getValue());
-      partitionLag.put(partitionOffset.getKey(), currentLag);
+      if (KinesisSequenceNumber.isValidAWSKinesisSequence(partitionOffset.getValue())) {

Review comment:
       Ack




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #10315:
URL: https://github.com/apache/druid/pull/10315#discussion_r476578951



##########
File path: extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisSequenceNumber.java
##########
@@ -62,7 +62,9 @@
   private KinesisSequenceNumber(String sequenceNumber, boolean isExclusive)
   {
     super(sequenceNumber, isExclusive);
-    if (END_OF_SHARD_MARKER.equals(sequenceNumber) || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)) {
+    if (END_OF_SHARD_MARKER.equals(sequenceNumber)
+        || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)
+        || EXPIRED_MARKER.equals(sequenceNumber)) {

Review comment:
       makes sense 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on pull request #10315:
URL: https://github.com/apache/druid/pull/10315#issuecomment-680027726


   > Makes sense to me... Do you know if this issue can be simulated in an integration test?
   
   I am not sure. It can be simulated by suspending a supervisor since then offsets are loaded from metadata storage and they can contain expired offsets. I feel unit tests should suffice here 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #10315:
URL: https://github.com/apache/druid/pull/10315#issuecomment-681060976


   Added `Area - Metrics/ event Emitting` and `Bug` tag, because I think before this change, an EXPIRED shard would cause the lag metrics to stop being emitted.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s merged pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
suneet-s merged pull request #10315:
URL: https://github.com/apache/druid/pull/10315


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #10315:
URL: https://github.com/apache/druid/pull/10315#discussion_r476422246



##########
File path: extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisSequenceNumber.java
##########
@@ -62,7 +62,9 @@
   private KinesisSequenceNumber(String sequenceNumber, boolean isExclusive)
   {
     super(sequenceNumber, isExclusive);
-    if (END_OF_SHARD_MARKER.equals(sequenceNumber) || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)) {
+    if (END_OF_SHARD_MARKER.equals(sequenceNumber)
+        || NO_END_SEQUENCE_NUMBER.equals(sequenceNumber)
+        || EXPIRED_MARKER.equals(sequenceNumber)) {

Review comment:
       I avoided that deliberately since we may have a sequence number that does not indicate max sequence but is not a valid kinesis sequence number either. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei commented on pull request #10315: Handle internal kinesis sequence numbers when reporting lag

Posted by GitBox <gi...@apache.org>.
jon-wei commented on pull request #10315:
URL: https://github.com/apache/druid/pull/10315#issuecomment-680322907


   >  Do you know if this issue can be simulated in an integration test?
   
   I think the unit test is sufficient here, to replicate it completely you would need to wait for the Kinesis retention period for the shards to expire (a minimum of 24 hours), and a different approach with putting the expired offsets in metadata for simulation purposes I don't think adds much more testing value than just the unit test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org