Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/18 11:34:00 UTC

[jira] [Commented] (DRILL-7388) Apache Drill Kafka Storage module fails to return results for partitions containing single offset record

    [ https://issues.apache.org/jira/browse/DRILL-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976464#comment-16976464 ] 

ASF GitHub Bot commented on DRILL-7388:
---------------------------------------

arina-ielchiieva commented on pull request #1901: DRILL-7388: Kafka improvements
URL: https://github.com/apache/drill/pull/1901
 
 
   Jira - [DRILL-7388](https://issues.apache.org/jira/browse/DRILL-7388).
   
   1. Upgraded Kafka libraries to 2.3.1 (DRILL-6739).
   2. Added new options to support the same features as the native JSON reader:
     a. store.kafka.reader.skip_invalid_records, default: false (DRILL-6723);
     b. store.kafka.reader.allow_nan_inf, default: true;
     c. store.kafka.reader.allow_escape_any_char, default: false.
   3. Fixed an issue where no results were returned when a Kafka topic contains only one message (DRILL-7388).
   4. Replaced the Gson parser with Jackson so that JSON is parsed in the same manner as by Drill's native JSON reader.
   5. Performance improvements: Kafka consumers are now closed asynchronously, a resource leak is fixed (DRILL-7290), and unnecessary info-level logging is moved to debug level.
   6. Updated bootstrap-storage-plugins.json to reflect actual Kafka connection properties.
   7. Added unit tests.
   8. Refactoring and code cleanup.
 


> Apache Drill Kafka Storage module fails to return results for partitions containing single offset record
> --------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-7388
>                 URL: https://issues.apache.org/jira/browse/DRILL-7388
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.16.0
>            Reporter: daniel kelly
>            Assignee: Arina Ielchiieva
>            Priority: Major
>             Fix For: 1.17.0
>
>
> If a partition only contains one record - e.g.
> [topicName=myTopic, partitionId=117, startOffset=0, endOffset=1]
> no data is returned.
> I fixed this locally with the following code change in contrib/storage-kafka:
> {code:java}
> git diff src/main/java/org/apache/drill/exec/store/kafka/KafkaRecordReader.java
> @@ -109,7 +109,7 @@ public class KafkaRecordReader extends AbstractRecordReader {
>      currentMessageCount = 0;
>  
>      try {
> -      while (currentOffset < subScanSpec.getEndOffset() - 1 && msgItr.hasNext()) {
> +      while (currentOffset < subScanSpec.getEndOffset() && msgItr.hasNext()) {
>          ConsumerRecord<byte[], byte[]> consumerRecord = msgItr.next();
> {code}
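
The off-by-one in the quoted loop condition can be reproduced with a small standalone model. The sketch below is not Drill code: the class name, the `read` helper and the in-memory list standing in for the Kafka message iterator are all hypothetical. It also assumes that `currentOffset` starts at 0 and is set to the offset of each record as it is consumed, which the diff excerpt does not show. Under those assumptions, a partition with startOffset=0 and endOffset=1 never enters the loop when the bound is `endOffset - 1`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of the reader loop from the diff above.
class SingleRecordLoopSketch {

  // Returns the offsets consumed for a partition holding offsets [startOffset, endOffset).
  // subtractOne = true models the old condition, false models the proposed one.
  static List<Long> read(long startOffset, long endOffset, boolean subtractOne) {
    // In-memory stand-in for the Kafka message iterator (assumption).
    List<Long> partition = new ArrayList<>();
    for (long offset = startOffset; offset < endOffset; offset++) {
      partition.add(offset);
    }
    Iterator<Long> msgItr = partition.iterator();

    long bound = subtractOne ? endOffset - 1 : endOffset;
    long currentOffset = 0; // assumption: initial value before any record is read
    List<Long> consumed = new ArrayList<>();

    while (currentOffset < bound && msgItr.hasNext()) {
      currentOffset = msgItr.next(); // assumption: tracks the last consumed offset
      consumed.add(currentOffset);
    }
    return consumed;
  }

  public static void main(String[] args) {
    // Single-record partition from the report: startOffset=0, endOffset=1.
    System.out.println("old condition: " + read(0, 1, true));  // prints "old condition: []"
    System.out.println("new condition: " + read(0, 1, false)); // prints "new condition: [0]"
  }
}
```

In this model the old bound still reads every record of a larger partition (for example offsets 0 to 2 with endOffset=3), which may be why the problem only shows up for single-record partitions. With the new bound, termination relies on the iterator's `hasNext()` returning false at the end of the range.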


