Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/09/24 00:36:05 UTC

[GitHub] [pulsar] JointHero opened a new issue #5262: pulsar SQL (Presto) can only query data less than 100000

URL: https://github.com/apache/pulsar/issues/5262
 
 
   **Describe the bug**
   I created a persistent topic named "persistent://public/default/tusr1" and used a NiFi workflow to read user data from https://randomuser.me/ and push it into "persistent://public/default/tusr1". I then used Pulsar SQL to get the message count.
   The SQL is simply: [ SELECT COUNT(ssn) FROM pulsar."public/default"."tusr1"; ]
   I expected the return value to be the number of messages stored in BookKeeper,
   but once the message count exceeds 100000, the query returns only 50000+.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Push messages to a persistent topic
   2. Keep querying "select count(*) from topic name" in Presto
   3. Wait until the message count exceeds 100000
   4. The "select count(*) from topic name" result drops back to 50000+
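
The polling in steps 2 to 4 can be sketched as a small helper that records each observed count and reports the first point at which the count goes down. This is only a sketch: `run_count` is a hypothetical callable standing in for whatever executes the `select count(*)` query against Presto, and the numbers in the usage example are made up for illustration, not taken from this cluster.

```python
import time
from typing import Callable, List, Optional, Tuple


def poll_counts(run_count: Callable[[], int], polls: int,
                interval_s: float = 0.0) -> List[int]:
    """Call run_count() `polls` times and return the observed counts in order.

    run_count is a hypothetical stand-in for the code that runs the
    count query in Presto and returns the result as an int.
    """
    observed = []
    for _ in range(polls):
        observed.append(run_count())
        if interval_s > 0:
            time.sleep(interval_s)
    return observed


def first_drop(counts: List[int]) -> Optional[Tuple[int, int, int]]:
    """Return (index, previous, current) for the first poll whose count is
    lower than the poll before it, or None if the counts never decrease."""
    for i in range(1, len(counts)):
        if counts[i] < counts[i - 1]:
            return (i, counts[i - 1], counts[i])
    return None


# Usage with made-up counts in place of a real query.
fake_results = iter([10, 20, 30, 15, 25])
counts = poll_counts(lambda: next(fake_results), polls=5)
drop = first_drop(counts)
```

While messages are only being published, the count would be expected to keep rising, so a non-`None` result from `first_drop` marks the moment described in step 4.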
   
   **Expected behavior**
   I expected the query to return the correct message count, matching the output of the "pulsar-admin persistent stats-internal" command.
   
   **Screenshots**
   I used "pulsar-admin persistent stats-internal persistent://public/default/tusr1" to get the internal stats and got the following:
   {
     "entriesAddedCounter" : 149000,
     "numberOfEntries" : 11999,
     "totalSize" : 1670626,
     "currentLedgerEntries" : 2999,
     "currentLedgerSize" : 417685,
     "lastLedgerCreatedTimestamp" : "2019-09-23T00:16:38.965Z",
     "waitingCursorsCount" : 0,
     "pendingAddEntriesCount" : 0,
     "lastConfirmedEntry" : "8088:2998",
     "state" : "LedgerOpened",
     "ledgers" : [ {
       "ledgerId" : 5638,
       "entries" : 9000,
       "size" : 1252941,
       "offloaded" : false
     }, {
       "ledgerId" : 8088,
       "entries" : 0,
       "size" : 0,
       "offloaded" : false
     } ],
     "cursors" : { }
   }
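
As a cross-check, the figures in this output can be tallied with a short script. The sketch below uses only a subset of the fields, copied from the stats above. It rests on two assumptions about what the fields mean: that "entriesAddedCounter" is a running total of entries added, and that "numberOfEntries" is the number of entries the topic currently retains. The script also counts entries, which need not equal messages if producers batch several messages into one entry.

```python
import json

# Subset of the stats-internal output shown above.
STATS_JSON = """
{
  "entriesAddedCounter": 149000,
  "numberOfEntries": 11999,
  "currentLedgerEntries": 2999,
  "ledgers": [
    {"ledgerId": 5638, "entries": 9000},
    {"ledgerId": 8088, "entries": 0}
  ]
}
"""


def summarize(stats: dict) -> dict:
    """Tally entry counts from a stats-internal document."""
    listed = sum(ledger["entries"] for ledger in stats["ledgers"])
    return {
        "added_total": stats["entriesAddedCounter"],
        "retained": stats["numberOfEntries"],
        "listed_ledger_entries": listed,
        "open_ledger_entries": stats["currentLedgerEntries"],
        # Assumption: the difference is entries added but no longer retained.
        "added_minus_retained": stats["entriesAddedCounter"]
        - stats["numberOfEntries"],
    }


summary = summarize(json.loads(STATS_JSON))
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these numbers, the entries listed under "ledgers" (9000) plus "currentLedgerEntries" (2999) add up to "numberOfEntries" (11999), while "entriesAddedCounter" is 137001 higher. Neither figure matches the 50000+ returned by the query.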
   
   but "SELECT COUNT(ssn) FROM pulsar."public/default"."tusr1";" returns 50000+
   
   **Desktop (please complete the following information):**
    - OS: Linux; Pulsar 2.4.1; Presto (Pulsar edition) 2.4.1
   
   
   **Additional context**
   I guess all the messages are stored in BookKeeper, but Presto can only query 2 segments? Is that right?
   
   Can anyone help explain this, and how can I query all of the messages?
   
   Thank you very much.
   
