Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/02/15 22:18:06 UTC

[GitHub] gianm commented on a change in pull request #7024: Time Ordering Option on Small-Result-Set Scan Queries

URL: https://github.com/apache/incubator-druid/pull/7024#discussion_r256928233
 
 

 ##########
 File path: docs/content/querying/scan-query.md
 ##########
 @@ -24,7 +24,13 @@ title: "Scan query"
 
 # Scan query
 
-Scan query returns raw Druid rows in streaming mode.
+The Scan query returns raw Druid rows in streaming mode.  The biggest difference between the Select query and the Scan
+query is that the Scan query does not retain all the returned rows in memory before they are returned to the client
+(except when time-ordering is used).  The Select query _will_ retain the rows in memory, causing memory pressure if too
+many rows are returned.  The Scan query can return all the rows without issuing another pagination query, which is
+extremely useful when directly querying against historical or realtime nodes.
 
 Review comment:
   We're trying to harmonize language in this area (see #6916); in that context, "Historical processes or streaming ingestion tasks" is more Ministry of Truth approved than "historical or realtime nodes". For clarity, I'd also add something to call out the expected usage. One way to tie it all together is:
   
   > In addition to straightforward usage where a Scan query is issued to the Broker, the Scan query can also be issued directly to Historical processes or streaming ingestion tasks. This can be useful if you want to retrieve large amounts of data in parallel.
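
To make the "in parallel" usage concrete, here is a rough sketch (not taken from the PR) of how a client might build several Scan query bodies that each cover one slice of a time range. The datasource name `wikipedia`, the column names, and the helpers `build_scan_query` and `split_interval` are hypothetical, and splitting by time interval is just one assumed way to divide the work. Each body would then be POSTed to the native query endpoint (`/druid/v2`) of the Broker or of an individual data-serving process.

```python
import json
from datetime import datetime


def build_scan_query(data_source, interval, columns=None, limit=None):
    """Build a native Scan query body as a plain dict (hypothetical helper)."""
    query = {
        "queryType": "scan",
        "dataSource": data_source,
        "intervals": [interval],
        "resultFormat": "compactedList",
    }
    if columns:
        query["columns"] = list(columns)
    if limit is not None:
        query["limit"] = limit
    return query


def split_interval(start, end, parts):
    """Split [start, end) into `parts` contiguous ISO-8601 intervals."""
    step = (end - start) / parts
    bounds = [start + step * i for i in range(parts)] + [end]
    return [f"{a.isoformat()}/{b.isoformat()}" for a, b in zip(bounds, bounds[1:])]


# Assumed example values: one day of data split into four six-hour slices.
intervals = split_interval(datetime(2019, 1, 1), datetime(2019, 1, 2), 4)
queries = [
    build_scan_query("wikipedia", iv, columns=["__time", "page"])
    for iv in intervals
]

# Print the first body; the others differ only in their interval.
print(json.dumps(queries[0], indent=2))
```

Because none of these queries asks for time ordering, each response can be streamed back without the rows being held in memory first, which is the property the paragraph under review is describing.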
