Posted to user@geode.apache.org by Jinmei Liao <ji...@pivotal.io> on 2017/07/11 00:13:46 UTC

refactor query command

Hi, all gfsh-users,

During our refactor week, we are trying to refactor how multi-step commands
are implemented. The current implementation is hard to understand to begin
with, and it breaks OO design principles in multiple ways.
It's not thread-safe either. This is an internal command type, and only
our "query" command uses it.

This is how our current "query" command works:
1) the user issues a "query --query='select * from /A'" command,
2) the server retrieves the first 1000 (fetch-size, not configurable) rows,
3) if the query mode is NOT interactive, it sends back all the results at
once,
4) if the query mode is interactive, it sends the first 20 (page-size, not
configurable) records, and the user presses "n" to go to the next page. Once
it hits the last page (all 1000 records shown, or the end of the result
set reached), the command finishes.
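
To make the flow above concrete, here is a minimal Java sketch of the
fetch-then-page behavior. It is an illustration only, not gfsh's actual
implementation: the class PagingSketch and its methods are hypothetical
names, and only the values 1000 (fetch-size) and 20 (page-size) come from
the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the paging flow described above; not gfsh's code.
class PagingSketch {
    static final int FETCH_SIZE = 1000; // rows the server retrieves (from the post)
    static final int PAGE_SIZE = 20;    // rows shown per page in interactive mode

    // Keep at most FETCH_SIZE rows, mirroring step 2.
    static <T> List<T> fetch(List<T> allRows) {
        int end = Math.min(FETCH_SIZE, allRows.size());
        return new ArrayList<>(allRows.subList(0, end));
    }

    // Return the zero-based page of the fetched rows (step 4),
    // or an empty list when the index is out of range.
    static <T> List<T> page(List<T> fetched, int pageIndex) {
        int from = pageIndex * PAGE_SIZE;
        if (pageIndex < 0 || from >= fetched.size()) {
            return new ArrayList<>();
        }
        int to = Math.min(from + PAGE_SIZE, fetched.size());
        return new ArrayList<>(fetched.subList(from, to));
    }

    // Number of pages the user would step through with "n".
    static int pageCount(List<?> fetched) {
        return (fetched.size() + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) {
            rows.add(i);
        }
        List<Integer> fetched = fetch(rows);
        System.out.println("fetched rows: " + fetched.size());
        System.out.println("pages: " + pageCount(fetched));
        System.out.println("first page: " + page(fetched, 0));
    }
}
```

Note that the whole fetched list lives in memory before the first page is
shown, so paging on the client saves no server-side work.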

We would like to ask how useful this interactive feature is. Is it critical
for you? Would the following simplification be sufficient?

1) The query command always returns the entire fetch size. We can make it
configurable through environment variables, defaulting to 100, and you can
also override it in each individual query command using "query
--query='select * from /A limit 10'".

2) Provide an option for you to specify a file into which we dump the
entire query result, so that you can use shell pagination to list the
content of the file.
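
As a rough illustration of option 1, the sketch below appends a default
limit to a query that has none and leaves an explicit limit untouched. All
names here (QueryLimitSketch, hasLimit, withDefaultLimit) are hypothetical,
and the regex check for a trailing "limit <n>" is a deliberate
simplification; a real implementation would have to inspect the parsed OQL
query rather than the raw string.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of "impose a default limit unless the query has one".
class QueryLimitSketch {
    static final int DEFAULT_FETCH_SIZE = 100; // default proposed in the post

    // Naive assumption: a limit clause is a trailing "limit <number>".
    private static final Pattern TRAILING_LIMIT =
            Pattern.compile("\\blimit\\s+\\d+\\s*$", Pattern.CASE_INSENSITIVE);

    static boolean hasLimit(String oql) {
        return TRAILING_LIMIT.matcher(oql).find();
    }

    // Return the query unchanged if it already ends with a limit clause,
    // otherwise append " limit <fetchSize>".
    static String withDefaultLimit(String oql, int fetchSize) {
        String trimmed = oql.trim();
        return hasLimit(trimmed) ? trimmed : trimmed + " limit " + fetchSize;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultLimit("select * from /A", DEFAULT_FETCH_SIZE));
        System.out.println(withDefaultLimit("select * from /A limit 10", DEFAULT_FETCH_SIZE));
    }
}
```

With this shape, the only way to get more rows than the default is to
write a larger limit into the query itself.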

Please let us know your thoughts/comments. Thanks!


-- 
Cheers

Jinmei

Re: refactor query command

Posted by Jinmei Liao <ji...@pivotal.io>.

Agreed!

On Wed, Jul 12, 2017 at 10:46 AM, Michael Stolz <ms...@pivotal.io> wrote:

> I'm fine with imposing limits on queries from within our own tooling, but
> we cannot impose arbitrary limits on queries that are performed by
> application code.
>
> That would be a silent breaking change to existing behavior at any
> customer who has large queries. There is no way to know by examining code
> or queries if the query is supposed to return 10,000 rows, so only by
> testing every query they have could they determine if the imposed limit
> breaks the intent of the query.
>
> Silent breaking changes to public APIs are not acceptable.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: +1-631-835-4771
>
> On Wed, Jul 12, 2017 at 1:29 PM, jiliao@pivotal.io <ji...@pivotal.io>
> wrote:
>
>> For performance reasons, we would like to prevent users from accidentally
>> issuing a query that yields a large result set, even if they are dumping
>> the results into a file. If they want gfsh to send back a large result
>> set, they have to do so consciously by adding a large limit to the query
>> themselves.
>>
>>
>>
>> -------- Original Message --------
>> Subject: Re: refactor query command
>> From: Swapnil Bawaskar
>> To: user@geode.apache.org
>> CC:
>>
>>
>> +1
>> One suggestion I would like to make is that if the user specifies that
>> the query results should go to a file, we should not apply the limit clause
>> on the server.
>>
>> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:
>>
>>> Basically, our reasoning is that client-side pagination is not as useful as
>>> people might think. You can either get all the results dumped to the
>>> console and use the scroll bar to move back and forth, or dump them into a
>>> file and use whatever piping mechanism your environment supports. The
>>> server side retrieves everything at once anyway and saves the entire result
>>> set in the backend. It's not like we are saving any server-side work here.
>>>
>>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>>>
>>>> The way client-side pagination is currently implemented is
>>>> convoluted and of doubtful use. We are proposing to get rid of the
>>>> client-side pagination and only have the server side impose a limit (and
>>>> maybe implement pagination on the server side later on).
>>>>
>>>> The new behavior should look like this:
>>>>
>>>> gfsh> set  APP_FETCH_SIZE  50;
>>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>>>
>>>> Result : true
>>>> Limit  : 50
>>>> Rows   : 3
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> value2
>>>> value3
>>>>
>>>>
>>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>>>
>>>> Result : true
>>>> Limit  : 50
>>>> Rows   : 50
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value50
>>>>
>>>> gfsh> query --query="select * from /A limit 100"  // suppose entry size is 1000
>>>> Result : true
>>>> Rows   : 100
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value100
>>>>
>>>>
>>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"  // suppose entry size is 1000
>>>> Result : true
>>>> Rows   : 500
>>>>
>>>> Query results output to /var/tempFolder/output.txt
>>>>
>>>> (And the output.txt content to be:
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value500)
>>>>
>>>>
>>>> Bear in mind that we are trying to get rid of client-side pagination,
>>>> so the --page-size or --limit option would no longer apply. Only the
>>>> limit inside the query will be honored by the server side. If the query
>>>> does not have a limit clause, the server side will impose a limit (100 by
>>>> default). The limit can only be overridden if the user explicitly chooses
>>>> to do so, so that a user cannot accidentally execute a query that
>>>> returns a large result set.
>>>>
>>>> Would this be sufficient to replace the client-side pagination?
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <agingade@pivotal.io> wrote:
>>>>
>>>>> To make it clear, gfsh could print the query it sent to the server in the
>>>>> result summary (which shows whether it was executed with the limit):
>>>>> Query     :
>>>>> Result     : true
>>>>> startCount : 0
>>>>> endCount   : 20
>>>>> Rows       : 1
>>>>>
>>>>> -Anil.
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>>>
>>>>>> I think it might be worth differentiating the result "LIMIT" (as
>>>>>> used in the OQL query statement like so... "SELECT * FROM /Region
>>>>>> WHERE ... LIMIT 1000")  from what is actually "streamed" back to
>>>>>> *Gfsh* as the default (e.g. 100).
>>>>>>
>>>>>> Clearly sending all the results back is quite expensive depending on
>>>>>> the number of results/LIMIT specified.  Therefore, whatever "--option"
>>>>>> is provided to the `query` command is a further reduction in what is
>>>>>> actually streamed back to the client (e.g. *Gfsh*) initially, sort
>>>>>> of like paging, therefore ... `gfsh> query --query="SELECT * FROM
>>>>>> /Region WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>>>>>
>>>>>> So I think having two limits, an OQL LIMIT and a --limit
>>>>>> option, would just be confusing to users.  LIMIT, like sorting (ORDER BY),
>>>>>> can only be applied effectively in the OQL, as it determines what results
>>>>>> the query actually returns.
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>>>> agingade@pivotal.io> wrote:
>>>>>>
>>>>>>> >> Actually a really nice thing would be to put the pagination
>>>>>>> feature into the OQL engine where it belongs.
>>>>>>> +1 on this.
>>>>>>>
>>>>>>> >> if the query mode is interactive, it sends the first 20
>>>>>>> (page-size, not configurable) records, and the user uses "n" to go to the
>>>>>>> next page,
>>>>>>> >> once it hits the last page (showing all 1000 records or getting to the
>>>>>>> end of the result set), the command finishes.
>>>>>>>
>>>>>>> We could provide one more option that lets the end user stop paging and
>>>>>>> go back to the gfsh prompt for new commands (if it's not there already).
>>>>>>>
>>>>>>> I think providing multiple ways to view a large result set is a
>>>>>>> nice feature from a tooling perspective (interactive result batching,
>>>>>>> dumping into an external file, etc.).
>>>>>>>
>>>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>>>> set limit.
>>>>>>> Yes...many of the interactive query tools allow pagination/batching
>>>>>>> as part of the result display.
>>>>>>>
>>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>> We need to make sure that users can differentiate what is part of the
>>>>>>> query from options provided by the tool.
>>>>>>>
>>>>>>> -Anil.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>
>>>>>>>> The way I read this is: one limits on the server side, the
>>>>>>>> other limits on the client side.  In other words, the limit within the
>>>>>>>> query string acts on the server side.
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> What if the user wants to do:
>>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>>>
>>>>>>>>> What's the difference between putting it inside the query string or
>>>>>>>>> outside? I think either way it eventually adds a limit clause to the query.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> It’s fairly common in query tooling to be able to set a result
>>>>>>>>>> set limit.  I would make this a first class option within gfsh instead of
>>>>>>>>>> an environment variable.
>>>>>>>>>>
>>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>>>
>>>>>>>>>> or
>>>>>>>>>>
>>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>>>
>>>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>>>> LIMIT on the OQL query itself.
>>>>>>>>>>
>>>>>>>>>> Anthony
>>>>>>>>>>
>>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be
>>>>>>>>>> interesting to explore at least a couple of output formats, CSV being one
>>>>>>>>>> of the most common for people who want to import or analyze the data using
>>>>>>>>>> other tools.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <mstolz@pivotal.io> wrote:
>>>>>>>>>>
>>>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't have to
>>>>>>>>>>> implement pagination.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Mike Stolz
>>>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 and #2 to
>>>>>>>>>>>> be sufficient for me.
>>>>>>>>>>>>
>>>>>>>>>>>> Sarge
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> ~/William
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> Jinmei
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ~/William
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -John
>>>>>> john.blum10101 (skype)
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> Jinmei
>>>>
>>>
>>>
>>>
>>> --
>>> Cheers
>>>
>>> Jinmei
>>>
>>
>


-- 
Cheers

Jinmei

Re: refactor query command

Posted by Wayne Lund <wl...@pivotal.io>.

Agreed. 

Wayne Lund
Advisory Platform Architect
916.296.1893
wlund@pivotal.io

> On Jul 12, 2017, at 10:46 AM, Michael Stolz <ms...@pivotal.io> wrote:
> 
> I'm fine with imposing limits on queries from within our own tooling, but we cannot impose arbitrary limits on queries that are performed by application code. 
> 
> That would be a silent breaking change to existing behavior at any customer who has large queries. There is no way to know by examining code or queries if the query is supposed to return 10,000 rows, so only by testing every query they have could they determine if the imposed limit breaks the intent of the query. 
> 
> Silent breaking changes to public APIs are not acceptable.
> 
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager 
> Mobile: +1-631-835-4771

Re: refactor query command

Posted by Jinmei Liao <ji...@pivotal.io>.

With the current JLineShell features, we haven't found a way to apply a
general option to all commands yet. But adding a generic option is a
good idea.

On Wed, Jul 12, 2017 at 11:22 AM, Dan Smith <ds...@pivotal.io> wrote:

> (1) and (2) seem like good options to me.
>
> One thing I think we should think about is not doing things like
> pagination or writing the output to a file specifically for one command. It
> seems like maybe these should be generic features of the shell that work
> with any command. For example, maybe I would also want to dump the output of
> a Lucene query to a file.
>
> -Dan
>
> On Wed, Jul 12, 2017 at 10:46 AM, Michael Stolz <ms...@pivotal.io> wrote:
>
>> I'm fine with imposing limits on queries from within our own tooling, but
>> we cannot impose arbitrary limits on queries that are performed by
>> application code.
>>
>> That would be a silent breaking change to existing behavior at any
>> customer who has large queries. There is no way to know by examining code
>> or queries if the query is supposed to return 10,000 rows, so only by
>> testing every query they have could they determine if the imposed limit
>> breaks the intent of the query.
>>
>> Silent breaking changes to public APIs are not acceptable.
>>
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Manager
>> Mobile: +1-631-835-4771 <(631)%20835-4771>
>>
>> On Wed, Jul 12, 2017 at 1:29 PM, jiliao@pivotal.io <ji...@pivotal.io>
>> wrote:
>>
>>> We would like to avoid letting user accidentally issues a query that
>>> would yield large result set even if they are dumping the result into a
>>> file for performance reasons. If they want a large result set sent back by
>>> gfsh, they have to do so consciously by adding a large limit in the query
>>> themselves.
>>>
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: refactor query command
>>> From: Swapnil Bawaskar
>>> To: user@geode.apache.org
>>> CC:
>>>
>>>
>>> +1
>>> One suggestion I would like to make is that if the user specifies that
>>> the query results should go to a file, we should not apply the limit clause
>>> on the server.
>>>
>>> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:
>>>
>>>> Basically, our reasoning is client-side pagination is not as useful as
>>>> people would think, you can either get all the results dumped to the
>>>> console, and use scroll bar to move back and forth, or dump it into a file,
>>>> and uses whatever piping mechanism supported by your environment. The
>>>> server side retrieves everything at once anyway and saves the entire result
>>>> set in the backend. It's not like we are saving any server side work here.
>>>>
>>>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>
>>>>> Currently the way it's implementing the client-side pagination is
>>>>> convoluted and doubtfully useful. We are proposing to get rid of the
>>>>> client-side pagination and only have the server side impose a limit (and
>>>>> maybe implement pagination on the server side later on).
>>>>>
>>>>> The new behavior should look like this:
>>>>>
>>>>> gfsh> set  APP_FETCH_SIZE  50;
>>>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>>>>
>>>>> Result : true
>>>>> Limit  : 50
>>>>> Rows   : 3
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> value2
>>>>> value3
>>>>>
>>>>>
>>>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>>>>
>>>>> Result : true
>>>>> Limit  : 50
>>>>> Rows   : 50
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value50
>>>>>
>>>>> gfsh> query --query="select * from /A limit 100"  // suppose entry
>>>>> size is 1000
>>>>> Result : true
>>>>> Rows   : 100
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value100
>>>>>
>>>>>
>>>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"
>>>>>  // suppose entry size is 1000
>>>>> Result : true
>>>>> Rows   : 500
>>>>>
>>>>> Query results output to /var/tempFolder/output.txt
>>>>>
>>>>> (And the output.txt content to be:
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value500)
>>>>>
>>>>>
>>>>> Bear in mind that we are trying to get rid of client side pagination,
>>>>> so the --page-size or --limit option would not apply anymore. Only the
>>>>> limit inside the query will be honored by the server side. If they query
>>>>> does not have a limit clause, the server side will impose a limit (default
>>>>> to 100). The limit can only be explicitly overridden if user chooses to do
>>>>> so. So that user would not accidentally execute a query that would result
>>>>> in a large result set.
>>>>>
>>>>> Would this be sufficient to replace the client-side pagination?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <
>>>>> agingade@pivotal.io> wrote:
>>>>>
>>>>>> To make it clear, gfsh could print the query it sent to server in the
>>>>>> result summary (it shows if it got executed with the limit):
>>>>>> Query     :
>>>>>> Result     : true
>>>>>> startCount : 0
>>>>>> endCount   : 20
>>>>>> Rows       : 1
>>>>>>
>>>>>> -Anil.
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>>>>
>>>>>>> I think it might be worth differentiating the result "LIMIT" (as
>>>>>>> used in the OQL query statement like so... "SELECT * FROM /Region
>>>>>>> WHERE ... LIMIT 1000")  from what is actually "streamed" back to
>>>>>>> *Gfsh* as the default (e.g. 100).
>>>>>>>
>>>>>>> Clearly sending all the results back is quite expensive depending on
>>>>>>> the number of results/LIMIT specified.  Therefore, whatever "
>>>>>>> --option" is provided to the `query` command is a further reduction
>>>>>>> in what is actually streamed back to the client (e.g. *Gfsh*)
>>>>>>> initially, sort of like paging, therefore ... `gfsh> query
>>>>>>> --query="SELECT * FROM /Region WHERE ... LIMIT 1000" --page-size=25`...
>>>>>>> perhaps?
>>>>>>>
>>>>>>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit
>>>>>>> option would just be confusing to users.  LIMIT like sort (ORDER BY) can
>>>>>>> only be effectively applied to the OQL as it determines what results the
>>>>>>> query actually returns.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>>>>> agingade@pivotal.io> wrote:
>>>>>>>
>>>>>>>> >> Actually a really nice thing would be to put the pagination
>>>>>>>> feature into the OQL engine where it belongs.
>>>>>>>> +1 on this.
>>>>>>>>
>>>>>>>> >> if they query mode is interactive, it sends the first 20
>>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>>> page,
>>>>>>>> >> once it hits the last page (showing all 1000 record or get to
>>>>>>>> the end of the result set), the command finishes.
>>>>>>>>
>>>>>>>> We could provide one more option to end user to quit getting to
>>>>>>>> next page and go-back to gfsh command for new commands (if its not there).
>>>>>>>>
>>>>>>>> I think providing multiple options to view large result set, is a
>>>>>>>> nice feature from tooling perspective (interactive result batching, dumping
>>>>>>>> into an external file, etc...)
>>>>>>>>
>>>>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>>>>> set limit.
>>>>>>>> Yes...many of the interactive query tools allow pagination/batching
>>>>>>>> as part of the result display.
>>>>>>>>
>>>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>> We need to make sure that user can differentiate query commands
>>>>>>>> from options provided by tool.
>>>>>>>>
>>>>>>>> -Anil.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The way I read this is: One is limiting on the server side, the
>>>>>>>>> other is limiting the client side.  IOW within the query string is acting
>>>>>>>>> on server side.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> what if user wants to do:
>>>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>>>>
>>>>>>>>>> What's the difference between put it inside the query string or
>>>>>>>>>> outside? I think eventually it's adding the limit clause to the query.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <abaker@pivotal.io
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>> It’s fairly common in query tooling to be able to set a result
>>>>>>>>>>> set limit.  I would make this a first class option within gfsh instead of
>>>>>>>>>>> an environment variable.
>>>>>>>>>>>
>>>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>>>>
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>>>>
>>>>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>>>>> LIMIT on the OQL query itself.
>>>>>>>>>>>
>>>>>>>>>>> Anthony
>>>>>>>>>>>
>>>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be
>>>>>>>>>>> interesting to explore at least a couple output formats, csv being one of
>>>>>>>>>>> the most common for people that wants to import or analyze the data using
>>>>>>>>>>> other tools.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <
>>>>>>>>>>> mstolz@pivotal.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't have to
>>>>>>>>>>>> implement pagination.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Mike Stolz
>>>>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 and #2 to
>>>>>>>>>>>>> be sufficient for me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sarge
>>>>>>>>>>>>>
>>>>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > In our refactor week, we are trying to refactor how
>>>>>>>>>>>>> multi-step command is implemented. The currently implementation is hard to
>>>>>>>>>>>>> understand to begin with. The implementation breaks the OO design
>>>>>>>>>>>>> principals in multiple ways. It's not thread-safe either. This is an
>>>>>>>>>>>>> internal command type, and and only our "query" command uses it.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not
>>>>>>>>>>>>> configurable) rows,
>>>>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all
>>>>>>>>>>>>> the result at one.
>>>>>>>>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>>>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>>>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>>>>>>>>> of the result set), the command finishes.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > we would like to ask how useful is this interactive feature.
>>>>>>>>>>>>> Is it critical for you? Would the following simplification be sufficient?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 1) query command always returns the entire fetch size. We
>>>>>>>>>>>>> can make it configurable through environment variables, default to be 100,
>>>>>>>>>>>>> and you can also reset it in each individual query command using "query
>>>>>>>>>>>>> --query='select * from /A limit 10'
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 2) provide an option for you to specify a file where we can
>>>>>>>>>>>>> dump all the query result in and you can use shell pagination to list the
>>>>>>>>>>>>> content of the file.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > --
>>>>>>>>>>>>> > Cheers
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Jinmei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ~/William
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> Jinmei
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ~/William
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -John
>>>>>>> john.blum10101 (skype)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers
>>>>>
>>>>> Jinmei
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> Jinmei
>>>>
>>>
>>
>


-- 
Cheers

Jinmei

Re: refactor query command

Posted by Dan Smith <ds...@pivotal.io>.

(1) and (2) seem like good options to me.

One thing we should consider is not building features like pagination or
writing the output to a file specifically for one command. These should
probably be generic features of the shell that work with any command. For
example, I might also want to dump the output of a Lucene query to a file.

-Dan
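
To make the "generic feature of the shell" idea concrete, here is a minimal
sketch in Python. It is not gfsh code: `run_command`, the `sink` parameter and
the two handlers are hypothetical names used only for illustration. The point
is that the redirection lives in one wrapper, so any command gets it for free.

```python
from typing import Callable, Optional, TextIO


def run_command(handler: Callable[[], str], sink: Optional[TextIO] = None) -> str:
    """Run any command handler; optionally redirect its output to a text stream.

    `handler` is a hypothetical zero-argument callable that returns the text
    the shell would normally print. `sink` can be any writable text stream,
    for example an open file.
    """
    text = handler()
    if sink is None:
        # No redirection requested: hand the text back for display.
        return text
    sink.write(text)
    return f"Output written to sink ({len(text.splitlines())} lines)"


# Two stand-in handlers; real commands would talk to the cluster.
def run_query() -> str:
    return "value1\nvalue2\nvalue3\n"


def search_lucene() -> str:
    return "doc7\ndoc9\n"
```

With this shape, `run_command(search_lucene, sink=open_file)` works exactly
like `run_command(run_query, sink=open_file)`, and neither command needs to
know about files.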

On Wed, Jul 12, 2017 at 10:46 AM, Michael Stolz <ms...@pivotal.io> wrote:

> I'm fine with imposing limits on queries from within our own tooling, but
> we cannot impose arbitrary limits on queries that are performed by
> application code.
>
> That would be a silent breaking change to existing behavior at any
> customer who has large queries. There is no way to know by examining code
> or queries if the query is supposed to return 10,000 rows, so only by
> testing every query they have could they determine if the imposed limit
> breaks the intent of the query.
>
> Silent breaking changes to public APIs are not acceptable.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: +1-631-835-4771
>

Re: refactor query command

Posted by John Blum <jb...@pivotal.io>.

I would also say, if we add a feature like...

gfsh> set APP_FETCH_SIZE=1000

then it should apply only to my session, i.e. no other tooling is affected,
nor is the cluster affected globally.
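
A rough sketch of what "applies only to my session" could mean, written in
Python rather than as actual gfsh code. `ShellSession`, `set_variable` and
`fetch_size` are hypothetical names; the default of 100 is the value proposed
earlier in this thread.

```python
DEFAULT_FETCH_SIZE = 100  # default proposed in the thread, not a confirmed gfsh value


class ShellSession:
    """Holds variables for one shell session only (hypothetical illustration)."""

    def __init__(self) -> None:
        # Each session owns its own dictionary, so nothing is shared.
        self._variables = {}

    def set_variable(self, name: str, value: str) -> None:
        self._variables[name] = value

    def fetch_size(self) -> int:
        # Fall back to the default when this session never set the variable.
        return int(self._variables.get("APP_FETCH_SIZE", DEFAULT_FETCH_SIZE))


mine = ShellSession()
someone_else = ShellSession()
mine.set_variable("APP_FETCH_SIZE", "1000")
```

Here `mine.fetch_size()` is 1000 while `someone_else.fetch_size()` stays at
the default, because the setting never leaves the session object.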



On Wed, Jul 12, 2017 at 11:14 AM, John Blum <jb...@pivotal.io> wrote:

> Agreed!
>
> What I explained and meant before was... if a user wants to "limit" the
> results of the query, then they should do so "explicitly" by using the
> "LIMIT" OQL keyword within the query itself.  This is not difficult to do...
>
> gfsh> query --query="SELECT * FROM /Region WHERE ... LIMIT 1000"
>
>
> There should not be some superficial *Gfsh* query command option, like
> '--limit' or a System property (ugh! We really need to get away from this
> System property nonsense).
>
> Imagine for a moment the user wants to run...
>
> SELECT count(*) FROM /Region
>
> And we impose a default LIMIT of 1000, or 100.  Then what?
>
> No.  It is simple/common enough in any Query language (SQL alike) without
> adding arbitrary options to the `query` command in *Gfsh* to limit the
> results of a query if that is what the user wants. Besides, there is
> nothing preventing a user from circumventing the arbitrary/default limit
> anyway, by saying...
>
> SELECT * FROM /Region WHERE ... LIMIT 2147483647
>
> or...
>
> gfsh> query --query=".." --limit=2147483647
>
> I believe our users are smart and conscientious enough to know that
> 'SELECT * FROM /Region' (without a predicate) is not a smart query.
>
> However, I do think there is value in not streaming back the entire result
> set to a client "tool" (e.g. *Gfsh*, *Pulse*, etc).
>
> Actual GemFire cache client applications should not be affected by anything
> a tool does unless it is applied to the system itself as a whole (like
> Cluster Config).
>
> $0.02
>
> -j
>
>
>
>
> --
> -John
> john.blum10101 (skype)
>



-- 
-John
john.blum10101 (skype)

Re: refactor query command

Posted by John Blum <jb...@pivotal.io>.

Agreed!

What I explained and meant before was... if a user wants to "limit" the
results of the query, then they should do so "explicitly" by using the
"LIMIT" OQL keyword within the query itself.  This is not difficult to do...

gfsh> query --query="SELECT * FROM /Region WHERE ... LIMIT 1000"


There should not be some superficial *Gfsh* query command option, like
'--limit', or a System property (ugh! We really need to get away from this
System property nonsense).

Imagine for a moment the user wants to run...

SELECT count(*) FROM /Region

And we impose a default LIMIT of 1000, or 100.  Then what?

No.  Limiting the results of a query, if that is what the user wants, is
simple and common enough in any query language (SQL and the like) without
adding arbitrary options to the `query` command in *Gfsh*. Besides, there is
nothing preventing a user from circumventing the arbitrary/default limit
anyway, by saying...

SELECT * FROM /Region WHERE ... LIMIT 2147483647

or...

gfsh> query --query=".." --limit=2147483647

I believe our users are smart enough to know that 'SELECT
* FROM /Region' (without a predicate) is not a smart query.
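
To make the default-limit idea under discussion concrete, here is a minimal
sketch in Java. It is not how gfsh or the OQL engine is implemented: the class
`DefaultLimit` and its `apply` method are hypothetical names, and the check is
a naive text match on a trailing "LIMIT n" rather than real OQL parsing.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of gfsh or the Geode API. */
final class DefaultLimit {
    // Assumption: a query "has a limit" if it ends with "LIMIT <digits>" (case-insensitive).
    private static final Pattern TRAILING_LIMIT =
        Pattern.compile("\\blimit\\s+\\d+\\s*$", Pattern.CASE_INSENSITIVE);

    /** Appends " LIMIT defaultLimit" only when the query text has no trailing LIMIT clause. */
    static String apply(String oql, int defaultLimit) {
        String trimmed = oql.trim();
        if (TRAILING_LIMIT.matcher(trimmed).find()) {
            return trimmed; // an explicit limit from the user is left untouched
        }
        return trimmed + " LIMIT " + defaultLimit;
    }

    public static void main(String[] args) {
        // No limit in the query: the default is imposed.
        System.out.println(apply("SELECT * FROM /Region", 100));
        // Explicit, very large limit: passes through unchanged.
        System.out.println(apply("SELECT * FROM /Region LIMIT 2147483647", 100));
        // The aggregate query also gets the default appended by this naive rule.
        System.out.println(apply("SELECT count(*) FROM /Region", 100));
    }
}
```

Under these assumptions, the explicit `LIMIT 2147483647` is passed through
as-is, so the default is trivially circumvented, while the `count(*)` query is
rewritten even though the user never asked for a limit.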

However, I do think there is value in not streaming back the entire result
set to a client "tool" (e.g. *Gfsh*, *Pulse*, etc).

Actual GemFire cache client applications should not be affected by anything a
tool does unless it is applied to the system itself as a whole (like
Cluster Config).

$0.02

-j



On Wed, Jul 12, 2017 at 10:46 AM, Michael Stolz <ms...@pivotal.io> wrote:

> I'm fine with imposing limits on queries from within our own tooling, but
> we cannot impose arbitrary limits on queries that are performed by
> application code.
>
> That would be a silent breaking change to existing behavior at any
> customer who has large queries. There is no way to know by examining code
> or queries if the query is supposed to return 10,000 rows, so only by
> testing every query they have could they determine if the imposed limit
> breaks the intent of the query.
>
> Silent breaking changes to public APIs are not acceptable.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: +1-631-835-4771
>
> On Wed, Jul 12, 2017 at 1:29 PM, jiliao@pivotal.io <ji...@pivotal.io>
> wrote:
>
>> We would like to avoid letting users accidentally issue a query that
>> would yield a large result set, even if they are dumping the result into a
>> file, for performance reasons. If they want a large result set sent back by
>> gfsh, they have to do so consciously by adding a large limit to the query
>> themselves.
>>
>>
>>
>> -------- Original Message --------
>> Subject: Re: refactor query command
>> From: Swapnil Bawaskar
>> To: user@geode.apache.org
>> CC:
>>
>>
>> +1
>> One suggestion I would like to make is that if the user specifies that
>> the query results should go to a file, we should not apply the limit clause
>> on the server.
>>
>> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:
>>
>>> Basically, our reasoning is client-side pagination is not as useful as
>>> people would think, you can either get all the results dumped to the
>>> console, and use scroll bar to move back and forth, or dump it into a file,
>>> and uses whatever piping mechanism supported by your environment. The
>>> server side retrieves everything at once anyway and saves the entire result
>>> set in the backend. It's not like we are saving any server side work here.
>>>
>>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>>>
>>>> Currently the way it's implementing the client-side pagination is
>>>> convoluted and doubtfully useful. We are proposing to get rid of the
>>>> client-side pagination and only have the server side impose a limit (and
>>>> maybe implement pagination on the server side later on).
>>>>
>>>> The new behavior should look like this:
>>>>
>>>> gfsh> set  APP_FETCH_SIZE  50;
>>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>>>
>>>> Result : true
>>>> Limit  : 50
>>>> Rows   : 3
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> value2
>>>> value3
>>>>
>>>>
>>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>>>
>>>> Result : true
>>>> Limit  : 50
>>>> Rows   : 50
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value50
>>>>
>>>> gfsh> query --query="select * from /A limit 100"  // suppose entry size
>>>> is 1000
>>>> Result : true
>>>> Rows   : 100
>>>>
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value100
>>>>
>>>>
>>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"
>>>>  // suppose entry size is 1000
>>>> Result : true
>>>> Rows   : 500
>>>>
>>>> Query results output to /var/tempFolder/output.txt
>>>>
>>>> (And the output.txt content to be:
>>>> Result
>>>> --------
>>>> value1
>>>> ...
>>>> value500)
>>>>
>>>>
>>>> Bear in mind that we are trying to get rid of client side pagination,
>>>> so the --page-size or --limit option would not apply anymore. Only the
>>>> limit inside the query will be honored by the server side. If they query
>>>> does not have a limit clause, the server side will impose a limit (default
>>>> to 100). The limit can only be explicitly overridden if user chooses to do
>>>> so. So that user would not accidentally execute a query that would result
>>>> in a large result set.
>>>>
>>>> Would this be sufficient to replace the client-side pagination?
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <agingade@pivotal.io>
>>>> wrote:
>>>>
>>>>> To make it clear, gfsh could print the query it sent to server in the
>>>>> result summary (it shows if it got executed with the limit):
>>>>> Query     :
>>>>> Result     : true
>>>>> startCount : 0
>>>>> endCount   : 20
>>>>> Rows       : 1
>>>>>
>>>>> -Anil.
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>>>
>>>>>> I think it might be worth differentiating the result "LIMIT" (as
>>>>>> used in the OQL query statement like so... "SELECT * FROM /Region
>>>>>> WHERE ... LIMIT 1000")  from what is actually "streamed" back to
>>>>>> *Gfsh* as the default (e.g. 100).
>>>>>>
>>>>>> Clearly sending all the results back is quite expensive depending on
>>>>>> the number of results/LIMIT specified.  Therefore, whatever "--option"
>>>>>> is provided to the `query` command is a further reduction in what is
>>>>>> actually streamed back to the client (e.g. *Gfsh*) initially, sort
>>>>>> of like paging, therefore ... `gfsh> query --query="SELECT * FROM
>>>>>> /Region WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>>>>>
>>>>>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit
>>>>>> option would just be confusing to users.  LIMIT like sort (ORDER BY) can
>>>>>> only be effectively applied to the OQL as it determines what results the
>>>>>> query actually returns.
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>>>> agingade@pivotal.io> wrote:
>>>>>>
>>>>>>> >> Actually a really nice thing would be to put the pagination
>>>>>>> feature into the OQL engine where it belongs.
>>>>>>> +1 on this.
>>>>>>>
>>>>>>> >> if they query mode is interactive, it sends the first 20
>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>> page,
>>>>>>> >> once it hits the last page (showing all 1000 record or get to the
>>>>>>> end of the result set), the command finishes.
>>>>>>>
>>>>>>> We could provide one more option to end user to quit getting to next
>>>>>>> page and go-back to gfsh command for new commands (if its not there).
>>>>>>>
>>>>>>> I think providing multiple options to view large result set, is a
>>>>>>> nice feature from tooling perspective (interactive result batching, dumping
>>>>>>> into an external file, etc...)
>>>>>>>
>>>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>>>> set limit.
>>>>>>> Yes...many of the interactive query tools allow pagination/batching
>>>>>>> as part of the result display.
>>>>>>>
>>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>> We need to make sure that user can differentiate query commands from
>>>>>>> options provided by tool.
>>>>>>>
>>>>>>> -Anil.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>
>>>>>>>> The way I read this is: One is limiting on the server side, the
>>>>>>>> other is limiting the client side.  IOW within the query string is acting
>>>>>>>> on server side.
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> what if user wants to do:
>>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>>>
>>>>>>>>> What's the difference between put it inside the query string or
>>>>>>>>> outside? I think eventually it's adding the limit clause to the query.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> It’s fairly common in query tooling to be able to set a result
>>>>>>>>>> set limit.  I would make this a first class option within gfsh instead of
>>>>>>>>>> an environment variable.
>>>>>>>>>>
>>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>>>
>>>>>>>>>> or
>>>>>>>>>>
>>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>>>
>>>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>>>> LIMIT on the OQL query itself.
>>>>>>>>>>
>>>>>>>>>> Anthony
>>>>>>>>>>
>>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be
>>>>>>>>>> interesting to explore at least a couple output formats, csv being one of
>>>>>>>>>> the most common for people that wants to import or analyze the data using
>>>>>>>>>> other tools.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <mstolz@pivotal.io>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't have to
>>>>>>>>>>> implement pagination.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Mike Stolz
>>>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 and #2 to
>>>>>>>>>>>> be sufficient for me.
>>>>>>>>>>>>
>>>>>>>>>>>> Sarge
>>>>>>>>>>>>
>>>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>>>> >
>>>>>>>>>>>> > In our refactor week, we are trying to refactor how
>>>>>>>>>>>> multi-step command is implemented. The currently implementation is hard to
>>>>>>>>>>>> understand to begin with. The implementation breaks the OO design
>>>>>>>>>>>> principals in multiple ways. It's not thread-safe either. This is an
>>>>>>>>>>>> internal command type, and and only our "query" command uses it.
>>>>>>>>>>>> >
>>>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not
>>>>>>>>>>>> configurable) rows,
>>>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all
>>>>>>>>>>>> the result at one.
>>>>>>>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>>>>>>>> of the result set), the command finishes.
>>>>>>>>>>>> >
>>>>>>>>>>>> > we would like to ask how useful is this interactive feature.
>>>>>>>>>>>> Is it critical for you? Would the following simplification be sufficient?
>>>>>>>>>>>> >
>>>>>>>>>>>> > 1) query command always returns the entire fetch size. We can
>>>>>>>>>>>> make it configurable through environment variables, default to be 100, and
>>>>>>>>>>>> you can also reset it in each individual query command using "query
>>>>>>>>>>>> --query='select * from /A limit 10'
>>>>>>>>>>>> >
>>>>>>>>>>>> > 2) provide an option for you to specify a file where we can
>>>>>>>>>>>> dump all the query result in and you can use shell pagination to list the
>>>>>>>>>>>> content of the file.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > --
>>>>>>>>>>>> > Cheers
>>>>>>>>>>>> >
>>>>>>>>>>>> > Jinmei
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> ~/William
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> Jinmei
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ~/William
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -John
>>>>>> john.blum10101 (skype)
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> Jinmei
>>>>
>>>
>>>
>>>
>>> --
>>> Cheers
>>>
>>> Jinmei
>>>
>>
>


-- 
-John
john.blum10101 (skype)

Re: refactor query command

Posted by Michael Stolz <ms...@pivotal.io>.

I'm fine with imposing limits on queries from within our own tooling, but
we cannot impose arbitrary limits on queries that are performed by
application code.

That would be a silent breaking change to existing behavior for any customer
who has large queries. There is no way to know by examining code or queries
if the query is supposed to return 10,000 rows, so only by testing every
query they have could they determine if the imposed limit breaks the intent
of the query.

Silent breaking changes to public APIs are not acceptable.

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

On Wed, Jul 12, 2017 at 1:29 PM, jiliao@pivotal.io <ji...@pivotal.io>
wrote:

> We would like to avoid letting users accidentally issue a query that would
> yield a large result set, even if they are dumping the result into a file,
> for performance reasons. If they want a large result set sent back by gfsh,
> they have to do so consciously by adding a large limit to the query
> themselves.
>
>
>
> -------- Original Message --------
> Subject: Re: refactor query command
> From: Swapnil Bawaskar
> To: user@geode.apache.org
> CC:
>
>
> +1
> One suggestion I would like to make is that if the user specifies that the
> query results should go to a file, we should not apply the limit clause on
> the server.
>
> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:
>
>> Basically, our reasoning is client-side pagination is not as useful as
>> people would think, you can either get all the results dumped to the
>> console, and use scroll bar to move back and forth, or dump it into a file,
>> and uses whatever piping mechanism supported by your environment. The
>> server side retrieves everything at once anyway and saves the entire result
>> set in the backend. It's not like we are saving any server side work here.
>>
>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>>
>>> Currently the way it's implementing the client-side pagination is
>>> convoluted and doubtfully useful. We are proposing to get rid of the
>>> client-side pagination and only have the server side impose a limit (and
>>> maybe implement pagination on the server side later on).
>>>
>>> The new behavior should look like this:
>>>
>>> gfsh> set  APP_FETCH_SIZE  50;
>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>>
>>> Result : true
>>> Limit  : 50
>>> Rows   : 3
>>>
>>> Result
>>> --------
>>> value1
>>> value2
>>> value3
>>>
>>>
>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>>
>>> Result : true
>>> Limit  : 50
>>> Rows   : 50
>>>
>>> Result
>>> --------
>>> value1
>>> ...
>>> value50
>>>
>>> gfsh> query --query="select * from /A limit 100"  // suppose entry size
>>> is 1000
>>> Result : true
>>> Rows   : 100
>>>
>>> Result
>>> --------
>>> value1
>>> ...
>>> value100
>>>
>>>
>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"  //
>>> suppose entry size is 1000
>>> Result : true
>>> Rows   : 500
>>>
>>> Query results output to /var/tempFolder/output.txt
>>>
>>> (And the output.txt content to be:
>>> Result
>>> --------
>>> value1
>>> ...
>>> value500)
>>>
>>>
>>> Bear in mind that we are trying to get rid of client side pagination, so
>>> the --page-size or --limit option would not apply anymore. Only the limit
>>> inside the query will be honored by the server side. If they query does not
>>> have a limit clause, the server side will impose a limit (default to 100).
>>> The limit can only be explicitly overridden if user chooses to do so. So
>>> that user would not accidentally execute a query that would result in a
>>> large result set.
>>>
>>> Would this be sufficient to replace the client-side pagination?
>>>
>>>
>>>
>>>
>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <ag...@pivotal.io>
>>> wrote:
>>>
>>>> To make it clear, gfsh could print the query it sent to server in the
>>>> result summary (it shows if it got executed with the limit):
>>>> Query     :
>>>> Result     : true
>>>> startCount : 0
>>>> endCount   : 20
>>>> Rows       : 1
>>>>
>>>> -Anil.
>>>>
>>>>
>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>>
>>>>> I think it might be worth differentiating the result "LIMIT" (as used
>>>>> in the OQL query statement like so... "SELECT * FROM /Region WHERE
>>>>> ... LIMIT 1000")  from what is actually "streamed" back to *Gfsh* as
>>>>> the default (e.g. 100).
>>>>>
>>>>> Clearly sending all the results back is quite expensive depending on
>>>>> the number of results/LIMIT specified.  Therefore, whatever "--option"
>>>>> is provided to the `query` command is a further reduction in what is
>>>>> actually streamed back to the client (e.g. *Gfsh*) initially, sort of
>>>>> like paging, therefore ... `gfsh> query --query="SELECT * FROM
>>>>> /Region WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>>>>
>>>>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit
>>>>> option would just be confusing to users.  LIMIT like sort (ORDER BY) can
>>>>> only be effectively applied to the OQL as it determines what results the
>>>>> query actually returns.
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>>> agingade@pivotal.io> wrote:
>>>>>
>>>>>> >> Actually a really nice thing would be to put the pagination
>>>>>> feature into the OQL engine where it belongs.
>>>>>> +1 on this.
>>>>>>
>>>>>> >> if they query mode is interactive, it sends the first 20
>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>> page,
>>>>>> >> once it hits the last page (showing all 1000 record or get to the
>>>>>> end of the result set), the command finishes.
>>>>>>
>>>>>> We could provide one more option to end user to quit getting to next
>>>>>> page and go-back to gfsh command for new commands (if its not there).
>>>>>>
>>>>>> I think providing multiple options to view large result set, is a
>>>>>> nice feature from tooling perspective (interactive result batching, dumping
>>>>>> into an external file, etc...)
>>>>>>
>>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>>> set limit.
>>>>>> Yes...many of the interactive query tools allow pagination/batching
>>>>>> as part of the result display.
>>>>>>
>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>> We need to make sure that user can differentiate query commands from
>>>>>> options provided by tool.
>>>>>>
>>>>>> -Anil.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>>> william.markito@gmail.com> wrote:
>>>>>>
>>>>>>> The way I read this is: One is limiting on the server side, the
>>>>>>> other is limiting the client side.  IOW within the query string is acting
>>>>>>> on server side.
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> what if user wants to do:
>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>>
>>>>>>>> What's the difference between put it inside the query string or
>>>>>>>> outside? I think eventually it's adding the limit clause to the query.
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It’s fairly common in query tooling to be able to set a result set
>>>>>>>>> limit.  I would make this a first class option within gfsh instead of an
>>>>>>>>> environment variable.
>>>>>>>>>
>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>>
>>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>>> LIMIT on the OQL query itself.
>>>>>>>>>
>>>>>>>>> Anthony
>>>>>>>>>
>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be
>>>>>>>>> interesting to explore at least a couple output formats, csv being one of
>>>>>>>>> the most common for people that wants to import or analyze the data using
>>>>>>>>> other tools.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't have to
>>>>>>>>>> implement pagination.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Mike Stolz
>>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>>>>
>>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 and #2 to
>>>>>>>>>>> be sufficient for me.
>>>>>>>>>>>
>>>>>>>>>>> Sarge
>>>>>>>>>>>
>>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>>> >
>>>>>>>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>>>>>>>> command is implemented. The currently implementation is hard to understand
>>>>>>>>>>> to begin with. The implementation breaks the OO design principals in
>>>>>>>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>>>>>>>> type, and and only our "query" command uses it.
>>>>>>>>>>> >
>>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not
>>>>>>>>>>> configurable) rows,
>>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>>>>>>>> result at one.
>>>>>>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>>>>>>> of the result set), the command finishes.
>>>>>>>>>>> >
>>>>>>>>>>> > we would like to ask how useful is this interactive feature.
>>>>>>>>>>> Is it critical for you? Would the following simplification be sufficient?
>>>>>>>>>>> >
>>>>>>>>>>> > 1) query command always returns the entire fetch size. We can
>>>>>>>>>>> make it configurable through environment variables, default to be 100, and
>>>>>>>>>>> you can also reset it in each individual query command using "query
>>>>>>>>>>> --query='select * from /A limit 10'
>>>>>>>>>>> >
>>>>>>>>>>> > 2) provide an option for you to specify a file where we can
>>>>>>>>>>> dump all the query result in and you can use shell pagination to list the
>>>>>>>>>>> content of the file.
>>>>>>>>>>> >
>>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > --
>>>>>>>>>>> > Cheers
>>>>>>>>>>> >
>>>>>>>>>>> > Jinmei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ~/William
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> Jinmei
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ~/William
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -John
>>>>> john.blum10101 (skype)
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Cheers
>>>
>>> Jinmei
>>>
>>
>>
>>
>> --
>> Cheers
>>
>> Jinmei
>>
>

Re: refactor query command

Posted by Wayne Lund <wl...@pivotal.io>.

@Michael I disagree that this should only be a gfsh concern. I would like to see it work like TOP or LIMIT in SQL, so that we could have hope that UIs could reasonably support pagination, a feature that I think has been noticeably missing for years.
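
As a rough illustration of the kind of pagination a UI would need, here is a small Java sketch that slices an already-fetched result list into pages. It is only a sketch under stated assumptions: `PageSketch` and `page` are hypothetical names, nothing here is an existing Geode or OQL feature, and a real server-side implementation would avoid materializing the full result set first.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pagination helper for illustration; not an existing Geode API. */
final class PageSketch {
    /** Returns the rows of the zero-based page, or an empty list when the page is past the end. */
    static <T> List<T> page(List<T> results, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long arithmetic avoids int overflow
        if (from >= results.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(from + pageSize, results.size());
        return new ArrayList<>(results.subList((int) from, to));
    }

    public static void main(String[] args) {
        // Stand-in for a query result: 45 rows named value1..value45.
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 45; i++) {
            rows.add("value" + i);
        }
        System.out.println(page(rows, 0, 20).size()); // first page
        System.out.println(page(rows, 2, 20));        // last, partial page
        System.out.println(page(rows, 3, 20));        // past the end
    }
}
```

The page size of 20 mirrors the interactive page size mentioned earlier in the thread; a caller would keep requesting the next page index until an empty page comes back.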

Sent from my iPhone

> On Jul 12, 2017, at 9:31 AM, Swapnil Bawaskar <sb...@pivotal.io> wrote:
> 
> +1
> One suggestion I would like to make is that if the user specifies that the query results should go to a file, we should not apply the limit clause on the server.
> 
>> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:
>> Basically, our reasoning is client-side pagination is not as useful as people would think, you can either get all the results dumped to the console, and use scroll bar to move back and forth, or dump it into a file, and uses whatever piping mechanism supported by your environment. The server side retrieves everything at once anyway and saves the entire result set in the backend. It's not like we are saving any server side work here.
>> 
>>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>>> Currently the way it's implementing the client-side pagination is convoluted and doubtfully useful. We are proposing to get rid of the client-side pagination and only have the server side impose a limit (and maybe implement pagination on the server side later on).
>>> 
>>> The new behavior should look like this:
>>> 
>>> gfsh> set  APP_FETCH_SIZE  50;
>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>> 
>>> Result : true
>>> Limit  : 50
>>> Rows   : 3
>>> 
>>> Result
>>> --------
>>> value1
>>> value2
>>> value3
>>> 
>>> 
>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>> 
>>> Result : true
>>> Limit  : 50
>>> Rows   : 50
>>> 
>>> Result
>>> --------
>>> value1
>>> ...
>>> value50
>>> 
>>> gfsh> query --query="select * from /A limit 100"  // suppose entry size is 1000
>>> Result : true
>>> Rows   : 100
>>> 
>>> Result
>>> --------
>>> value1
>>> ...
>>> value100
>>> 
>>> 
>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"  // suppose entry size is 1000
>>> Result : true
>>> Rows   : 500
>>> 
>>> Query results output to /var/tempFolder/output.txt
>>> 
>>> (And the output.txt content to be: 
>>> Result
>>> --------
>>> value1
>>> ...
>>> value500)
>>> 
>>> 
>>> Bear in mind that we are trying to get rid of client side pagination, so the --page-size or --limit option would not apply anymore. Only the limit inside the query will be honored by the server side. If they query does not have a limit clause, the server side will impose a limit (default to 100). The limit can only be explicitly overridden if user chooses to do so. So that user would not accidentally execute a query that would result in a large result set.
>>> 
>>> Would this be sufficient to replace the client-side pagination?
>>> 
>>> 
>>> 
>>> 
>>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <ag...@pivotal.io> wrote:
>>>> To make it clear, gfsh could print the query it sent to server in the result summary (it shows if it got executed with the limit):
>>>> Query     :
>>>> Result     : true
>>>> startCount : 0
>>>> endCount   : 20
>>>> Rows       : 1
>>>> 
>>>> -Anil.
>>>> 
>>>> 
>>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>>> I think it might be worth differentiating the result "LIMIT" (as used in the OQL query statement like so... "SELECT * FROM /Region WHERE ... LIMIT 1000")  from what is actually "streamed" back to Gfsh as the default (e.g. 100).
>>>>> 
>>>>> Clearly sending all the results back is quite expensive depending on the number of results/LIMIT specified.  Therefore, whatever "--option" is provided to the `query` command is a further reduction in what is actually streamed back to the client (e.g. Gfsh) initially, sort of like paging, therefore ... `gfsh> query --query="SELECT * FROM /Region WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>>>> 
>>>>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit option would just be confusing to users.  LIMIT like sort (ORDER BY) can only be effectively applied to the OQL as it determines what results the query actually returns.
>>>>> 
>>>>> 
>>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <ag...@pivotal.io> wrote:
>>>>>> >> Actually a really nice thing would be to put the pagination feature into the OQL engine where it belongs. 
>>>>>> +1 on this.
>>>>>> 
>>>>>> >> if they query mode is interactive, it sends the first 20 (page-size, not configurable) records. and user uses "n" to go to the next page, 
>>>>>> >> once it hits the last page (showing all 1000 record or get to the end of the result set), the command finishes.
>>>>>> 
>>>>>> We could provide one more option to end user to quit getting to next page and go-back to gfsh command for new commands (if its not there).
>>>>>> 
>>>>>> I think providing multiple options to view large result set, is a nice feature from tooling perspective (interactive result batching, dumping into an external file, etc...)
>>>>>> 
>>>>>> >> It’s fairly common in query tooling to be able to set a result set limit. 
>>>>>> Yes...many of the interactive query tools allow pagination/batching as part of the result display.
>>>>>> 
>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>> We need to make sure that user can differentiate query commands from options provided by tool.
>>>>>> 
>>>>>> -Anil.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <wi...@gmail.com> wrote:
>>>>>>> The way I read this is: One is limiting on the server side, the other is limiting the client side.  IOW within the query string is acting on server side. 
>>>>>>> 
>>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>>>>> what if user wants to do:
>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>> 
>>>>>>>> What's the difference between put it inside the query string or outside? I think eventually it's adding the limit clause to the query.
>>>>>>>> 
>>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io> wrote:
>>>>>>>>> It’s fairly common in query tooling to be able to set a result set limit.  I would make this a first class option within gfsh instead of an environment variable.
>>>>>>>>> 
>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>> 
>>>>>>>>> or
>>>>>>>>> 
>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>> 
>>>>>>>>> The result set limit is semantically different from specifying a LIMIT on the OQL query itself.
>>>>>>>>> 
>>>>>>>>> Anthony
>>>>>>>>> 
>>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <wi...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be interesting to explore at least a couple output formats, csv being one of the most common for people that wants to import or analyze the data using other tools. 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io> wrote:
>>>>>>>>>>> Actually a really nice thing would be to put the pagination feature into the OQL engine where it belongs. Clients shouldn't have to implement pagination.
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Mike Stolz
>>>>>>>>>>> Principal Engineer, GemFire Product Manager 
>>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <md...@pivotal.io> wrote:
>>>>>>>>>>>> I prefer to redirect output to a file when there is any chance that the results might be huge. Thus I find the combination of #1 and #2 to be sufficient for me.
>>>>>>>>>>>> 
>>>>>>>>>>>> Sarge
>>>>>>>>>>>> 
>>>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>>>> >
>>>>>>>>>>>> > In our refactor week, we are trying to refactor how multi-step command is implemented. The currently implementation is hard to understand to begin with. The implementation breaks the OO design principals in multiple ways. It's not thread-safe either. This is an internal command type, and and only our "query" command uses it.
>>>>>>>>>>>> >
>>>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
>>>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all the result at one.
>>>>>>>>>>>> > 4) if they query mode is interactive, it sends the first 20 (page-size, not configurable) records. and user uses "n" to go to the next page, once it hits the last page (showing all 1000 record or get to the end of the result set), the command finishes.
>>>>>>>>>>>> >
>>>>>>>>>>>> > we would like to ask how useful is this interactive feature. Is it critical for you? Would the following simplification be sufficient?
>>>>>>>>>>>> >
>>>>>>>>>>>> > 1) query command always returns the entire fetch size. We can make it configurable through environment variables, default to be 100, and you can also reset it in each individual query command using "query --query='select * from /A limit 10'
>>>>>>>>>>>> >
>>>>>>>>>>>> > 2) provide an option for you to specify a file where we can dump all the query result in and you can use shell pagination to list the content of the file.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > --
>>>>>>>>>>>> > Cheers
>>>>>>>>>>>> >
>>>>>>>>>>>> > Jinmei
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> -- 
>>>>>>>>>> ~/William
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Cheers
>>>>>>>> 
>>>>>>>> Jinmei
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> ~/William
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> -John
>>>>> john.blum10101 (skype)
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Cheers
>>> 
>>> Jinmei
>> 
>> 
>> 
>> -- 
>> Cheers
>> 
>> Jinmei

Re: refactor query command

Posted by Swapnil Bawaskar <sb...@pivotal.io>.

+1
One suggestion I would like to make is that if the user specifies that the
query results should go to a file, we should not apply the limit clause on
the server.

On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <ji...@pivotal.io> wrote:

> Basically, our reasoning is that client-side pagination is not as useful as
> people might think. You can either get all the results dumped to the
> console and use the scroll bar to move back and forth, or dump them into a
> file and use whatever piping mechanism your environment supports. The
> server side retrieves everything at once anyway and saves the entire result
> set in the backend, so pagination does not save any server-side work.
>
> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <ji...@pivotal.io> wrote:
>
>> The way client-side pagination is currently implemented is convoluted and
>> of doubtful use. We are proposing to get rid of client-side pagination
>> and only have the server side impose a limit (and maybe implement
>> pagination on the server side later on).
>>
>> The new behavior should look like this:
>>
>> gfsh> set  APP_FETCH_SIZE  50;
>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>
>> Result : true
>> Limit  : 50
>> Rows   : 3
>>
>> Result
>> --------
>> value1
>> value2
>> value3
>>
>>
>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>
>> Result : true
>> Limit  : 50
>> Rows   : 50
>>
>> Result
>> --------
>> value1
>> ...
>> value50
>>
>> gfsh> query --query="select * from /A limit 100"  // suppose entry size is 1000
>> Result : true
>> Rows   : 100
>>
>> Result
>> --------
>> value1
>> ...
>> value100
>>
>>
>> gfsh> query --query="select * from /A limit 500" --file="output.txt"  // suppose entry size is 1000
>> Result : true
>> Rows   : 500
>>
>> Query results output to /var/tempFolder/output.txt
>>
>> (And the output.txt content to be:
>> Result
>> --------
>> value1
>> ...
>> value500)
>>
>>
>> Bear in mind that we are trying to get rid of client-side pagination, so
>> the --page-size or --limit option would not apply anymore. Only the limit
>> inside the query will be honored by the server side. If the query does not
>> have a limit clause, the server side will impose a limit (default 100).
>> The limit can only be overridden if the user explicitly chooses to do so,
>> so that the user does not accidentally execute a query that returns a
>> large result set.
>>
>> Would this be sufficient to replace the client-side pagination?
>>
>>
>>
>>
>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <ag...@pivotal.io>
>> wrote:
>>
>>> To make it clear, gfsh could print the query it sent to server in the
>>> result summary (it shows if it got executed with the limit):
>>> Query     :
>>> Result     : true
>>> startCount : 0
>>> endCount   : 20
>>> Rows       : 1
>>>
>>> -Anil.
>>>
>>>
>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>>>
>>>> I think it might be worth differentiating the result "LIMIT" (as used
>>>> in the OQL query statement like so... "SELECT * FROM /Region WHERE ...
>>>> LIMIT 1000")  from what is actually "streamed" back to *Gfsh* as the
>>>> default (e.g. 100).
>>>>
>>>> Clearly sending all the results back is quite expensive depending on
>>>> the number of results/LIMIT specified.  Therefore, whatever "--option"
>>>> is provided to the `query` command is a further reduction in what is
>>>> actually streamed back to the client (e.g. *Gfsh*) initially, sort of
>>>> like paging, therefore ... `gfsh> query --query="SELECT * FROM /Region
>>>> WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>>>
>>>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit
>>>> option would just be confusing to users.  LIMIT like sort (ORDER BY) can
>>>> only be effectively applied to the OQL as it determines what results the
>>>> query actually returns.
>>>>
>>>>
>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>> agingade@pivotal.io> wrote:
>>>>
>>>>> >> Actually a really nice thing would be to put the pagination feature
>>>>> into the OQL engine where it belongs.
>>>>> +1 on this.
>>>>>
>>>>> >> if they query mode is interactive, it sends the first 20
>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>> page,
>>>>> >> once it hits the last page (showing all 1000 record or get to the
>>>>> end of the result set), the command finishes.
>>>>>
>>>>> We could provide one more option for the end user to stop paging and go
>>>>> back to the gfsh prompt for new commands (if it's not there already).
>>>>>
>>>>> I think providing multiple options to view a large result set is a nice
>>>>> feature from a tooling perspective (interactive result batching, dumping
>>>>> into an external file, etc.).
>>>>>
>>>>> >> It’s fairly common in query tooling to be able to set a result set
>>>>> limit.
>>>>> Yes...many of the interactive query tools allow pagination/batching as
>>>>> part of the result display.
>>>>>
>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>> We need to make sure that user can differentiate query commands from
>>>>> options provided by tool.
>>>>>
>>>>> -Anil.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>> william.markito@gmail.com> wrote:
>>>>>
>>>>>> The way I read this is: One is limiting on the server side, the other
>>>>>> is limiting the client side.  IOW within the query string is acting on
>>>>>> server side.
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>>>> wrote:
>>>>>>
>>>>>>> What if the user wants to do:
>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>
>>>>>>> What's the difference between putting it inside the query string and
>>>>>>> putting it outside? I think eventually it's adding the limit clause to the query.
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> It’s fairly common in query tooling to be able to set a result set
>>>>>>>> limit.  I would make this a first class option within gfsh instead of an
>>>>>>>> environment variable.
>>>>>>>>
>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>
>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>> LIMIT on the OQL query itself.
>>>>>>>>
>>>>>>>> Anthony
>>>>>>>>
>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>> william.markito@gmail.com> wrote:
>>>>>>>>
>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be interesting
>>>>>>>> to explore at least a couple of output formats, CSV being one of the most
>>>>>>>> common for people who want to import or analyze the data using other
>>>>>>>> tools.
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't have to
>>>>>>>>> implement pagination.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Mike Stolz
>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>> Mobile: +1-631-835-4771 <(631)%20835-4771>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>>>
>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 and #2 to
>>>>>>>>>> be sufficient for me.
>>>>>>>>>>
>>>>>>>>>> Sarge
>>>>>>>>>>
>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io>
>>>>>>>>>> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>> >
>>>>>>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>>>>>>> commands are implemented. The current implementation is hard to understand
>>>>>>>>>> to begin with. It breaks OO design principles in
>>>>>>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>>>>>>> type, and only our "query" command uses it.
>>>>>>>>>> >
>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not
>>>>>>>>>> configurable) rows,
>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>>>>>>> results at once.
>>>>>>>>>> > 4) if the query mode is interactive, it sends the first 20
>>>>>>>>>> (page-size, not configurable) records, and the user uses "n" to go to the next
>>>>>>>>>> page. Once it hits the last page (showing all 1000 records or reaching the end
>>>>>>>>>> of the result set), the command finishes.
>>>>>>>>>> >
>>>>>>>>>> > We would like to ask how useful this interactive feature is. Is
>>>>>>>>>> it critical for you? Would the following simplification be sufficient?
>>>>>>>>>> >
>>>>>>>>>> > 1) query command always returns the entire fetch size. We can
>>>>>>>>>> make it configurable through environment variables, defaulting to 100, and
>>>>>>>>>> you can also override it in each individual query command using "query
>>>>>>>>>> --query='select * from /A limit 10'"
>>>>>>>>>> >
>>>>>>>>>> > 2) provide an option for you to specify a file where we can
>>>>>>>>>> dump all the query results, and you can use shell pagination to list the
>>>>>>>>>> content of the file.
>>>>>>>>>> >
>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > Cheers
>>>>>>>>>> >
>>>>>>>>>> > Jinmei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ~/William
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Cheers
>>>>>>>
>>>>>>> Jinmei
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ~/William
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -John
>>>> john.blum10101 (skype)
>>>>
>>>
>>>
>>
>>
>> --
>> Cheers
>>
>> Jinmei
>>
>
>
>
> --
> Cheers
>
> Jinmei
>

Re: refactor query command

Posted by Jinmei Liao <ji...@pivotal.io>.

Basically, our reasoning is that client-side pagination is not as useful as
people might think. You can either get all the results dumped to the
console and use the scroll bar to move back and forth, or dump them into a
file and use whatever piping mechanism your environment supports. The
server side retrieves everything at once anyway and saves the entire result
set in the backend, so pagination does not save any server-side work.

-- 
Cheers

Jinmei

Re: refactor query command

Posted by Michael Stolz <ms...@pivotal.io>.

The server side shouldn't impose a limit on normal client queries, only on
gfsh queries. For that reason, I'd rather see gfsh itself (or the admin API,
if that's how gfsh is performing queries) impose the limit by appending a
LIMIT clause to the query string if one isn't present. Imposing it after the
query completes means the query has to do all the work and build the whole
result set, and you only get back part of it. Imposing it inside the query
string allows the query to terminate on each member when the limit is reached
(unless there is an order-by).
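
A minimal sketch of that idea, written in Python for brevity (gfsh itself is
Java): add a limit to the query string only when the query does not already
end with one. The helper name with_default_limit and the regex check are
assumptions made for illustration, not gfsh code, and a real implementation
would need to parse the OQL rather than pattern-match it.

```python
import re

# Hypothetical pattern, not part of gfsh: a trailing "limit <n>" clause.
_TRAILING_LIMIT = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)


def with_default_limit(query: str, default_limit: int = 100) -> str:
    """Return the query unchanged if it already ends with a limit clause,
    otherwise append 'limit <default_limit>'."""
    stripped = query.strip()
    if _TRAILING_LIMIT.search(stripped):
        # The user supplied a limit: leave the query exactly as written.
        return stripped
    return f"{stripped} limit {default_limit}"


print(with_default_limit("select * from /A"))           # select * from /A limit 100
print(with_default_limit("select * from /A limit 10"))  # select * from /A limit 10
```

Because the limit travels inside the query string, each member can stop once
it has produced enough rows, instead of building the full result set first.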

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

Re: refactor query command

Posted by Jinmei Liao <ji...@pivotal.io>.

The way client-side pagination is currently implemented is convoluted and
of doubtful use. We are proposing to get rid of client-side pagination and
have only the server side impose a limit (and maybe implement pagination on
the server side later on).

The new behavior should look like this:

gfsh> set  APP_FETCH_SIZE  50;
gfsh> query --query="select * from /A"  // suppose entry size is 3

Result : true
Limit  : 50
Rows   : 3

Result
--------
value1
value2
value3


gfsh> query --query="select * from /A"  // suppose entry size is 1000

Result : true
Limit  : 50
Rows   : 50

Result
--------
value1
...
value50

gfsh> query --query="select * from /A limit 100"  // suppose entry size is 1000
Result : true
Rows   : 100

Result
--------
value1
...
value100


gfsh> query --query="select * from /A limit 500" --file="output.txt"  // suppose entry size is 1000
Result : true
Rows   : 500

Query results output to /var/tempFolder/output.txt

(And the output.txt content to be:
Result
--------
value1
...
value500)


Bear in mind that we are trying to get rid of client-side pagination, so
the --page-size or --limit option would no longer apply. Only the limit
inside the query will be honored by the server side. If the query does not
have a limit clause, the server side will impose a limit (default 100).
The limit is overridden only if the user explicitly chooses to do so, so
that a user does not accidentally execute a query that returns a large
result set.
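
As a rough sketch of the rule described above (not the actual gfsh or
server code), the default limit could be applied like this. The function
name, the regular expression, and the returned pair are hypothetical; only
the default of 100 and the "explicit limit wins" behavior come from the
proposal.

```python
import re

# Assumed default, taken from the proposal ("default to 100").
DEFAULT_FETCH_SIZE = 100

# Matches a trailing "limit <n>" clause, case-insensitively.
# This is a deliberate simplification of real OQL parsing.
_LIMIT_RE = re.compile(r"\blimit\s+(\d+)\s*$", re.IGNORECASE)


def apply_default_limit(query, fetch_size=DEFAULT_FETCH_SIZE):
    """Return (query_to_run, imposed_limit).

    imposed_limit is None when the query already carries its own
    LIMIT clause, in which case the query is left unchanged.
    """
    stripped = query.strip().rstrip(";").rstrip()
    if _LIMIT_RE.search(stripped):
        return stripped, None
    return f"{stripped} limit {fetch_size}", fetch_size
```

With a fetch size of 50, "select * from /A" would run as
"select * from /A limit 50", while "select * from /A limit 500" would run
unchanged, which matches the "Limit" line appearing only in the first two
sample outputs.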

Would this be sufficient to replace the client-side pagination?




On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <ag...@pivotal.io>
wrote:

> To make it clear, gfsh could print the query it sent to server in the
> result summary (it shows if it got executed with the limit):
> Query     :
> Result     : true
> startCount : 0
> endCount   : 20
> Rows       : 1
>
> -Anil.
>
>
> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:
>
>> I think it might be worth differentiating the result "LIMIT" (as used in
>> the OQL query statement like so... "SELECT * FROM /Region WHERE ...
>> LIMIT 1000")  from what is actually "streamed" back to *Gfsh* as the
>> default (e.g. 100).
>>
>> Clearly sending all the results back is quite expensive depending on the
>> number of results/LIMIT specified.  Therefore, whatever "--option" is
>> provided to the `query` command is a further reduction in what is
>> actually streamed back to the client (e.g. *Gfsh*) initially, sort of
>> like paging, therefore ... `gfsh> query --query="SELECT * FROM /Region
>> WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>>
>> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit option
>> would just be confusing to users.  LIMIT like sort (ORDER BY) can only be
>> effectively applied to the OQL as it determines what results the query
>> actually returns.
>>
>>
>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <ag...@pivotal.io>
>> wrote:
>>
>>> >> Actually a really nice thing would be to put the pagination feature
>>> into the OQL engine where it belongs.
>>> +1 on this.
>>>
>>> >> if they query mode is interactive, it sends the first 20 (page-size,
>>> not configurable) records. and user uses "n" to go to the next page,
>>> >> once it hits the last page (showing all 1000 record or get to the end
>>> of the result set), the command finishes.
>>>
>>> We could provide one more option to end user to quit getting to next
>>> page and go-back to gfsh command for new commands (if its not there).
>>>
>>> I think providing multiple options to view large result set, is a nice
>>> feature from tooling perspective (interactive result batching, dumping into
>>> an external file, etc...)
>>>
>>> >> It’s fairly common in query tooling to be able to set a result set
>>> limit.
>>> Yes...many of the interactive query tools allow pagination/batching as
>>> part of the result display.
>>>
>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>> We need to make sure that user can differentiate query commands from
>>> options provided by tool.
>>>
>>> -Anil.
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>> william.markito@gmail.com> wrote:
>>>
>>>> The way I read this is: One is limiting on the server side, the other
>>>> is limiting the client side.  IOW within the query string is acting on
>>>> server side.
>>>>
>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io>
>>>> wrote:
>>>>
>>>>> what if user wants to do:
>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>
>>>>> What's the difference between put it inside the query string or
>>>>> outside? I think eventually it's adding the limit clause to the query.
>>>>>
>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>>> wrote:
>>>>>
>>>>>> It’s fairly common in query tooling to be able to set a result set
>>>>>> limit.  I would make this a first class option within gfsh instead of an
>>>>>> environment variable.
>>>>>>
>>>>>> gfsh> set query-limit=1000
>>>>>>
>>>>>> or
>>>>>>
>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>
>>>>>> The result set limit is semantically different from specifying a
>>>>>> LIMIT on the OQL query itself.
>>>>>>
>>>>>> Anthony
>>>>>>
>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>> william.markito@gmail.com> wrote:
>>>>>>
>>>>>> +1 for the combination of 1 and 2 as well.  It would be interesting
>>>>>> to explore at least a couple output formats, csv being one of the most
>>>>>> common for people that wants to import or analyze the data using other
>>>>>> tools.
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>>>>> wrote:
>>>>>>
>>>>>>> Actually a really nice thing would be to put the pagination feature
>>>>>>> into the OQL engine where it belongs. Clients shouldn't have to implement
>>>>>>> pagination.
>>>>>>>
>>>>>>> --
>>>>>>> Mike Stolz
>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>> Mobile: +1-631-835-4771
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>> mdodge@pivotal.io> wrote:
>>>>>>>
>>>>>>>> I prefer to redirect output to a file when there is any chance that
>>>>>>>> the results might be huge. Thus I find the combination of #1 and #2 to be
>>>>>>>> sufficient for me.
>>>>>>>>
>>>>>>>> Sarge
>>>>>>>>
>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>>>>> >
>>>>>>>> > Hi, all gfsh-users,
>>>>>>>> >
>>>>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>>>>> command is implemented. The currently implementation is hard to understand
>>>>>>>> to begin with. The implementation breaks the OO design principals in
>>>>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>>>>> type, and and only our "query" command uses it.
>>>>>>>> >
>>>>>>>> > This is how our current "query" command works:
>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable)
>>>>>>>> rows,
>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>>>>> result at one.
>>>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>>>> of the result set), the command finishes.
>>>>>>>> >
>>>>>>>> > we would like to ask how useful is this interactive feature. Is
>>>>>>>> it critical for you? Would the following simplification be sufficient?
>>>>>>>> >
>>>>>>>> > 1) query command always returns the entire fetch size. We can
>>>>>>>> make it configurable through environment variables, default to be 100, and
>>>>>>>> you can also reset it in each individual query command using "query
>>>>>>>> --query='select * from /A limit 10'
>>>>>>>> >
>>>>>>>> > 2) provide an option for you to specify a file where we can dump
>>>>>>>> all the query result in and you can use shell pagination to list the
>>>>>>>> content of the file.
>>>>>>>> >
>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Cheers
>>>>>>>> >
>>>>>>>> > Jinmei
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ~/William
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers
>>>>>
>>>>> Jinmei
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ~/William
>>>>
>>>
>>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>
>


-- 
Cheers

Jinmei

Re: refactor query command

Posted by Anilkumar Gingade <ag...@pivotal.io>.

To make it clear, gfsh could print the query it sent to the server in the
result summary (this shows whether it was executed with the limit):
Query     :
Result     : true
startCount : 0
endCount   : 20
Rows       : 1

-Anil.


On Tue, Jul 11, 2017 at 12:48 PM, John Blum <jb...@pivotal.io> wrote:

> I think it might be worth differentiating the result "LIMIT" (as used in
> the OQL query statement like so... "SELECT * FROM /Region WHERE ... LIMIT
> 1000")  from what is actually "streamed" back to *Gfsh* as the default
> (e.g. 100).
>
> Clearly sending all the results back is quite expensive depending on the
> number of results/LIMIT specified.  Therefore, whatever "--option" is
> provided to the `query` command is a further reduction in what is
> actually streamed back to the client (e.g. *Gfsh*) initially, sort of
> like paging, therefore ... `gfsh> query --query="SELECT * FROM /Region
> WHERE ... LIMIT 1000" --page-size=25`... perhaps?
>
> Therefore, I think having 2 limits, as in OQL LIMIT and a --limit option
> would just be confusing to users.  LIMIT like sort (ORDER BY) can only be
> effectively applied to the OQL as it determines what results the query
> actually returns.
>
>
> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <ag...@pivotal.io>
> wrote:
>
>> >> Actually a really nice thing would be to put the pagination feature
>> into the OQL engine where it belongs.
>> +1 on this.
>>
>> >> if they query mode is interactive, it sends the first 20 (page-size,
>> not configurable) records. and user uses "n" to go to the next page,
>> >> once it hits the last page (showing all 1000 record or get to the end
>> of the result set), the command finishes.
>>
>> We could provide one more option to end user to quit getting to next page
>> and go-back to gfsh command for new commands (if its not there).
>>
>> I think providing multiple options to view large result set, is a nice
>> feature from tooling perspective (interactive result batching, dumping into
>> an external file, etc...)
>>
>> >> It’s fairly common in query tooling to be able to set a result set
>> limit.
>> Yes...many of the interactive query tools allow pagination/batching as
>> part of the result display.
>>
>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>> We need to make sure that user can differentiate query commands from
>> options provided by tool.
>>
>> -Anil.
>>
>>
>>
>>
>>
>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>> william.markito@gmail.com> wrote:
>>
>>> The way I read this is: One is limiting on the server side, the other is
>>> limiting the client side.  IOW within the query string is acting on server
>>> side.
>>>
>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io> wrote:
>>>
>>>> what if user wants to do:
>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>
>>>> What's the difference between put it inside the query string or
>>>> outside? I think eventually it's adding the limit clause to the query.
>>>>
>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>>> wrote:
>>>>
>>>>> It’s fairly common in query tooling to be able to set a result set
>>>>> limit.  I would make this a first class option within gfsh instead of an
>>>>> environment variable.
>>>>>
>>>>> gfsh> set query-limit=1000
>>>>>
>>>>> or
>>>>>
>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>
>>>>> The result set limit is semantically different from specifying a LIMIT
>>>>> on the OQL query itself.
>>>>>
>>>>> Anthony
>>>>>
>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>> william.markito@gmail.com> wrote:
>>>>>
>>>>> +1 for the combination of 1 and 2 as well.  It would be interesting to
>>>>> explore at least a couple output formats, csv being one of the most common
>>>>> for people that wants to import or analyze the data using other tools.
>>>>>
>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>>>> wrote:
>>>>>
>>>>>> Actually a really nice thing would be to put the pagination feature
>>>>>> into the OQL engine where it belongs. Clients shouldn't have to implement
>>>>>> pagination.
>>>>>>
>>>>>> --
>>>>>> Mike Stolz
>>>>>> Principal Engineer, GemFire Product Manager
>>>>>> Mobile: +1-631-835-4771
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>> mdodge@pivotal.io> wrote:
>>>>>>
>>>>>>> I prefer to redirect output to a file when there is any chance that
>>>>>>> the results might be huge. Thus I find the combination of #1 and #2 to be
>>>>>>> sufficient for me.
>>>>>>>
>>>>>>> Sarge
>>>>>>>
>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>>>> >
>>>>>>> > Hi, all gfsh-users,
>>>>>>> >
>>>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>>>> command is implemented. The currently implementation is hard to understand
>>>>>>> to begin with. The implementation breaks the OO design principals in
>>>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>>>> type, and and only our "query" command uses it.
>>>>>>> >
>>>>>>> > This is how our current "query" command works:
>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable)
>>>>>>> rows,
>>>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>>>> result at one.
>>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>>> of the result set), the command finishes.
>>>>>>> >
>>>>>>> > we would like to ask how useful is this interactive feature. Is it
>>>>>>> critical for you? Would the following simplification be sufficient?
>>>>>>> >
>>>>>>> > 1) query command always returns the entire fetch size. We can make
>>>>>>> it configurable through environment variables, default to be 100, and you
>>>>>>> can also reset it in each individual query command using "query
>>>>>>> --query='select * from /A limit 10'
>>>>>>> >
>>>>>>> > 2) provide an option for you to specify a file where we can dump
>>>>>>> all the query result in and you can use shell pagination to list the
>>>>>>> content of the file.
>>>>>>> >
>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > Cheers
>>>>>>> >
>>>>>>> > Jinmei
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ~/William
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> Jinmei
>>>>
>>>
>>>
>>>
>>> --
>>> ~/William
>>>
>>
>>
>
>
> --
> -John
> john.blum10101 (skype)
>

Re: refactor query command

Posted by John Blum <jb...@pivotal.io>.

I think it might be worth differentiating the result "LIMIT" (as used in
the OQL query statement, like so: "SELECT * FROM /Region WHERE ... LIMIT
1000") from what is actually "streamed" back to *Gfsh* by default
(e.g. 100).

Clearly, sending all the results back can be quite expensive, depending on
the number of results/LIMIT specified.  Whatever "--option" is provided to
the `query` command is therefore a further reduction in what is actually
streamed back to the client (e.g. *Gfsh*) initially, sort of like paging.
So perhaps: `gfsh> query --query="SELECT * FROM /Region WHERE ... LIMIT
1000" --page-size=25`?

I think having 2 limits, as in an OQL LIMIT and a --limit option, would
just be confusing to users.  LIMIT, like sort (ORDER BY), can only be
effectively applied in the OQL, as it determines what results the query
actually returns.
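
To illustrate the distinction, here is a minimal sketch of paging over an
already-limited result set. The LIMIT decides which rows exist at all, and
the page size only decides how many are handed to the client at a time. The
function name and parameters are hypothetical and this is not how gfsh is
implemented.

```python
from itertools import islice


def pages(rows, limit, page_size):
    """Yield lists of at most page_size rows, stopping after limit rows.

    rows      -- any iterable of query results
    limit     -- total cap, playing the role of the OQL LIMIT clause
    page_size -- how many rows are streamed back per page
    """
    if limit < 0 or page_size <= 0:
        raise ValueError("limit must be >= 0 and page_size must be > 0")
    bounded = islice(rows, limit)  # the LIMIT bounds the result set
    while True:
        page = list(islice(bounded, page_size))  # next page from what remains
        if not page:
            return
        yield page
```

For a region with 1000 entries, a LIMIT of 60 and a page size of 25 would
produce pages of 25, 25 and 10 rows.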


On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <ag...@pivotal.io>
wrote:

> >> Actually a really nice thing would be to put the pagination feature
> into the OQL engine where it belongs.
> +1 on this.
>
> >> if they query mode is interactive, it sends the first 20 (page-size,
> not configurable) records. and user uses "n" to go to the next page,
> >> once it hits the last page (showing all 1000 record or get to the end
> of the result set), the command finishes.
>
> We could provide one more option to end user to quit getting to next page
> and go-back to gfsh command for new commands (if its not there).
>
> I think providing multiple options to view large result set, is a nice
> feature from tooling perspective (interactive result batching, dumping into
> an external file, etc...)
>
> >> It’s fairly common in query tooling to be able to set a result set
> limit.
> Yes...many of the interactive query tools allow pagination/batching as
> part of the result display.
>
> >> gfsh> query --query='select * from /A limit 10' --limit=100
> We need to make sure that user can differentiate query commands from
> options provided by tool.
>
> -Anil.
>
>
>
>
>
> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
> william.markito@gmail.com> wrote:
>
>> The way I read this is: One is limiting on the server side, the other is
>> limiting the client side.  IOW within the query string is acting on server
>> side.
>>
>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io> wrote:
>>
>>> what if user wants to do:
>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>
>>> What's the difference between put it inside the query string or outside?
>>> I think eventually it's adding the limit clause to the query.
>>>
>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io>
>>> wrote:
>>>
>>>> It’s fairly common in query tooling to be able to set a result set
>>>> limit.  I would make this a first class option within gfsh instead of an
>>>> environment variable.
>>>>
>>>> gfsh> set query-limit=1000
>>>>
>>>> or
>>>>
>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>
>>>> The result set limit is semantically different from specifying a LIMIT
>>>> on the OQL query itself.
>>>>
>>>> Anthony
>>>>
>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>> william.markito@gmail.com> wrote:
>>>>
>>>> +1 for the combination of 1 and 2 as well.  It would be interesting to
>>>> explore at least a couple output formats, csv being one of the most common
>>>> for people that wants to import or analyze the data using other tools.
>>>>
>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>>> wrote:
>>>>
>>>>> Actually a really nice thing would be to put the pagination feature
>>>>> into the OQL engine where it belongs. Clients shouldn't have to implement
>>>>> pagination.
>>>>>
>>>>> --
>>>>> Mike Stolz
>>>>> Principal Engineer, GemFire Product Manager
>>>>> Mobile: +1-631-835-4771
>>>>>
>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>> mdodge@pivotal.io> wrote:
>>>>>
>>>>>> I prefer to redirect output to a file when there is any chance that
>>>>>> the results might be huge. Thus I find the combination of #1 and #2 to be
>>>>>> sufficient for me.
>>>>>>
>>>>>> Sarge
>>>>>>
>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>>> >
>>>>>> > Hi, all gfsh-users,
>>>>>> >
>>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>>> command is implemented. The currently implementation is hard to understand
>>>>>> to begin with. The implementation breaks the OO design principals in
>>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>>> type, and and only our "query" command uses it.
>>>>>> >
>>>>>> > This is how our current "query" command works:
>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable)
>>>>>> rows,
>>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>>> result at one.
>>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>>> of the result set), the command finishes.
>>>>>> >
>>>>>> > we would like to ask how useful is this interactive feature. Is it
>>>>>> critical for you? Would the following simplification be sufficient?
>>>>>> >
>>>>>> > 1) query command always returns the entire fetch size. We can make
>>>>>> it configurable through environment variables, default to be 100, and you
>>>>>> can also reset it in each individual query command using "query
>>>>>> --query='select * from /A limit 10'
>>>>>> >
>>>>>> > 2) provide an option for you to specify a file where we can dump
>>>>>> all the query result in and you can use shell pagination to list the
>>>>>> content of the file.
>>>>>> >
>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Cheers
>>>>>> >
>>>>>> > Jinmei
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ~/William
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Cheers
>>>
>>> Jinmei
>>>
>>
>>
>>
>> --
>> ~/William
>>
>
>


-- 
-John
john.blum10101 (skype)

Re: refactor query command

Posted by Anilkumar Gingade <ag...@pivotal.io>.

>> Actually a really nice thing would be to put the pagination feature into
the OQL engine where it belongs.
+1 on this.

>> if they query mode is interactive, it sends the first 20 (page-size, not
configurable) records. and user uses "n" to go to the next page,
>> once it hits the last page (showing all 1000 record or get to the end of
the result set), the command finishes.

We could provide one more option for the end user to quit paging and go
back to the gfsh prompt for new commands (if it's not there already).

I think providing multiple options to view a large result set is a nice
feature from a tooling perspective (interactive result batching, dumping
into an external file, etc.).

>> It’s fairly common in query tooling to be able to set a result set
limit.
Yes...many of the interactive query tools allow pagination/batching as part
of the result display.

>> gfsh> query --query='select * from /A limit 10' --limit=100
We need to make sure that the user can differentiate query commands from
options provided by the tool.

-Anil.





On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
william.markito@gmail.com> wrote:

> The way I read this is: One is limiting on the server side, the other is
> limiting the client side.  IOW within the query string is acting on server
> side.
>
> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io> wrote:
>
>> what if user wants to do:
>> gfsh> query --query='select * from /A limit 10' --limit=100
>>
>> What's the difference between put it inside the query string or outside?
>> I think eventually it's adding the limit clause to the query.
>>
>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io> wrote:
>>
>>> It’s fairly common in query tooling to be able to set a result set
>>> limit.  I would make this a first class option within gfsh instead of an
>>> environment variable.
>>>
>>> gfsh> set query-limit=1000
>>>
>>> or
>>>
>>> gfsh> query --query='select * from /A' --limit=1000
>>>
>>> The result set limit is semantically different from specifying a LIMIT
>>> on the OQL query itself.
>>>
>>> Anthony
>>>
>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>> william.markito@gmail.com> wrote:
>>>
>>> +1 for the combination of 1 and 2 as well.  It would be interesting to
>>> explore at least a couple output formats, csv being one of the most common
>>> for people that wants to import or analyze the data using other tools.
>>>
>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io>
>>> wrote:
>>>
>>>> Actually a really nice thing would be to put the pagination feature
>>>> into the OQL engine where it belongs. Clients shouldn't have to implement
>>>> pagination.
>>>>
>>>> --
>>>> Mike Stolz
>>>> Principal Engineer, GemFire Product Manager
>>>> Mobile: +1-631-835-4771
>>>>
>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>> mdodge@pivotal.io> wrote:
>>>>
>>>>> I prefer to redirect output to a file when there is any chance that
>>>>> the results might be huge. Thus I find the combination of #1 and #2 to be
>>>>> sufficient for me.
>>>>>
>>>>> Sarge
>>>>>
>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>>> >
>>>>> > Hi, all gfsh-users,
>>>>> >
>>>>> > In our refactor week, we are trying to refactor how multi-step
>>>>> command is implemented. The currently implementation is hard to understand
>>>>> to begin with. The implementation breaks the OO design principals in
>>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>>> type, and and only our "query" command uses it.
>>>>> >
>>>>> > This is how our current "query" command works:
>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable)
>>>>> rows,
>>>>> > 3) if the query mode is NOT interactive, it sends back all the
>>>>> result at one.
>>>>> > 4) if they query mode is interactive, it sends the first 20
>>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>>> of the result set), the command finishes.
>>>>> >
>>>>> > we would like to ask how useful is this interactive feature. Is it
>>>>> critical for you? Would the following simplification be sufficient?
>>>>> >
>>>>> > 1) query command always returns the entire fetch size. We can make
>>>>> it configurable through environment variables, default to be 100, and you
>>>>> can also reset it in each individual query command using "query
>>>>> --query='select * from /A limit 10'
>>>>> >
>>>>> > 2) provide an option for you to specify a file where we can dump all
>>>>> the query result in and you can use shell pagination to list the content of
>>>>> the file.
>>>>> >
>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Cheers
>>>>> >
>>>>> > Jinmei
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> ~/William
>>>
>>>
>>>
>>
>>
>> --
>> Cheers
>>
>> Jinmei
>>
>
>
>
> --
> ~/William
>

Re: refactor query command

Posted by William Markito Oliveira <wi...@gmail.com>.

The way I read this is: one is limiting on the server side, the other is
limiting on the client side.  IOW, the limit within the query string acts
on the server side.
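
A small sketch of that reading, assuming a hypothetical --limit option that
only truncates what the client displays, while the LIMIT inside the query
string caps what the server returns. The function name and parameters are
made up for illustration.

```python
def rows_transferred_and_shown(region_size, query_limit=None, client_limit=None):
    """Return (rows sent by the server, rows displayed by the client).

    query_limit  -- LIMIT inside the OQL string, applied on the server
    client_limit -- hypothetical --limit option, applied on the client
    """
    # Server side: the OQL LIMIT caps what leaves the server.
    transferred = region_size if query_limit is None else min(region_size, query_limit)
    # Client side: the option can only trim what has already arrived.
    shown = transferred if client_limit is None else min(transferred, client_limit)
    return transferred, shown
```

Under this reading, 'select * from /A limit 10' with --limit=100 on a
1000-entry region transfers and shows 10 rows, whereas the same query
without a LIMIT clause would transfer all 1000 rows and show only 100.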

On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <ji...@pivotal.io> wrote:

> what if user wants to do:
> gfsh> query --query='select * from /A limit 10' --limit=100
>
> What's the difference between put it inside the query string or outside? I
> think eventually it's adding the limit clause to the query.
>
> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io> wrote:
>
>> It’s fairly common in query tooling to be able to set a result set
>> limit.  I would make this a first class option within gfsh instead of an
>> environment variable.
>>
>> gfsh> set query-limit=1000
>>
>> or
>>
>> gfsh> query --query='select * from /A' --limit=1000
>>
>> The result set limit is semantically different from specifying a LIMIT on
>> the OQL query itself.
>>
>> Anthony
>>
>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>> william.markito@gmail.com> wrote:
>>
>> +1 for the combination of 1 and 2 as well.  It would be interesting to
>> explore at least a couple output formats, csv being one of the most common
>> for people that wants to import or analyze the data using other tools.
>>
>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io> wrote:
>>
>>> Actually a really nice thing would be to put the pagination feature into
>>> the OQL engine where it belongs. Clients shouldn't have to implement
>>> pagination.
>>>
>>> --
>>> Mike Stolz
>>> Principal Engineer, GemFire Product Manager
>>> Mobile: +1-631-835-4771
>>>
>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>> mdodge@pivotal.io> wrote:
>>>
>>>> I prefer to redirect output to a file when there is any chance that the
>>>> results might be huge. Thus I find the combination of #1 and #2 to be
>>>> sufficient for me.
>>>>
>>>> Sarge
>>>>
>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>>> >
>>>> > Hi, all gfsh-users,
>>>> >
>>>> > In our refactor week, we are trying to refactor how multi-step
>>>> command is implemented. The currently implementation is hard to understand
>>>> to begin with. The implementation breaks the OO design principals in
>>>> multiple ways. It's not thread-safe either. This is an internal command
>>>> type, and and only our "query" command uses it.
>>>> >
>>>> > This is how our current "query" command works:
>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>> > 2) server retrieves the first 1000 (fetch-size, not configurable)
>>>> rows,
>>>> > 3) if the query mode is NOT interactive, it sends back all the result
>>>> at one.
>>>> > 4) if they query mode is interactive, it sends the first 20
>>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>>> page, once it hits the last page (showing all 1000 record or get to the end
>>>> of the result set), the command finishes.
>>>> >
>>>> > we would like to ask how useful is this interactive feature. Is it
>>>> critical for you? Would the following simplification be sufficient?
>>>> >
>>>> > 1) query command always returns the entire fetch size. We can make it
>>>> configurable through environment variables, default to be 100, and you can
>>>> also reset it in each individual query command using "query --query='select
>>>> * from /A limit 10'
>>>> >
>>>> > 2) provide an option for you to specify a file where we can dump all
>>>> the query result in and you can use shell pagination to list the content of
>>>> the file.
>>>> >
>>>> > Please let us know your thoughts/comments. Thanks!
>>>> >
>>>> >
>>>> > --
>>>> > Cheers
>>>> >
>>>> > Jinmei
>>>>
>>>>
>>>
>>
>>
>> --
>> ~/William
>>
>>
>>
>
>
> --
> Cheers
>
> Jinmei
>



-- 
~/William

Re: refactor query command

Posted by Jinmei Liao <ji...@pivotal.io>.

What if a user wants to do:
gfsh> query --query='select * from /A limit 10' --limit=100

What's the difference between putting it inside the query string and
outside? I think eventually it just adds the limit clause to the query.

On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <ab...@pivotal.io> wrote:

> It’s fairly common in query tooling to be able to set a result set limit.
> I would make this a first class option within gfsh instead of an
> environment variable.
>
> gfsh> set query-limit=1000
>
> or
>
> gfsh> query --query='select * from /A' --limit=1000
>
> The result set limit is semantically different from specifying a LIMIT on
> the OQL query itself.
>
> Anthony
>
> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
> william.markito@gmail.com> wrote:
>
> +1 for the combination of 1 and 2 as well.  It would be interesting to
> explore at least a couple output formats, csv being one of the most common
> for people that wants to import or analyze the data using other tools.
>
> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io> wrote:
>
>> Actually a really nice thing would be to put the pagination feature into
>> the OQL engine where it belongs. Clients shouldn't have to implement
>> pagination.
>>
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Manager
>> Mobile: +1-631-835-4771
>>
>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>> mdodge@pivotal.io> wrote:
>>
>>> I prefer to redirect output to a file when there is any chance that the
>>> results might be huge. Thus I find the combination of #1 and #2 to be
>>> sufficient for me.
>>>
>>> Sarge
>>>
>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>>> >
>>> > Hi, all gfsh-users,
>>> >
>>> > In our refactor week, we are trying to refactor how multi-step command
>>> is implemented. The currently implementation is hard to understand to begin
>>> with. The implementation breaks the OO design principals in multiple ways.
>>> It's not thread-safe either. This is an internal command type, and and only
>>> our "query" command uses it.
>>> >
>>> > This is how our current "query" command works:
>>> > 1) user issues a "query --query='select * from /A'" command,
>>> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
>>> > 3) if the query mode is NOT interactive, it sends back all the result
>>> at one.
>>> > 4) if they query mode is interactive, it sends the first 20
>>> (page-size, not configurable) records. and user uses "n" to go to the next
>>> page, once it hits the last page (showing all 1000 record or get to the end
>>> of the result set), the command finishes.
>>> >
>>> > we would like to ask how useful is this interactive feature. Is it
>>> critical for you? Would the following simplification be sufficient?
>>> >
>>> > 1) query command always returns the entire fetch size. We can make it
>>> configurable through environment variables, default to be 100, and you can
>>> also reset it in each individual query command using "query --query='select
>>> * from /A limit 10'
>>> >
>>> > 2) provide an option for you to specify a file where we can dump all
>>> the query result in and you can use shell pagination to list the content of
>>> the file.
>>> >
>>> > Please let us know your thoughts/comments. Thanks!
>>> >
>>> >
>>> > --
>>> > Cheers
>>> >
>>> > Jinmei
>>>
>>>
>>
>
>
> --
> ~/William
>
>
>


-- 
Cheers

Jinmei

Re: refactor query command

Posted by Wayne Lund <wl...@pivotal.io>.

+1

Wayne Lund
Advisory Platform Architect
916.296.1893
wlund@pivotal.io

> On Jul 11, 2017, at 8:41 AM, Anthony Baker <ab...@pivotal.io> wrote:
> 
> It’s fairly common in query tooling to be able to set a result set limit.  I would make this a first class option within gfsh instead of an environment variable.
> 
> gfsh> set query-limit=1000
> 
> or
> 
> gfsh> query --query='select * from /A' --limit=1000
> 
> The result set limit is semantically different from specifying a LIMIT on the OQL query itself.
> 
> Anthony
> 
>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <william.markito@gmail.com> wrote:
>> 
>> +1 for the combination of 1 and 2 as well.  It would be interesting to explore at least a couple output formats, csv being one of the most common for people that wants to import or analyze the data using other tools. 
>> 
>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <mstolz@pivotal.io> wrote:
>> Actually a really nice thing would be to put the pagination feature into the OQL engine where it belongs. Clients shouldn't have to implement pagination.
>> 
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Manager 
>> Mobile: +1-631-835-4771
>> 
>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <mdodge@pivotal.io> wrote:
>> I prefer to redirect output to a file when there is any chance that the results might be huge. Thus I find the combination of #1 and #2 to be sufficient for me.
>> 
>> Sarge
>> 
>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <jiliao@pivotal.io> wrote:
>> >
>> > Hi, all gfsh-users,
>> >
>> > In our refactor week, we are trying to refactor how multi-step command is implemented. The currently implementation is hard to understand to begin with. The implementation breaks the OO design principals in multiple ways. It's not thread-safe either. This is an internal command type, and and only our "query" command uses it.
>> >
>> > This is how our current "query" command works:
>> > 1) user issues a "query --query='select * from /A'" command,
>> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
>> > 3) if the query mode is NOT interactive, it sends back all the result at one.
>> > 4) if they query mode is interactive, it sends the first 20 (page-size, not configurable) records. and user uses "n" to go to the next page, once it hits the last page (showing all 1000 record or get to the end of the result set), the command finishes.
>> >
>> > we would like to ask how useful is this interactive feature. Is it critical for you? Would the following simplification be sufficient?
>> >
>> > 1) query command always returns the entire fetch size. We can make it configurable through environment variables, default to be 100, and you can also reset it in each individual query command using "query --query='select * from /A limit 10'
>> >
>> > 2) provide an option for you to specify a file where we can dump all the query result in and you can use shell pagination to list the content of the file.
>> >
>> > Please let us know your thoughts/comments. Thanks!
>> >
>> >
>> > --
>> > Cheers
>> >
>> > Jinmei
>> 
>> 
>> 
>> 
>> 
>> -- 
>> ~/William
>

Re: refactor query command

Posted by Anthony Baker <ab...@pivotal.io>.

It’s fairly common in query tooling to be able to set a result set limit.  I would make this a first-class option within gfsh instead of an environment variable.

gfsh> set query-limit=1000

or

gfsh> query --query='select * from /A' --limit=1000

The result set limit is semantically different from specifying a LIMIT on the OQL query itself.

Anthony

> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <wi...@gmail.com> wrote:
> 
> +1 for the combination of 1 and 2 as well.  It would be interesting to explore at least a couple output formats, csv being one of the most common for people that wants to import or analyze the data using other tools. 
> 
> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <mstolz@pivotal.io> wrote:
> Actually a really nice thing would be to put the pagination feature into the OQL engine where it belongs. Clients shouldn't have to implement pagination.
> 
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager 
> Mobile: +1-631-835-4771
> 
> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <mdodge@pivotal.io> wrote:
> I prefer to redirect output to a file when there is any chance that the results might be huge. Thus I find the combination of #1 and #2 to be sufficient for me.
> 
> Sarge
> 
> > On 10 Jul, 2017, at 17:13, Jinmei Liao <jiliao@pivotal.io> wrote:
> >
> > Hi, all gfsh-users,
> >
> > In our refactor week, we are trying to refactor how multi-step command is implemented. The currently implementation is hard to understand to begin with. The implementation breaks the OO design principals in multiple ways. It's not thread-safe either. This is an internal command type, and and only our "query" command uses it.
> >
> > This is how our current "query" command works:
> > 1) user issues a "query --query='select * from /A'" command,
> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
> > 3) if the query mode is NOT interactive, it sends back all the result at one.
> > 4) if they query mode is interactive, it sends the first 20 (page-size, not configurable) records. and user uses "n" to go to the next page, once it hits the last page (showing all 1000 record or get to the end of the result set), the command finishes.
> >
> > we would like to ask how useful is this interactive feature. Is it critical for you? Would the following simplification be sufficient?
> >
> > 1) query command always returns the entire fetch size. We can make it configurable through environment variables, default to be 100, and you can also reset it in each individual query command using "query --query='select * from /A limit 10'
> >
> > 2) provide an option for you to specify a file where we can dump all the query result in and you can use shell pagination to list the content of the file.
> >
> > Please let us know your thoughts/comments. Thanks!
> >
> >
> > --
> > Cheers
> >
> > Jinmei
> 
> 
> 
> 
> 
> -- 
> ~/William

Re: refactor query command

Posted by William Markito Oliveira <wi...@gmail.com>.

+1 for the combination of 1 and 2 as well.  It would be interesting to
explore at least a couple of output formats, CSV being one of the most common
for people who want to import or analyze the data using other tools.
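
As a rough illustration of the CSV idea, the sketch below writes result rows
to a text stream with Python's standard csv module. The function name
write_rows_as_csv and the sample rows are made up for the example; how gfsh
would actually expose an output format or file option is not decided in this
thread.

```python
import csv
import io


def write_rows_as_csv(rows, out):
    """Write an iterable of dict rows to a text stream as CSV.

    Returns the number of data rows written. Column order follows the
    keys of the first row.
    """
    rows = list(rows)
    if not rows:
        return 0
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Sample data standing in for a query result set (illustrative only).
buffer = io.StringIO()
count = write_rows_as_csv([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], buffer)
print(buffer.getvalue())
```

Writing to a real file instead of the in-memory buffer would let the output
be paged or imported with ordinary shell tools.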

On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <ms...@pivotal.io> wrote:

> Actually a really nice thing would be to put the pagination feature into
> the OQL engine where it belongs. Clients shouldn't have to implement
> pagination.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: +1-631-835-4771
>
> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <mdodge@pivotal.io>
> wrote:
>
>> I prefer to redirect output to a file when there is any chance that the
>> results might be huge. Thus I find the combination of #1 and #2 to be
>> sufficient for me.
>>
>> Sarge
>>
>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
>> >
>> > Hi, all gfsh-users,
>> >
>> > In our refactor week, we are trying to refactor how multi-step command
>> is implemented. The currently implementation is hard to understand to begin
>> with. The implementation breaks the OO design principals in multiple ways.
>> It's not thread-safe either. This is an internal command type, and and only
>> our "query" command uses it.
>> >
>> > This is how our current "query" command works:
>> > 1) user issues a "query --query='select * from /A'" command,
>> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
>> > 3) if the query mode is NOT interactive, it sends back all the result
>> at one.
>> > 4) if they query mode is interactive, it sends the first 20 (page-size,
>> not configurable) records. and user uses "n" to go to the next page, once
>> it hits the last page (showing all 1000 record or get to the end of the
>> result set), the command finishes.
>> >
>> > we would like to ask how useful is this interactive feature. Is it
>> critical for you? Would the following simplification be sufficient?
>> >
>> > 1) query command always returns the entire fetch size. We can make it
>> configurable through environment variables, default to be 100, and you can
>> also reset it in each individual query command using "query --query='select
>> * from /A limit 10'
>> >
>> > 2) provide an option for you to specify a file where we can dump all
>> the query result in and you can use shell pagination to list the content of
>> the file.
>> >
>> > Please let us know your thoughts/comments. Thanks!
>> >
>> >
>> > --
>> > Cheers
>> >
>> > Jinmei
>>
>>
>


-- 
~/William

Re: refactor query command

Posted by Michael Stolz <ms...@pivotal.io>.

Actually a really nice thing would be to put the pagination feature into
the OQL engine where it belongs. Clients shouldn't have to implement
pagination.
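
For reference, the paging behavior described in the original post (fetch at
most 1000 rows, show 20 per page) can be sketched in a few lines. This is
only an illustration of the logic, wherever it ends up living; the names
pages, FETCH_SIZE and PAGE_SIZE are assumptions for the example and do not
come from the Geode code base.

```python
from itertools import islice

FETCH_SIZE = 1000  # fetch-size quoted in the original post
PAGE_SIZE = 20     # page-size quoted in the original post


def pages(results, fetch_size=FETCH_SIZE, page_size=PAGE_SIZE):
    """Yield lists of at most page_size rows from the first fetch_size results."""
    fetched = islice(iter(results), fetch_size)
    while True:
        page = list(islice(fetched, page_size))
        if not page:
            return  # end of the result set, or fetch_size reached
        yield page


# range(2500) stands in for a large result set.
all_pages = list(pages(range(2500)))
print(len(all_pages), all_pages[-1][-1])
```

Each call to next() on the generator corresponds to the user pressing "n" in
the interactive mode.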

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <md...@pivotal.io>
wrote:

> I prefer to redirect output to a file when there is any chance that the
> results might be huge. Thus I find the combination of #1 and #2 to be
> sufficient for me.
>
> Sarge
>
> > On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
> >
> > Hi, all gfsh-users,
> >
> > In our refactor week, we are trying to refactor how multi-step command
> is implemented. The currently implementation is hard to understand to begin
> with. The implementation breaks the OO design principals in multiple ways.
> It's not thread-safe either. This is an internal command type, and and only
> our "query" command uses it.
> >
> > This is how our current "query" command works:
> > 1) user issues a "query --query='select * from /A'" command,
> > 2) server retrieves the first 1000 (fetch-size, not configurable) rows,
> > 3) if the query mode is NOT interactive, it sends back all the result at
> one.
> > 4) if they query mode is interactive, it sends the first 20 (page-size,
> not configurable) records. and user uses "n" to go to the next page, once
> it hits the last page (showing all 1000 record or get to the end of the
> result set), the command finishes.
> >
> > we would like to ask how useful is this interactive feature. Is it
> critical for you? Would the following simplification be sufficient?
> >
> > 1) query command always returns the entire fetch size. We can make it
> configurable through environment variables, default to be 100, and you can
> also reset it in each individual query command using "query --query='select
> * from /A limit 10'
> >
> > 2) provide an option for you to specify a file where we can dump all the
> query result in and you can use shell pagination to list the content of the
> file.
> >
> > Please let us know your thoughts/comments. Thanks!
> >
> >
> > --
> > Cheers
> >
> > Jinmei
>
>

Re: refactor query command

Posted by Michael William Dodge <md...@pivotal.io>.

I prefer to redirect output to a file when there is any chance that the results might be huge. Thus I find the combination of #1 and #2 to be sufficient for me.

Sarge

> On 10 Jul, 2017, at 17:13, Jinmei Liao <ji...@pivotal.io> wrote:
> 
> Hi, all gfsh-users,
> 
> In our refactor week, we are trying to refactor how multi-step command is implemented. The currently implementation is hard to understand to begin with. The implementation breaks the OO design principals in multiple ways. It's not thread-safe either. This is an internal command type, and and only our "query" command uses it. 
> 
> This is how our current "query" command works:
> 1) user issues a "query --query='select * from /A'" command,
> 2) server retrieves the first 1000 (fetch-size, not configurable) rows, 
> 3) if the query mode is NOT interactive, it sends back all the result at one.
> 4) if they query mode is interactive, it sends the first 20 (page-size, not configurable) records. and user uses "n" to go to the next page, once it hits the last page (showing all 1000 record or get to the end of the result set), the command finishes.
> 
> we would like to ask how useful is this interactive feature. Is it critical for you? Would the following simplification be sufficient?
> 
> 1) query command always returns the entire fetch size. We can make it configurable through environment variables, default to be 100, and you can also reset it in each individual query command using "query --query='select * from /A limit 10'
> 
> 2) provide an option for you to specify a file where we can dump all the query result in and you can use shell pagination to list the content of the file.
> 
> Please let us know your thoughts/comments. Thanks!
> 
> 
> -- 
> Cheers
> 
> Jinmei