Posted to commits@bloodhound.apache.org by Apache Bloodhound <bl...@incubator.apache.org> on 2012/11/28 16:17:32 UTC

[Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Page "Proposals/BEP-0004/ResourceQuery" was added by andrej
Content:
-------8<------8<------8<------8<------8<------8<------8<------8<--------
= Resource Query component
[[PageOutline]]
== Introduction #introduction

This page describes the functionality of the Resource Query component. The Resource Query component is responsible for resource indexing and query execution. It is not responsible for presenting search results to the user. For an overview of the search and query solution, see [wiki:BEP-0004].

Usually the user will not access the Resource Query component directly but via UI frontends, e.g. the search page, a widget or a wiki macro. Consider the simple search workflow below:
 1. User searches for the string “bla status:closed” in the quick search box
 1. Quick search forwards the user to the search page with URL …?q=bla%20status:closed
 1. Search page calls the Resource Query component with the query “bla status:closed” and other parameters, e.g. fields, sort etc.
 1. Search page renders the query results in an appropriate way
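
The encoding in step 2 can be sketched with the Python 3 standard library. This is only an illustration: the `/search` path is a hypothetical placeholder, not a URL defined by this proposal.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

query = "bla status:closed"

# quote() percent-encodes the space; safe=":" keeps the colon readable.
encoded = quote(query, safe=":")

# urlencode() builds a full query string ('+' for spaces, ':' escaped).
url = "/search?" + urlencode({"q": query})  # hypothetical search page path

# The search page recovers the original string before calling the component.
decoded = parse_qs(urlsplit(url).query)["q"][0]
```

Either encoding decodes back to the same query string, so the Resource Query component never needs to see the URL form.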

Resource Query component will provide a !ResourceQuery.query method with the following parameters:
 * '''query''': query string, e.g. “bla status:closed”, or a parsed representation of the query. For more information see [#query_syntax Query syntax].
 * '''sort''': optional sorting
 * '''boost''': optional list of fields with boost values, e.g. {“id”: 1000, “subject”: 100, “description”: 10}. Used only for score-based sorting.
 * '''filters''': optional list of terms, for example {“type”: “wiki”}. Filters can usually be cached by the underlying search framework.
 * '''fields''': list of fields to return
 * optional paging fields: '''rows/start''' or '''page/pagesize'''
 * '''facets''': optional list of facet terms; each can be a field or an expression.
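
The parameter list above could translate into a signature along the following lines. This is a hypothetical stub, not the proposed implementation: it only normalizes and echoes its arguments, and the rule that the two paging styles are mutually exclusive is an assumption.

```python
class ResourceQuery:
    """Hypothetical stub: it only echoes normalized arguments."""

    def query(self, query, sort=None, boost=None, filters=None,
              fields=None, facets=None, start=None, rows=None,
              page=None, pagesize=None):
        # Assumption: callers pick one paging style, never both.
        uses_offset = start is not None or rows is not None
        uses_pages = page is not None or pagesize is not None
        if uses_offset and uses_pages:
            raise ValueError("use rows/start or page/pagesize, not both")
        return {
            "query": query,
            "sort": list(sort or ()),
            "boost": dict(boost or {}),
            "filters": dict(filters or {}),
            "fields": tuple(fields or ()),
            "facets": tuple(facets or ()),
            "paging": {"start": start, "rows": rows,
                       "page": page, "pagesize": pagesize},
        }


request = ResourceQuery().query(
    "bla status:closed",
    filters={"type": "wiki"},
    fields=("id", "title", "status"),
    boost={"id": 1000, "subject": 100, "description": 10},
)
```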

== Resource Query is not a report tool #notreport
As discussed on the dev mailing list, search and query serve a different purpose than reports. Resource Query is not intended to provide complex SQL-like expressions such as JOIN, UNION etc. Resource Query will search through a flattened resource representation. Query syntax should support issue tracker specifics such as search through attachments, related tickets etc.

== Query Syntax #query_syntax
Resource Query will accept [http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html Lucene-like] syntax familiar to users of Solr, [http://packages.python.org/Whoosh/querylang.html Whoosh], Haystack, [http://code.google.com/p/unladen-swallow/issues/searchtips Google Code] and [http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-GeneralSearchAttributes YouTrack] with additional functions/meta tags specific for Bloodhound/Trac e.g. related tickets, attachment etc.

Default Resource Query operator is AND.

Bloodhound should provide its own query parser in order to be independent of the underlying search platforms.

=== Issue-tracker specifics #tracker_specifics
Resource Query should be able to search through Bloodhound specific fields/functions:
 * comments
 * attachments
 * history
 * related resources with different relation types: linked, duplicated, blocked, child/parent etc.

Resource Query should support queries on changes between resource versions, similar to the WAS and CHANGED operators in JIRA (https://confluence.atlassian.com/display/JIRA/Advanced+Searching#AdvancedSearching-WAS, http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes)

Other functions or meta tags can be used in query. Meta tags can be marked with specific character e.g. “#” (similar to YouTrack special keywords - http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-ShortcutKeywords):
 * #me - current user
 * #my - assigned to me
 * #currentProject
 * #ticket, #wiki etc.
 * date and time helper functions e.g. 2weeksago, 1yearago etc. 

Indexing and query syntax must be easily extensible by plugins. Here is an incomplete list of other possible meta tags that can be provided by additional plugins:
 * #resolved/unresolved - status:(resolved OR closed)
 * version aggregation e.g. earliestUnreleasedVersion
 * #hasAttachment
 * code:xxx ... - contains code in wiki format
 * #duplicated
 * #closed = status:closed
 * #yesterday
 * ...
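
A minimal sketch of how plugin-provided meta tags might be expanded before parsing. The registry, the function names and the `owner` field used for `#my` are all assumptions made for illustration; the proposal does not define them.

```python
import re

# Hypothetical registry: plugins map a meta tag name to a query fragment.
META_TAGS = {}


def register_meta_tag(name, expansion):
    META_TAGS[name.lower()] = expansion


register_meta_tag("closed", "status:closed")
register_meta_tag("resolved", "status:(resolved OR closed)")
register_meta_tag("my", "owner:{user}")  # assumed field name


def expand_meta_tags(query, context=None):
    """Replace known #tags; leave unknown ones (e.g. #123) untouched."""
    context = context or {}

    def replace(match):
        expansion = META_TAGS.get(match.group(1).lower())
        if expansion is None:
            return match.group(0)
        # Placeholders such as {user} are filled from the request context.
        return expansion.format(**context)

    return re.sub(r"#(\w+)", replace, query)


expanded = expand_meta_tags("bla #closed")
```

Because unknown tags pass through unchanged, ticket references such as `#123` survive the expansion step.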

== Use cases #usecases
=== User uses free text search or query in quick search box #usecase_freesearch
User inputs text or a query string in the search box. The input can be passed directly as the query parameter, for example:
 * bla
 * open issue
 * bla “open issue”
 * bla status:open
 * status:open

=== Possibility to specify what fields to return #usecase_fields
Search page or widget must specify the fields parameter of the !ResourceQuery.query method.
{{{
#!python
resourceQuery.query(fields=("id" , "title", "status"),...)
}}}

=== Boolean operators and grouping #usecase_grouping
Resource Query must support AND, OR, NOT and grouping (default operator is AND). Query string may look like:
 * alpha AND NOT (beta OR gamma)
 * “render AND shading”  - expression is equal to “render shading”
 * title:x OR ( title:y AND message:z)

=== User can search using range expression #usecase_range
Query string parameter should support inclusive and exclusive range expression, for example:
 * date:[20050101 TO 20090715]
 * title:{Aida TO Carmen}
 * [0025 TO] 
 * {TO suffix}
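
The range forms above can be recognized with a small parser. This is a sketch under stated assumptions: the `Range` tuple and `parse_range` are hypothetical names, square brackets are taken as inclusive and curly braces as exclusive (as in Lucene), and a missing bound is treated as open-ended.

```python
import re
from typing import NamedTuple, Optional


class Range(NamedTuple):
    field: Optional[str]
    start: Optional[str]
    end: Optional[str]
    include_start: bool
    include_end: bool


_RANGE = re.compile(r"^(?:(\w+):)?([\[{])(.*)([\]}])$")


def parse_range(text):
    match = _RANGE.match(text.strip())
    if match is None:
        raise ValueError("not a range expression: %r" % text)
    field, opener, body, closer = match.groups()
    tokens = body.split()
    if "TO" not in tokens:
        raise ValueError("range needs a TO keyword: %r" % text)
    pivot = tokens.index("TO")
    # An empty side means the range is unbounded in that direction.
    start = " ".join(tokens[:pivot]) or None
    end = " ".join(tokens[pivot + 1:]) or None
    return Range(field, start, end, opener == "[", closer == "]")


dated = parse_range("date:[20050101 TO 20090715]")
```

Since the opening and closing brackets are checked independently, this sketch would also accept a mixed form such as `[a TO b}`.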

=== Facets support #usecase_facets
Query must support facets, e.g. Resources (Tickets (10), Wiki (20)), Status (Open (22), Closed (33)) etc. The facets parameter should be used for this purpose.

{{{
#!python
resourceQuery.query(facets=("type", "status"), ...)
}}}
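
To illustrate what facet counts mean, here is a minimal sketch that computes them over flattened resources in memory. A real search backend would compute these itself; `facet_counts` and the sample documents are purely illustrative.

```python
from collections import Counter


def facet_counts(documents, facets):
    """Count how many documents carry each value of each facet field."""
    return {
        field: Counter(doc[field] for doc in documents if field in doc)
        for field in facets
    }


docs = [
    {"type": "ticket", "status": "open"},
    {"type": "ticket", "status": "closed"},
    {"type": "wiki"},  # wiki pages have no status field
]
counts = facet_counts(docs, ("type", "status"))
```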

=== Flexible sorting #usecase_sorting
Default sort order of text-search should be based on score and change date. Search page can set the following parameters for !ResourceQuery.query method:
{{{
#!python
resourceQuery.query( sort = {"score":ASC, "change_date": DESC},
  boost = {"id" : 1000, "subject" : 100, "description": 10},...)
}}}
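
Sorting on several keys with mixed directions can be sketched as below. Note the assumption: the sort specification is passed as an ordered list of `(field, descending)` pairs rather than the dictionary shown above, because key priority matters. `sort_results` is a hypothetical helper, not part of the proposal.

```python
def sort_results(results, sort):
    """Apply (field, descending) pairs; later pairs break ties of earlier ones."""
    ordered = list(results)
    # Python's sort is stable, so sorting by the least significant key
    # first yields the desired multi-key ordering.
    for field, descending in reversed(sort):
        ordered.sort(key=lambda hit: hit[field], reverse=descending)
    return ordered


hits = [
    {"id": "a", "score": 2, "change_date": 1},
    {"id": "b", "score": 2, "change_date": 3},
    {"id": "c", "score": 5, "change_date": 0},
]
ranked = sort_results(hits, [("score", True), ("change_date", True)])
```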

=== Paging support #usecase_paging
Search page will present query results in pages. For this purpose, it should use the following parameters of the !ResourceQuery.query method.
{{{
#!python
resourceQuery.query(start = 100, rows=50, ...)
}}}
or
{{{
#!python
resourceQuery.query(pagesize=50, page=3, ...)
}}}
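
The two paging styles are interchangeable. A minimal sketch of the conversion, assuming page numbers are 1-based (the proposal does not say); `page_to_offset` is a hypothetical helper:

```python
def page_to_offset(page, pagesize):
    """Convert a 1-based page number to a (start, rows) pair."""
    if page < 1 or pagesize < 1:
        raise ValueError("page and pagesize must be positive")
    return (page - 1) * pagesize, pagesize


start, rows = page_to_offset(page=3, pagesize=50)
```

Under that assumption, `page=3, pagesize=50` selects the same window as `start=100, rows=50` in the first example.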

=== Related ticket use case #usecase_ralated
User queries tickets related to tickets that were reopened in the last 14 days. The query can be expressed with the following call:
{{{
#!python
resourceQuery.query(
  # 2weeksago matches the 14-day window described above
  query="changed.status_from:open changed_date:[2weeksago TO]",
  # trailing comma makes this a one-element tuple rather than a plain string
  facets=("parent_ticket",), ...
)
}}}

=== Search in comments #usecase_comments
{{{
#!python
resourceQuery.query(query="comment:bla", ...)
}}}

=== Search in attachments #usecase_comments
{{{
#!python
resourceQuery.query(query="attachment:bla", ...)
}}}

=== Show tickets that were commented on since yesterday #usecase_last_commented
{{{
#!python
resourceQuery.query(query="last_commented:[yesterday TO]", ...)
}}}

=== Show all resources in current project that link to a ticket #usecase_project_linked
{{{
#!python
resourceQuery.query(query="#currentProject AND linked:#123", ...)
}}}
-------8<------8<------8<------8<------8<------8<------8<------8<--------

--
Page URL: <https://issues.apache.org/bloodhound/wiki/Proposals/BEP-0004/ResourceQuery>
Apache Bloodhound <https://issues.apache.org/bloodhound/>
The Apache Bloodhound (incubating) issue tracker


Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Olemis Lang <ol...@gmail.com>.
On 12/4/12, Gary Martin <ga...@wandisco.com> wrote:
> Hi,
>

:)

[...]
>
> The addition of a greater variety of keywords (metatags) does seem a
> great idea.

+1

[...]
>
> The character we use to mark these is of no great consequence to me but
> there may be advantages to retaining consistency with that form. On
> which note we may as well add 'user' as a synonym for 'me' and 'my'.

we could consider

  - user:username
  - me
  - user:me
  - assigned:me , owner:me

... and a different separator char e.g. =

>
> The helper functions are perhaps where I got this impression. The syntax
> for these seems interesting as well. Is the suggestion there that we use
> something like
>
>     >>> querystring = '1weekago'
>     >>> m = re.search('(?P<n>\d+)weeks?ago',querystring)
>     >>> int(m.group('n')) if m is not None else 0
>     1
>
> to discover relatively simple numeric arguments? Is that considered to
> be better than a weeksago(N) style?
>

Like I said before, I was working on something like this a few weeks
ago for the TracGViz plugin. I've been implementing a parser for the Google
Visualization API Query Language in the gviz_ql branch [1]_ ... which is a
SQL dialect operating on flat data tables (i.e. columns + rows).
Something like that has been mentioned in BEP 4 btw.

Well , to the point ... In there there is a chance to use date ,
datetime and timeofday literals [2]_ . They seem to be more powerful
in the sense that expressions like this are possible

  - dateDiff(startDate, now()) < 30 (i.e. 1 month ago)
  - dateDiff(startDate, date "2012-12-05") < 90 (i.e. days
    covered by recent december IPMC status report)
  - startDate < date "2008-01-01" and endDate > "2012-01-01"

> I like the idea of including ranges and the ability to define open and
> closed intervals for ranges. If we are already able to make that
> distinction, I would probably also add the ability to mix open and
> closed syntax so you can include at one end of the range and exclude at
> the other. I am also considering whether we should get a bit closer to
> open and closed interval notation in sets if that is not too confusing
> for users.
>

of course , as a matter of legibility we could introduce timestamp
literals as well and make them look like

  - dateDiff(startDate, now()) < 1 month
  - dateDiff(startDate, date "2012-12-05") < 3 months
  - dateDiff(endDate , startDate) > 2 years

or other similar literals .

> Finally, for now at least, it may also be worth making sure we have
> operator precedence defined in that document for completeness.
>

+1
I'd also suggest that it'd be nice to structure the syntax using
clauses + operators , similar to SQL . That'd allow us to write a fast
one-way parser for search expressions (preferably an operator
precedence grammar ;) .

[...]

.. [1] GViz QL parser for TracGViz plugin
        (https://bitbucket.org/olemis/trac-gviz/src/61393f2d74e2/?at=gviz_ql)

.. [2] Query Language Reference (Version 0.7) #Literals
        (https://developers.google.com/chart/interactive/docs/querylanguage#Literals)

-- 
Regards,

Olemis.

Blog ES: http://simelo-es.blogspot.com/
Blog EN: http://simelo-en.blogspot.com/


Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Olemis Lang <ol...@gmail.com>.
On 12/4/12, Andrej Golcov <an...@digiverse.si> wrote:
>
[...]
>
>> I was wondering from the text of the proposal whether there was an
>> intention for these to be possible to use unmarked as well.
>> This might be better to treat as a future enhancement.
> Actually, I didn't intend to propose unmarked meta keywords. Date/time
> helper function should be part of syntax. Probably, I have to clarify
> the proposal regarding this subject.
> What I wanted to propose is to have pluggable query parser where
> plugins can add their own meta keywords and functions.
>

this is nice to have . Defining syntax in the form of clauses might
help with this since plugins could contribute starting keyword and the
parser will call it back to instantiate this clause on its behalf .

FWIW, this is to some extent possible in the TracGViz GViz QL parser. In
there each SQL clause is represented internally by an instance of a
sub-class of tracgviz.gvizql.GVizQLClauseHandler . The exact subclass
is selected by keyword (select, where , ...) . Such instances are
responsible for :

1. accepting the parameters parsed by the underlying
    (operator precedence parser & Pygments lexer)
    e.g. store number in offset class in an instance method
2. use that data wisely ... ;)
3. in theory it would be possible to allow for extensible syntax since
    * underlying parser is extensible
    * internal parsing table may be built incrementally by
      feeding grammar productions over and over .
3a. extensibility in (3) is possible as long as
      operator precedence grammar preconditions hold
3b. however (3) above won't be supported in there considering
      the fact that plugin has to be compatible with Google standard ,
      but , for a use case like search ... definitely possible ;)

... so in general , it's nice to have these kinds of extensibility
introduced by plugins .

However, how extensible should the search syntax be?

[...]

-- 
Regards,

Olemis.

Blog ES: http://simelo-es.blogspot.com/
Blog EN: http://simelo-en.blogspot.com/


Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Gary Martin <ga...@wandisco.com>.
On 05/12/12 13:43, Andrej Golcov wrote:
> Hi,
>
>> It does worry me that it might not be the best idea to support lucene as
>> well as existing syntax but if there are existing parsers for the lucene
>> style syntax it may be OK. Perhaps there is a way of using Whoosh to
>> provide parsing services for us.
> Do you mean using the Whoosh query parser to parse the Bloodhound Search query?
> That would bring us a lot of functionality with much less effort and fewer bugs.
> At first look, the Whoosh query syntax
> (http://packages.python.org/Whoosh/querylang.html) covers at least 90%
> of our requirements. According to the Whoosh code, it can also be
> extended: http://packages.python.org/Whoosh/parsing.html. If PyLucene is
> used, we will provide a query mapping from Whoosh to Lucene.
>
> There are a few drawbacks that I can see:
>   - we have to follow Whoosh syntax rather than invent our own, at least
> in the basics - maybe that is not a drawback :)

Given the qparser plugin system that you have already hinted at, it 
should not be so bad if we want to add syntax.

>   - the Bloodhound Search plugin will depend on Whoosh even if another search
> backend is used, e.g. PyLucene
>
> If we can live with this, I vote for starting the prototype with Whoosh
> syntax and parser since it will speed up feature delivery. Anyway, we
> can implement our own parser later.
>
> Regards, Andrej

Yes, I entirely agree with that approach. I would be very happy to see 
some early search improvements!

Cheers,
     Gary

Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Andrej Golcov <an...@digiverse.si>.
Hi,

> It does worry me that it might not be the best idea to support lucene as
> well as existing syntax but if there are existing parsers for the lucene
> style syntax it may be OK. Perhaps there is a way of using Whoosh to
> provide parsing services for us.
Do you mean using the Whoosh query parser to parse the Bloodhound Search query?
That would bring us a lot of functionality with much less effort and fewer bugs.
At first look, the Whoosh query syntax
(http://packages.python.org/Whoosh/querylang.html) covers at least 90%
of our requirements. According to the Whoosh code, it can also be
extended: http://packages.python.org/Whoosh/parsing.html. If PyLucene is
used, we will provide a query mapping from Whoosh to Lucene.

There are a few drawbacks that I can see:
 - we have to follow Whoosh syntax rather than invent our own, at least
in the basics - maybe that is not a drawback :)
 - the Bloodhound Search plugin will depend on Whoosh even if another search
backend is used, e.g. PyLucene

If we can live with this, I vote for starting the prototype with Whoosh
syntax and parser since it will speed up feature delivery. Anyway, we
can implement our own parser later.

Regards, Andrej

Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Gary Martin <ga...@wandisco.com>.
On 4 December 2012 16:49, Andrej Golcov <an...@digiverse.si> wrote:

> Hi,
>
> > The addition of a greater variety of keywords (metatags) does seem a
> great idea. It is perhaps worth noting that there is
> > already a certain notion of variable substitution in the current queries
> like this:
> >
> >    https://issues.apache.org/bloodhound/query?owner=$USER
> >
> > The character we use to mark these is of no great consequence to me but
> there may be advantages to retaining consistency
> > with that form. On which note we may as well add 'user' as a synonym for
> 'me' and 'my'. It may turn out that it is fine for these
> > to be case insensitive too.
> Agree, it would be good to retain consistency with existing query syntax.
>

OK, $ may also make a bit more sense as it is something that tends to work
without url encoding in address bars.. only a small advantage of course.

>
> > I was wondering from the text of the proposal whether there was an
> intention for these to be possible to use unmarked as well.
> > This might be better to treat as a future enhancement.
> Actually, I didn't intend to propose unmarked meta keywords. Date/time
> helper function should be part of syntax. Probably, I have to clarify
> the proposal regarding this subject.
> What I wanted to propose is to have pluggable query parser where
> plugins can add their own meta keywords and functions.
>
> > The helper functions are perhaps where I got this impression. The syntax
> for these seems interesting as well.
> > Is the suggestion there that we use something like
> >
> >    >>> querystring = '1weekago'
> >    >>> m = re.search('(?P<n>\d+)weeks?ago',querystring)
> >    >>> int(m.group('n')) if m is not None else 0
> >    1
> >
> > to discover relatively simple numeric arguments? Is that considered to
> be better than a weeksago(N) style?
> For consistency, I borrowed the syntax for date/time variables from
> TracQuery syntax [1].


Yeah, that makes sense. Shows how much I know!


> Personally, I don't have strong opinion which
> syntax is better but it should be simple for user to type it in search
> box. Another possible alternative that came to my mind is lucene
> syntax [2]
>

Ah, this explains quite a lot.  I see that the solr/lucene syntax has many
of the features that I wanted to advocate for the range syntax including
mixed open and closed intervals. I was thinking that it might be better to
use '()' for the open intervals which I believe is closer to set interval
syntax. This could give us syntax like:
    [a..b] : range with a and b included (closed interval)
    (a..b): a and b excluded (open interval)
    (a..b]: a excluded, b included
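
A minimal sketch of how the set-style notation above might be parsed, purely to illustrate the idea; `parse_interval` and its result keys are hypothetical names.

```python
import re

_INTERVAL = re.compile(r"^([\[(])\s*(.*?)\s*\.\.\s*(.*?)\s*([\])])$")


def parse_interval(text):
    """Parse '[a..b]', '(a..b)' or mixed forms into bounds and inclusivity."""
    match = _INTERVAL.match(text.strip())
    if match is None:
        raise ValueError("not an interval: %r" % text)
    opener, low, high, closer = match.groups()
    return {
        "low": low or None,
        "high": high or None,
        "include_low": opener == "[",    # '(' excludes the lower bound
        "include_high": closer == "]",   # ')' excludes the upper bound
    }


half_open = parse_interval("(a..b]")
```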

It would also be nice to allow for the definition of a set so that we could
say something like:
    status: in ['assigned', 'accepted', 'review']

Anyway, swapping {} for () is not the worst change in the world and
compatibility with lucene syntax might well be more appropriate in this
application.

It does worry me that it might not be the best idea to support lucene as
well as existing syntax but if there are existing parsers for the lucene
style syntax it may be OK. Perhaps there is a way of using Whoosh to
provide parsing services for us.

Cheers,
    Gary

Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Andrej Golcov <an...@digiverse.si>.
Hi,

> The addition of a greater variety of keywords (metatags) does seem a great idea. It is perhaps worth noting that there is
> already a certain notion of variable substitution in the current queries like this:
>
>    https://issues.apache.org/bloodhound/query?owner=$USER
>
> The character we use to mark these is of no great consequence to me but there may be advantages to retaining consistency
> with that form. On which note we may as well add 'user' as a synonym for 'me' and 'my'. It may turn out that it is fine for these
> to be case insensitive too.
Agree, it would be good to retain consistency with existing query syntax.

> I was wondering from the text of the proposal whether there was an intention for these to be possible to use unmarked as well.
> This might be better to treat as a future enhancement.
Actually, I didn't intend to propose unmarked meta keywords. Date/time
helper function should be part of syntax. Probably, I have to clarify
the proposal regarding this subject.
What I wanted to propose is to have pluggable query parser where
plugins can add their own meta keywords and functions.

> The helper functions are perhaps where I got this impression. The syntax for these seems interesting as well.
> Is the suggestion there that we use something like
>
>    >>> querystring = '1weekago'
>    >>> m = re.search('(?P<n>\d+)weeks?ago',querystring)
>    >>> int(m.group('n')) if m is not None else 0
>    1
>
> to discover relatively simple numeric arguments? Is that considered to be better than a weeksago(N) style?
For consistency, I borrowed the syntax for date/time variables from
TracQuery syntax [1].  Personally, I don't have strong opinion which
syntax is better but it should be simple for user to type it in search
box. Another possible alternative that came to my mind is lucene
syntax [2]

> Finally, for now at least, it may also be worth making sure we have operator precedence defined in that document for completeness.
Good point, will cover this subject in proposal.

[1] https://issues.apache.org/bloodhound/wiki/TracQuery#QueryLanguage
[2]http://lucidworks.lucidimagination.com/display/lweug/Solr+Date+Format

Regards, Andrej

Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Gary Martin <ga...@wandisco.com>.
Hi,

Sorry for my delay in responding to this. I find I have to keep saying 
that I am very happy with the way that search is shaping up. I may make 
a couple of minor corrections to the text shortly but I'd like to make a 
few comments and ask a few questions.

The addition of a greater variety of keywords (metatags) does seem a 
great idea. It is perhaps worth noting that there is already a certain 
notion of variable substitution in the current queries like this:

    https://issues.apache.org/bloodhound/query?owner=$USER

The character we use to mark these is of no great consequence to me but 
there may be advantages to retaining consistency with that form. On 
which note we may as well add 'user' as a synonym for 'me' and 'my'. It 
may turn out that it is fine for these to be case insensitive too.

I was wondering from the text of the proposal whether there was an 
intention for these to be possible to use unmarked as well. This might 
be better to treat as a future enhancement.

The helper functions are perhaps where I got this impression. The syntax 
for these seems interesting as well. Is the suggestion there that we use 
something like

    >>> querystring = '1weekago'
    >>> m = re.search('(?P<n>\d+)weeks?ago',querystring)
    >>> int(m.group('n')) if m is not None else 0
    1

to discover relatively simple numeric arguments? Is that considered to 
be better than a weeksago(N) style?
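
To make the question concrete, the regex idea could be generalized to several units along these lines. This is only a sketch: the unit table, the approximation of months and years as fixed numbers of days, and the name `parse_relative_date` are all assumptions.

```python
import re
from datetime import datetime, timedelta

# Assumption: months and years are approximated as fixed numbers of days.
_UNITS = {"day": 1, "week": 7, "month": 30, "year": 365}
_RELATIVE = re.compile(r"^(?P<n>\d+)(?P<unit>day|week|month|year)s?ago$")


def parse_relative_date(token, now=None):
    """Return the datetime for tokens like '2weeksago', or None."""
    match = _RELATIVE.match(token)
    if match is None:
        return None
    now = now if now is not None else datetime.now()
    days = int(match.group("n")) * _UNITS[match.group("unit")]
    return now - timedelta(days=days)


reference = datetime(2012, 12, 5)
one_week_back = parse_relative_date("1weekago", now=reference)
```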

I like the idea of including ranges and the ability to define open and 
closed intervals for ranges. If we are already able to make that 
distinction, I would probably also add the ability to mix open and 
closed syntax so you can include at one end of the range and exclude at 
the other. I am also considering whether we should get a bit closer to 
open and closed interval notation in sets if that is not too confusing 
for users.

Finally, for now at least, it may also be worth making sure we have 
operator precedence defined in that document for completeness.

Anyway, this is all looking very good.

Cheers,
     Gary


On 29/11/12 00:35, Olemis Lang wrote:
> On 11/28/12, Apache Bloodhound <bl...@incubator.apache.org> wrote:
>> Page "Proposals/BEP-0004/ResourceQuery" was added by andrej
>> Content:
>> -------8<------8<------8<------8<------8<------8<------8<------8<--------
>>
> [...]
>> Resource Query component will provide a !ResourceQuery.query method with the
>> following parameters:
>>   * '''query''': query string e.g. “bla status:closed” or a parsed
>> representation of the query . For more information see [#query_syntax Query
>> syntax].
>>   * '''sort''': optional sorting
>>   * '''boost''': optional list of fields with boost values e.g. {“id”: 1000,
>> “subject” :100, “description”:10}. Used only for score based sorting.
>>   * '''filters''': optional list of terms. Usually can be cached by
>> underlying search framework. For example {“type”: “wiki”}
>>   * '''fields''': list of fields to return
>>   * optional paging fields: '''rows/start''' or '''page/pagesize''' fields
>>   * '''facets''' - optional list of facet terms, can be field or expression.
>>
>> == Resource Query is not a report tool #notreport
>> As discussed on the dev mailing list, search and query serve a different
>> purpose than reports. Resource Query is not intended to provide complex
>> SQL-like expressions such as JOIN, UNION etc. Resource Query will search through
>> a flattened resource representation. Query syntax should support issue tracker
>> specifics such as search through attachments, related tickets etc.
>>
> [...]
>> Other functions or meta tags can be used in query. Meta tags can be marked
>> with specific character e.g. “#” (similar to YouTrack special keywords -
>> http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-ShortcutKeywords):
>>   * #me - current user
>>   * #my - assigned to me
>>   * #currentProject
>>   * #ticket, #wiki etc.
>>   * date and time helper functions e.g. 2weeksago, 1yearago etc.
>>
> [...]
>
> This is interesting because it seems to me that these subjects I kept
> from original message overlap with a work I'm trying to finish right
> now ... so, when I have a result I'll follow with two major concrete
> suggestions .
> ;)
>


Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Olemis Lang <ol...@gmail.com>.
On 11/28/12, Apache Bloodhound <bl...@incubator.apache.org> wrote:
> Page "Proposals/BEP-0004/ResourceQuery" was added by andrej
> Content:
> -------8<------8<------8<------8<------8<------8<------8<------8<--------
>
[...]
>
> Resource Query component will provide a !ResourceQuery.query method with the
> following parameters:
>  * '''query''': query string e.g. “bla status:closed” or a parsed
> representation of the query . For more information see [#query_syntax Query
> syntax].
>  * '''sort''': optional sorting
>  * '''boost''': optional list of fields with boost values e.g. {“id”: 1000,
> “subject” :100, “description”:10}. Used only for score based sorting.
>  * '''filters''': optional list of terms. Usually can be cached by
> underlying search framework. For example {“type”: “wiki”}
>  * '''fields''': list of fields to return
>  * optional paging fields: '''rows/start''' or '''page/pagesize''' fields
>  * '''facets''' - optional list of facet terms, can be field or expression.
>
> == Resource Query is not a report tool #notreport
> As discussed on the dev mailing list, search and query serve a different
> purpose than reports. Resource Query is not intended to provide complex
> SQL-like expressions such as JOIN, UNION etc. Resource Query will search through
> a flattened resource representation. Query syntax should support issue tracker
> specifics such as search through attachments, related tickets etc.
>
[...]
>
> Other functions or meta tags can be used in query. Meta tags can be marked
> with specific character e.g. “#” (similar to YouTrack special keywords -
> http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-ShortcutKeywords):
>  * #me - current user
>  * #my - assigned to me
>  * #currentProject
>  * #ticket, #wiki etc.
>  * date and time helper functions e.g. 2weeksago, 1yearago etc.
>
[...]

This is interesting because it seems to me that these subjects I kept
from original message overlap with a work I'm trying to finish right
now ... so, when I have a result I'll follow with two major concrete
suggestions .
;)

-- 
Regards,

Olemis.

Blog ES: http://simelo-es.blogspot.com/
Blog EN: http://simelo-en.blogspot.com/


Re: [Apache Bloodhound] Proposals/BEP-0004/ResourceQuery added

Posted by Andrej Golcov <an...@digiverse.si>.
Hi all,

I moved search query requirements, query syntax and use cases into separate
page:
https://issues.apache.org/bloodhound/wiki/Proposals/BEP-0004/ResourceQuery

It is just first draft, so let's discuss how new search query should look
like and what features we want from it.

Regards, Andrej

On 28 November 2012 16:17, Apache Bloodhound <
bloodhound-dev@incubator.apache.org> wrote:

> Page "Proposals/BEP-0004/ResourceQuery" was added by andrej
> Content:
> -------8<------8<------8<------8<------8<------8<------8<------8<--------
> = Resource Query component
> [[PageOutline]]
> == Introduction #introduction
>
> This page describes functionality of Resource Query component. Resource
> Query component is responsible for resource indexing and query execution.
> It is not responsible for representation of search results to user. For
> overview of search and query solution see [wiki:BEP-0004].
>
> Usually user will not access to Resource Query component directly but via
> UI frontends e.g. search page, widget or wiki macro. Consider below a
> simple search workflow:
>  1. User searches for “bla status:closed” string in quick search box
>  1. Quick search forwards the user to the search page with URL
> …?q=bla%20status:closed
>  1. Search page calls the Resource Query component with the query
> “bla status:closed” and other parameters, e.g. fields,
> sort etc.
>  1. Search page renders query results in appropriate way
>
> Resource Query component will provide a !ResourceQuery.query method with
> the following parameters:
>  * '''query''': query string e.g. “bla status:closed” or a parsed
> representation of the query . For more information see [#query_syntax Query
> syntax].
>  * '''sort''': optional sorting
>  * '''boost''': optional list of fields with boost values e.g. {“id”:
> 1000, “subject” :100, “description”:10}. Used only for score based sorting.
>  * '''filters''': optional list of terms. Usually can be cached by
> underlying search framework. For example {“type”: “wiki”}
>  * '''fields''': list of fields to return
>  * optional paging fields: '''rows/start''' or '''page/pagesize''' fields
>  * '''facets''' - optional list of facet terms, can be field or expression.
>
> == Resource Query is not a report tool #notreport
> As discussed on the dev mailing list, search and query serve a
> different purpose than reports. Resource Query is not intended to provide
> complex SQL-like expressions such as JOIN, UNION etc. Resource Query will
> search through a flattened resource representation. Query syntax should
> support issue tracker specifics such as search through attachments, related
> tickets etc.
>
> == Query Syntax #query_syntax
> Resource Query will accept
> [http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html Lucene-like]
> syntax familiar to users of Solr,
> [http://packages.python.org/Whoosh/querylang.html Whoosh], Haystack,
> [http://code.google.com/p/unladen-swallow/issues/searchtips Google Code] and
> [http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-GeneralSearchAttributes YouTrack]
> with additional functions/meta tags specific for Bloodhound/Trac
> e.g. related tickets, attachment etc.
>
> Default Resource Query operator is AND.
>
> Bloodhound should provide its own query parser in order to be independent
> of the underlying search platforms.
>
> === Issue-tracker specifics #tracker_specifics
> Resource Query should be able to search through Bloodhound-specific
> fields/functions:
>  * comments
>  * attachments
>  * history
>  * related resources with different relation types: linked, duplicated,
> blocked, child/parent etc.
>
> Resource Query should support queries on value changes, similar to the WAS
> and CHANGED operators in JIRA (
> https://confluence.atlassian.com/display/JIRA/Advanced+Searching#AdvancedSearching-WAS,
> http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes
> )
>
> Other functions or meta tags can be used in a query. Meta tags can be marked
> with a specific character e.g. “#” (similar to YouTrack special keywords -
> http://confluence.jetbrains.com/display/YTD4/Search+and+Command+Attributes#SearchandCommandAttributes-ShortcutKeywords
> ):
>  * #me - current user
>  * #my - assigned to me
>  * #currentProject
>  * #ticket, #wiki etc.
>  * date and time helper functions e.g. 2weeksago, 1yearago etc.
>
> Indexing and query syntax must be easily extensible by plugins. Here is a
> non-exhaustive list of other possible meta tags that could be provided by
> additional plugins:
>  * #resolved/unresolved - status:(resolved OR closed)
>  * version aggregation e.g. earliestUnreleasedVersion
>  * #hasAttachment
>  * code:xxx ... - contains code in wiki format
>  * #duplicated
>  * #closed = status:closed
>  * #yesterday
>  * ...
>
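One way to picture plugin-provided meta tags is as a registry that rewrites each tag into a plain query fragment before parsing. The sketch below assumes such a registry; the names META_TAGS, register_meta_tag and expand_meta_tags are hypothetical, and only the #closed and #resolved expansions are taken from the lists above.

```python
import re

# Hypothetical registry mapping meta tags to plain query fragments.
META_TAGS = {
    "#closed": "status:closed",
    "#resolved": "status:(resolved OR closed)",
}

# A tag is "#" followed by a letter, so ticket references like #123 are skipped.
_TAG_RE = re.compile(r"#[A-Za-z]\w*")


def register_meta_tag(tag, expansion):
    """Let a plugin contribute an additional meta tag."""
    META_TAGS[tag] = expansion


def expand_meta_tags(query):
    """Replace known meta tags in the query string; leave unknown ones as-is."""
    return _TAG_RE.sub(lambda m: META_TAGS.get(m.group(0), m.group(0)), query)


# Assumed expansion for illustration only: "#wiki" as a resource type filter.
register_meta_tag("#wiki", "type:wiki")
expanded = expand_meta_tags("bla #closed")
```

This simple substitution does not look at quoting, so a tag inside a quoted phrase would also be expanded; a real parser would handle tags as tokens.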
> == Use cases #usecases
> === User uses free text search or query in quick search box #usecase_freesearch
> User inputs text or a query string in the search box. The input can be passed
> directly to the query parameter, for example:
>  * bla
>  * open issue
>  * bla “open issue”
>  * bla status:open
>  * status:open
>
> === Possibility to specify what fields to return #usecase_fields
> Search page or widget must specify the fields parameter of the
> !ResourceQuery.query method.
> {{{
> #!python
> resourceQuery.query(fields=("id", "title", "status"), ...)
> }}}
>
> === Boolean operators and grouping #usecase_grouping
> Resource Query must support AND, OR, NOT and grouping (default operator is
> AND). Query string may look like:
>  * alpha AND NOT (beta OR gamma)
>  * “render AND shading” - expression is equal to “render shading”
>  * title:x OR (title:y AND message:z)
>
> === User can search using range expression #usecase_range
> The query string parameter should support inclusive and exclusive range
> expressions, for example:
>  * date:[20050101 TO 20090715]
>  * title:{Aida TO Carmen}
>  * [0025 TO]
>  * {TO suffix}
>
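A minimal sketch of how the range examples above could be recognized, assuming "[" and "]" mark inclusive bounds, "{" and "}" mark exclusive bounds, and a missing bound means an open end. The Range tuple and parse_range helper are hypothetical names, and bound values are kept as plain strings.

```python
import re
from collections import namedtuple

# Hypothetical result type: bounds stay strings, None means an open end.
Range = namedtuple("Range", "field start end inclusive")

_RANGE_RE = re.compile(
    r"^(?:(?P<field>\w+):)?"        # optional field prefix, e.g. "date:"
    r"(?P<open>[\[{])\s*"           # "[" is inclusive, "{" is exclusive
    r"(?P<start>[^\s\]}]+)?\s*"     # optional lower bound
    r"\bTO\b\s*"                    # the TO keyword as a separate word
    r"(?P<end>[^\s\]}]+)?\s*"       # optional upper bound
    r"(?P<close>[\]}])$"
)


def parse_range(expr):
    """Parse one range expression; raise ValueError if it is malformed."""
    m = _RANGE_RE.match(expr.strip())
    # Reject mixed brackets such as "[a TO b}" in this sketch.
    if m is None or (m.group("open") == "[") != (m.group("close") == "]"):
        raise ValueError("not a range expression: %r" % (expr,))
    return Range(m.group("field"), m.group("start"), m.group("end"),
                 m.group("open") == "[")


date_range = parse_range("date:[20050101 TO 20090715]")
```

Interpreting the bounds (dates, numbers, helper functions such as 2weeksago) is left to a later step, since that depends on the field type.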
> === Facets support #usecase_facets
> Query must support facets, e.g. Resources (Tickets (10), Wiki (20)),
> Status (Open (22), Closed (33)) etc. The facets parameter should be used for
> this purpose.
>
> {{{
> #!python
> resourceQuery.query(facets=("type", "status"), ...)
> }}}
>
> === Flexible sorting #usecase_sorting
> The default sort order of text search should be based on score and change
> date. The search page can set the following parameters of the
> !ResourceQuery.query method:
> {{{
> #!python
> resourceQuery.query(sort={"score": ASC, "change_date": DESC},
>   boost={"id": 1000, "subject": 100, "description": 10}, ...)
> }}}
>
> === Paging support #usecase_paging
> The search page will present query results in pages. For this purpose, it
> should use the following parameters of the !ResourceQuery.query method:
> {{{
> #!python
> resourceQuery.query(start=100, rows=50, ...)
> }}}
> or
> {{{
> #!python
> resourceQuery.query(pagesize=50, page=3, ...)
> }}}
>
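The two paging styles above describe the same window if pages are numbered from 1: pagesize=50 and page=3 start at offset 100. A minimal sketch of that conversion, assuming 1-based page numbers; the helper name to_start_rows is hypothetical.

```python
def to_start_rows(page, pagesize):
    """Convert 1-based page/pagesize paging into a start offset and row count."""
    if page < 1 or pagesize < 1:
        raise ValueError("page and pagesize must be positive")
    return (page - 1) * pagesize, pagesize


start, rows = to_start_rows(page=3, pagesize=50)
```

With such a helper the component could accept either style and hand a single start/rows pair to the underlying search backend.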
> === Related ticket use case #usecase_ralated
> User queries tickets related to tickets that were reopened in the last 14
> days. The query can be expressed with the following call:
> {{{
> #!python
> resourceQuery.query(
>   query="changed.status_from:open changed_date:[2weeksago TO]",
>   facets=("parent_ticket",), ...
> )
> }}}
>
> === Search in comments #usecase_comments
> {{{
> #!python
> resourceQuery.query(query="comment:bla", ...)
> }}}
>
> === Search in attachments #usecase_attachments
> {{{
> #!python
> resourceQuery.query(query="attachment:bla", ...)
> }}}
>
> === Show tickets that were commented on since yesterday #usecase_last_commented
> {{{
> #!python
> resourceQuery.query(query="last_commented:[yesterday TO]", ...)
> }}}
>
> === Show all resources in the current project that link to a ticket #usecase_project_linked
> {{{
> #!python
> resourceQuery.query(query="#currentProject AND linked:#123", ...)
> }}}
> -------8<------8<------8<------8<------8<------8<------8<------8<--------
>
> --
> Page URL: <
> https://issues.apache.org/bloodhound/wiki/Proposals/BEP-0004/ResourceQuery
> >
> Apache Bloodhound <https://issues.apache.org/bloodhound/>
> The Apache Bloodhound (incubating) issue tracker
>
> This is an automated message. Someone added your email address to be
> notified of changes on 'Proposals/BEP-0004/ResourceQuery' page.
> If it was not you, please report to .
>