Posted to solr-dev@lucene.apache.org by "Anil Khadka (JIRA)" <ji...@apache.org> on 2009/08/26 20:05:59 UTC

[jira] Created: (SOLR-1387) Add more search options for filtering facets.

Add more search options for filtering facets.
---------------------------------------------

                 Key: SOLR-1387
                 URL: https://issues.apache.org/jira/browse/SOLR-1387
             Project: Solr
          Issue Type: New Feature
          Components: search
    Affects Versions: 1.4
            Reporter: Anil Khadka


Currently, the only way to filter facets is by prefix (which uses String.startsWith() in Java).
We could add some parameters like
* facet.iPrefix : this would act as a case-insensitive prefix search. (or --->  facet.prefix=a&facet.caseinsense=on)
* facet.regex : this is a pure regular expression search (which obviously would be expensive if issued).

Moreover, allowing multiple filters on the same field would be great, like
facet.prefix=a OR facet.prefix=A ... something like this.

All of the above concepts could be equally applicable to TermsComponent.
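
To make the proposal concrete, here is a minimal sketch in plain Java of which terms a case-insensitive prefix filter and a regex filter would keep. The class FacetTermFilter and its methods are hypothetical names used for illustration only; they are not part of Solr, and facet.iPrefix and facet.regex are only proposed parameters at this point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: shows which terms the proposed
// facet.iPrefix and facet.regex parameters would be expected to keep.
final class FacetTermFilter {

    // Case-insensitive prefix test. regionMatches avoids allocating
    // upper-cased copies of the term and the prefix.
    static boolean hasPrefixIgnoreCase(String term, String prefix) {
        return term.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    // Keeps every non-null term that starts with any of the prefixes
    // (ignoring case) or that fully matches any of the regular expressions.
    static List<String> filter(List<String> terms, List<String> iPrefixes, List<String> regexes) {
        // Compile each pattern once instead of once per term.
        List<Pattern> patterns = new ArrayList<>();
        for (String re : regexes) {
            patterns.add(Pattern.compile(re));
        }
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            if (term != null && matchesAny(term, iPrefixes, patterns)) {
                kept.add(term);
            }
        }
        return kept;
    }

    private static boolean matchesAny(String term, List<String> iPrefixes, List<Pattern> patterns) {
        for (String prefix : iPrefixes) {
            if (hasPrefixIgnoreCase(term, prefix)) return true;
        }
        for (Pattern pattern : patterns) {
            if (pattern.matcher(term).matches()) return true;
        }
        return false;
    }
}
```

The sketch only covers term selection; counting documents per term is a separate step.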

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1387) Add more search options for filtering field facets.

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-1387:
---------------------------

    Summary: Add more search options for filtering field facets.  (was: Add more search options for filtering facets.)



[jira] Commented: (SOLR-1387) Add more search options for filtering facets.

Posted by "Anil Khadka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751032#action_12751032 ] 

Anil Khadka commented on SOLR-1387:
-----------------------------------

I've come up with the following code. Any suggestions?
[This is just a code snippet]

{code:title=Extension of SimpleFacet.java|borderStyle=solid}
/*** SEARCHING ***/
// A HashSet is chosen to avoid duplicate entries
HashSet<String> termsDump = new HashSet<String>();
for (String term : terms) { // <------ terms[] from FieldCache.DEFAULT ... StringIndex.lookup
  if (term == null) continue;
  for (String p : iprefixList) { // <--- list of prefixes to be searched case-insensitively
    // doing iprefix
    if (term.toUpperCase().startsWith(p.toUpperCase())) { // <---- Is this the best way to do it??
      termsDump.add(term);
    }
  }
  for (String re : regexList) { // <--- list of regular expressions
    if (term.matches(re)) {
      // equivalent to Pattern.compile(re).matcher(term).matches()
      termsDump.add(term);
    }
  }
}
// Just add the list of input terms without searching :)
termsDump.addAll(inputTermsList);

/*** COUNTING ***/ // <-- this counting method is different from regular prefix (finding spectrum in an array)
FieldType ft = searcher.getSchema().getFieldType(field);
NamedList<Integer> res = new NamedList<Integer>();
Term t = new Term(field);
for (String term : termsDump) { // <---- the terms collected above
  String internal = ft.toInternal(term);
  int count = searcher.numDocs(new TermQuery(t.createTerm(internal)), base); // <--- Do we lose performance on this??
  res.add(term, count);
}

/*** SORTING ***/ // <-- regular CountPair<String,Integer> thing.
for (int i = 0, n = res.size(); i < n; i++) {
  queue.add(new CountPair<String,Integer>(res.getName(i), res.getVal(i)));
}
{code}

The syntax would look like this (localParams style):
{code}
  &facet.field={!XFilter=on prefix=A,B,C iPrefix=a,b,c,d termsList=e,f,g,h regex=^a[a-z0-9]+g$,z*}field_name
{code}
XFilter: I call this the eXtended Filter for facets!
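
As a rough illustration of how such a value could be broken into its parts, here is a minimal sketch in plain Java. XFilterParams is a hypothetical class written for this example; Solr has its own local-params parsing, which this does not use or replicate, and the sketch ignores quoting and escaping. Note that splitting on commas is an assumption taken from the syntax above, and it would be ambiguous for a regex that itself contains a comma (for example a quantifier like {1,3}).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for illustration only: splits a value such as
// "{!prefix=A,B iPrefix=a,b}field_name" into the field name and the
// comma-separated value list of each key. No quoting or escaping support.
final class XFilterParams {
    final String field;
    final Map<String, List<String>> values;

    private XFilterParams(String field, Map<String, List<String>> values) {
        this.field = field;
        this.values = values;
    }

    static XFilterParams parse(String facetField) {
        Map<String, List<String>> values = new LinkedHashMap<>();
        if (!facetField.startsWith("{!")) {
            // No local params: the whole string is the field name.
            return new XFilterParams(facetField, values);
        }
        int close = facetField.lastIndexOf('}');
        if (close < 0) {
            throw new IllegalArgumentException("missing '}' in: " + facetField);
        }
        // Tokens are separated by whitespace; each token is key=v1,v2,...
        for (String token : facetField.substring(2, close).trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) continue; // skip empty or malformed tokens
            String key = token.substring(0, eq);
            List<String> list = values.computeIfAbsent(key, k -> new ArrayList<>());
            // A repeated key appends to the values already collected for it.
            list.addAll(Arrays.asList(token.substring(eq + 1).split(",")));
        }
        return new XFilterParams(facetField.substring(close + 1), values);
    }
}
```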



[jira] Updated: (SOLR-1387) Add more search options for filtering field facets.

Posted by "Anil Khadka (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anil Khadka updated SOLR-1387:
------------------------------

        Fix Version/s: 1.5
    Affects Version/s:     (was: 1.4)



[jira] Commented: (SOLR-1387) Add more search options for filtering facets.

Posted by "Avlesh Singh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748061#action_12748061 ] 

Avlesh Singh commented on SOLR-1387:
------------------------------------

{quote}
facet.iPrefix : this would act like case-insensitive search. (or ---> facet.prefix=a&facet.caseinsense=on)
{quote}
I don't see why the case filter needs to be there. You can always apply a lowercase filter to your field while indexing and searching.

{quote}
facet.regex : this is pure regular expression search (which obviously would be expensive if issued).
{quote}
You mean wildcards. Right?

{quote}
Moreover, allowing multiple filtering for same field would be great like facet.prefix=a OR facet.prefix=A ... sth like this.
{quote}
This has been recently discussed on the dev mailing list here - http://www.lucidimagination.com/search/document/f954dbb323746ed1/multiple_facet_prefix 
The syntax that was agreed upon was local params in this manner - facet.field={!prefix=foo prefix=bar}myfield



[jira] Commented: (SOLR-1387) Add more search options for filtering facets.

Posted by "Anil Khadka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748087#action_12748087 ] 

Anil Khadka commented on SOLR-1387:
-----------------------------------

> I don't see a reason as to why the case filter be there. you can always apply a lower case filter to your field while indexing and searching.
Suppose I indexed a field called "placename" with names like California, Nevada, San Jose...
If I use LowerCaseFilterFactory, the terms will be indexed in lower case, and when they are retrieved as facets (or via TermsComponent) they are also in lower case. --> (california, nevada, san jose)
And this will mess things up (at least for me). I know there are others who want this too.

> You mean wildcards. Right?
Yes, it would be the first step towards it... [again, I don't mean A* or abc*; I would rather want *a or a*bc]
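
For illustration, wildcard patterns like these can be mapped onto regular expressions. The following is a minimal sketch, assuming '*' is the only wildcard character and everything else is matched literally; WildcardPattern is a hypothetical helper name, not a Solr or Lucene class.

```java
import java.util.regex.Pattern;

// Hypothetical helper: turns a wildcard such as "*a" or "a*bc" into a
// java.util.regex.Pattern. Only '*' is special; all other characters,
// including regex metacharacters like '.', are matched literally.
final class WildcardPattern {

    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        int start = 0; // start of the current literal run
        for (int i = 0; i < wildcard.length(); i++) {
            if (wildcard.charAt(i) == '*') {
                if (i > start) {
                    regex.append(Pattern.quote(wildcard.substring(start, i)));
                }
                regex.append(".*"); // '*' matches any run of characters, including none
                start = i + 1;
            }
        }
        if (start < wildcard.length()) {
            regex.append(Pattern.quote(wildcard.substring(start)));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }
}
```

A term is then tested with WildcardPattern.compile("a*bc").matcher(term).matches(), which requires the whole term to match.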

> This has been recently discussed on the dev mailing list here - http://www.lucidimagination.com/search/document/f954dbb323746ed1/multiple_facet_prefix
> The syntax that was agreed upon was local params in this manner - facet.field={!prefix=foo prefix=bar}myfield
Yes, this is what I'm talking about. Having an option to get both the individual lists and the merged list for each query (here 'foo' and 'bar') would be even better.
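
To show what "individual lists and a merged list" could mean, here is a minimal sketch in plain Java. MultiPrefixFacet and its methods are hypothetical names for this example only, and the sketch works on a plain list of terms rather than on a Solr index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: for several prefixes, builds one list of matching
// terms per prefix, plus a merged list without duplicates.
final class MultiPrefixFacet {

    // One entry per prefix, in the order the prefixes were given.
    static Map<String, List<String>> byPrefix(List<String> terms, List<String> prefixes) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String prefix : prefixes) {
            List<String> matches = new ArrayList<>();
            for (String term : terms) {
                if (term.startsWith(prefix)) {
                    matches.add(term);
                }
            }
            result.put(prefix, matches);
        }
        return result;
    }

    // Union of all per-prefix lists; a term matched by several prefixes
    // appears only once, at the position of its first match.
    static List<String> merged(Map<String, List<String>> byPrefix) {
        Set<String> all = new LinkedHashSet<>();
        for (List<String> matches : byPrefix.values()) {
            all.addAll(matches);
        }
        return new ArrayList<>(all);
    }
}
```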

