Posted to solr-user@lucene.apache.org by Ben <be...@autonomic.net> on 2009/06/29 15:29:57 UTC
Excluding Characters and SubStrings in a Faceted Wildcard Query
Hello,
I've been using SOLR for a while now, but am stuck for information on
two issues :
1) Is it possible to exclude characters in a SOLR facet wildcard query?
e.g.
[^,]* to match any character except a "," ?
2) Can one set up the facet wildcard query to return the exact
substrings it matched of the queried facet, rather than the whole string?
I hope somebody can help :)
Thanks,
Ben
Re: Excluding Characters and SubStrings in a Faceted Wildcard Query
Posted by Norberto Meijome <nu...@gmail.com>.
On Mon, 29 Jun 2009 15:10:59 +0100
Ben <be...@autonomic.net> wrote:
> Hi Erik,
>
> I'm not sure exactly how much context you need here, so I'll try to keep
> it short and expand as needed.
>
> The column I am faceting contains a comma delimited set of vectors.
> Each vector is made up of {Make,Year,Model} e.g.
> _ford_1996_focus,mercedes_1996_clk,ford_2000_focus
>
> I have a custom request handler, where if I want to find all the cars
> from 1996 I pass in a facet query for the Year (1996) which is
> transformed to a wildcard facet query :
>
> _*_1996_*
>
> In other words, it'll match any records whose vector column contains a
> string, which somewhere has a car from 1996.
>
> Why not put the Make, Year and Model in separate columns and do a facet
> query of multiple columns?... because once we've selected 1996, we
> should (in the above example) then be offering "ford and mercedes" as
> further facet choices, and nothing more. If the parts were in their own
> columns, there would be no way to tie the Makes and Models to specific
> years, for example.
>
[...]
Hi,
It must be late and I probably need more $coffee... but isn't what you just
described (search for 1996, show 'ford', 'mercedes') how facets DO work?
Once you have the facet on the make field, and Solr has told you that both 'ford'
and 'mercedes' are available in that field, it is up to you to search for
'make=ford and date=1996' if you ONLY want fords, generation 1996...
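As a rough illustration of the request described above, the sketch below builds the query string for a faceted search using the standard Solr parameters facet, facet.field and fq. The field names "make" and "year" and the helper facet_params are assumptions made for this example, and it assumes each document describes a single car, which is not how the data in this thread is currently stored.

```python
from urllib.parse import urlencode

def facet_params(year, make=None):
    """Build a Solr query string that filters by year and facets on make.

    "year" and "make" are hypothetical field names for this sketch.
    """
    params = [
        ("q", "*:*"),              # match everything, then narrow with filters
        ("fq", f"year:{year}"),    # restrict to the selected year
        ("rows", "0"),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", "make"),   # offer the makes present in the filtered set
    ]
    if make is not None:
        # Second step: the user picked one of the offered makes.
        params.append(("fq", f"make:{make}"))
    return urlencode(params)

print(facet_params(1996))
print(facet_params(1996, "ford"))
```

The first call corresponds to "search for 1996, show the makes"; the second adds the chosen make as a further filter.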
cheers,
B
_________________________
{Beto|Norberto|Numard} Meijome
"He has the attention span of a lightning bolt."
Robert Redford
I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.
Re: Excluding Characters and SubStrings in a Faceted Wildcard Query
Posted by Ben <be...@autonomic.net>.
Hi Erik,
I'm not sure exactly how much context you need here, so I'll try to keep
it short and expand as needed.
The column I am faceting contains a comma delimited set of vectors.
Each vector is made up of {Make,Year,Model} e.g.
_ford_1996_focus,mercedes_1996_clk,ford_2000_focus
I have a custom request handler, where if I want to find all the cars
from 1996 I pass in a facet query for the Year (1996) which is
transformed to a wildcard facet query :
_*_1996_*
In other words, it'll match any records whose vector column contains a
string, which somewhere has a car from 1996.
Why not put the Make, Year and Model in separate columns and do a facet
query of multiple columns?... because once we've selected 1996, we
should (in the above example) then be offering "ford and mercedes" as
further facet choices, and nothing more. If the parts were in their own
columns, there would be no way to tie the Makes and Models to specific
years, for example.
At any rate, the wildcard search returns the entire match
(_ford_1996_focus,mercedes_1996_clk,ford_2000_focus). I then have to run
another RegExp over it to extract only the two parts (the first ford and
mercedes) that were from 1996. This isn't using SOLR's cache very
effectively.
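For reference, the client-side post-processing step described above might look something like the following sketch. It uses a character-excluding pattern ([^,]*) in an ordinary regular expression, since Lucene's wildcard syntax itself only offers * and ?. The function names are made up for this example, and the make_year_model layout is taken from the sample value in this message.

```python
import re

def vectors_for_year(field_value, year):
    """Return only the comma-delimited vectors that contain _<year>_."""
    # [^,]* keeps each match inside a single vector (it cannot cross a comma).
    pattern = re.compile(rf"[^,]*_{re.escape(str(year))}_[^,]*")
    return pattern.findall(field_value)

def makes_for_year(field_value, year):
    """Return the distinct makes for the given year, in order of appearance."""
    makes = []
    for vector in vectors_for_year(field_value, year):
        # Drop any leading/trailing underscore, then take the first component.
        make = vector.strip("_").split("_")[0]
        if make not in makes:
            makes.append(make)
    return makes

value = "_ford_1996_focus,mercedes_1996_clk,ford_2000_focus"
print(vectors_for_year(value, 1996))  # ['_ford_1996_focus', 'mercedes_1996_clk']
print(makes_for_year(value, 1996))    # ['ford', 'mercedes']
```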
It would be excellent if SOLR could break up that comma-separated list
into three different parts, and run the RegExp over each, returning
only those which match. Is that what you're implying with Analysis? If
that were the case, I'd not need to worry about character exclusion.
Sorry if that's a bit fuzzy... it's hard trying to explain enough to be
useful, but not so much that it turns into an essay!!!
Thanks,
Ben
The solution I'm using is to form a vector
Erik Hatcher wrote:
> Ben,
>
> Could you post an example of the type of data you're dealing with and
> how you want it handled? I suspect there is a way to accomplish what
> you want using an analyzed field, or by preprocessing the data you're
> indexing.
>
> Erik
Re: Excluding Characters and SubStrings in a Faceted Wildcard Query
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Ben,
Could you post an example of the type of data you're dealing with and
how you want it handled? I suspect there is a way to accomplish what
you want using an analyzed field, or by preprocessing the data you're
indexing.
Erik
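One possible reading of "preprocessing the data you're indexing" is sketched below: split the comma-delimited value into separate values of a multi-valued field before sending the document to Solr, so that each vector becomes its own facet term. The field name "vehicle_facet", the document shape, and the year-first reordering are all assumptions for illustration; the reordering is there so that Solr's facet.prefix parameter (for example facet.prefix=1996_) could limit the returned facet terms to a single year.

```python
def split_vectors(raw):
    """Split a comma-delimited field value into individual vectors."""
    return [part.strip() for part in raw.split(",") if part.strip()]

def year_first(vector):
    """Reorder make_year_model to year_make_model so a year prefix selects it."""
    # maxsplit=2 keeps any underscores inside the model name intact.
    make, year, model = vector.strip("_").split("_", 2)
    return f"{year}_{make}_{model}"

def to_document(doc_id, raw):
    """Build a document dict with one facet value per vector.

    "vehicle_facet" is a hypothetical multi-valued field name.
    """
    return {
        "id": doc_id,
        "vehicle_facet": [year_first(v) for v in split_vectors(raw)],
    }

doc = to_document("1", "_ford_1996_focus,mercedes_1996_clk,ford_2000_focus")
print(doc["vehicle_facet"])
# ['1996_ford_focus', '1996_mercedes_clk', '2000_ford_focus']
```

With the values stored this way, the facet terms themselves are the matched substrings, so no second pass over the whole string is needed.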