Posted to solr-user@lucene.apache.org by jai2 <ja...@gmail.com> on 2013/09/02 10:55:23 UTC

Re: Solr 4.2 Regular expression, returning only matched substring

Hi Erick,

Thanks a lot for your reply. I am still looking for a feasible solution.
Currently I can only think of creating another core whose schema has
PatternTokenizer-based field types, loading it, and re-indexing the search
results into this temp core.

Is there any way to provide a list of patterns for tokenizing, as we do for
the stopword filter with a text file? Otherwise we will have to create a
field for each pattern individually.
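
For reference, the per-value counting asked about in this thread can be
sketched client-side, outside Solr. This is only an illustration, not a Solr
feature: the function names, the one-pattern-per-line file format and the
"#" comment convention are all assumptions made for the example.

```python
import re
from collections import Counter


def load_patterns(lines):
    """Compile one regex per line; blank lines and '#' comment lines are
    skipped (an assumed file convention, similar in spirit to a stopword file)."""
    return [
        re.compile(line.strip())
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    ]


def count_unique_matches(texts, patterns):
    """For each pattern, count how often each distinct matched substring occurs."""
    counts = {p.pattern: Counter() for p in patterns}
    for text in texts:
        for p in patterns:
            counts[p.pattern].update(m.group(0) for m in p.finditer(text))
    return counts


# Hypothetical data: 100 occurs ten times, 500 occurs five times.
patterns = load_patterns(["# three-digit numbers", "[0-9]{3}", ""])
texts = ["order 100"] * 10 + ["order 500"] * 5
result = count_unique_matches(texts, patterns)
```

Here result["[0-9]{3}"] holds separate counts for "100" and "500" rather than
a single total of 15. Running this over search results means pulling the
stored text to the client, so it only suits modest result sizes.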

thanks and regards
jai



On Wed, Aug 28, 2013 at 4:45 PM, Erick Erickson [via Lucene] <
ml-node+s472066n4086966h1@n3.nabble.com> wrote:

> Ah, OK. Nothing springs to mind. Even faceting on the individual values
> of the field counts _documents_ that match, but doesn't give you
> which particular values matched. I suppose that in that case you could
> run your regex over the returned labels for the facets.
>
> But that's a really ugly solution. Problem is that in a field with 1M
> unique values you'd get a list 1M long perhaps which wouldn't perform
> at all well.
>
> Depending, you could enumerate your terms (see TermsComponent)
> using terms.regex to get a list of all terms that matched your regex
> up-front, then do some relatively painful facet querying on a long list
> of the returned values, again not something I'd do in a high-query
> environment. Depends I guess on how busy your website is....
>
> Best
> Erick
>
>
> On Wed, Aug 28, 2013 at 4:18 AM, jai2 <[hidden email]> wrote:
>
> > hi Erick,
> >
> > Appreciate your reply. facet.query will give the count of matches, not
> > the count of unique pattern matches.
> >
> > If I give the regular expression [0-9]{3} to match a 3-digit number, it
> > will return the total occurrences of 3-digit numbers, but I want to know
> > the occurrences of each unique 3-digit number. Let's say the number 100
> > occurred 10 times and 500 occurred 5 times. facet.query will return a
> > count of 15 instead of giving the counts of 100 and 500 individually.
> >
> > Hope I made myself clear. Is there any way to do this?
> >
> > thanks and regards
> > jai
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Solr-4-2-Regular-expression-returning-only-matched-substring-tp4086868p4086944.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
>

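Erick's TermsComponent suggestion quoted above could look roughly like the
sketch below. It assumes a /terms request handler is configured on the core;
the base URL, core name and field name "body" are placeholders, and the helper
names are made up for illustration. The parser assumes the default flat JSON
layout, where each field maps to an alternating [term, count, term, count, ...]
list. The counts TermsComponent reports are document frequencies, so they may
not equal total occurrences.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, regex, limit=-1):
    """Build a TermsComponent request URL restricted to terms matching regex."""
    params = {
        "terms": "true",
        "terms.fl": field,       # field whose indexed terms are enumerated
        "terms.regex": regex,    # keep only terms matching this regex
        "terms.limit": limit,    # -1 asks for all matching terms
        "wt": "json",
    }
    return f"{base_url}/terms?{urlencode(params)}"


def parse_flat_terms(flat):
    """Turn [term, count, term, count, ...] into a {term: count} dict."""
    return dict(zip(flat[0::2], flat[1::2]))


# Placeholder host, core and field names.
url = build_terms_url("http://localhost:8983/solr/collection1", "body", "[0-9]{3}")
per_term = parse_flat_terms(["100", 10, "500", 5])
```

The URL can then be fetched with any HTTP client and the list under the
field's name in the "terms" section passed to parse_flat_terms. As Erick
notes, a field with very many unique terms can make this list long, so it is
better suited to occasional use than to a high-query environment.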