Posted to solr-user@lucene.apache.org by Saman Rasheed <sa...@hotmail.com> on 2017/05/30 16:29:34 UTC

update please

hi, can someone kindly update me on the question i raised on Mon, 22 May, 17:14


subject:


without termfeq - returning the number of terms/or regex of terms in a document<http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201705.mbox/ajax/%3CVI1P190MB0334F019686D8591A6679F448CF80%40VI1P190MB0334.EURP190.PROD.OUTLOOK.COM%3E>


thanks,

Re: update please

Posted by Rick Leir <rl...@leirtech.com>.
Sam,
First, try it with Solomo* and you should see a much better response time.

Try things and experiment in the Solr Admin UI Query tab or Analysis tab.

Use debug=true or debugQuery=true.

When the server is really slow, use top(1) to see if you are swapping. Say 
top -o RES
Or
top then shift-M

Get a screen grab and post it for us to see.
Check the solr log? Enough for now
Cheers -- Rick
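
A minimal sketch of the debug suggestion above, in Python. It only builds the request URL; the core URL is the one quoted elsewhere in this thread, while the helper name debug_query_url and the rows default are assumptions for illustration.

```python
from urllib.parse import urlencode

# Core URL as quoted in this thread; adjust for your own installation.
SOLR = "http://localhost:8983/solr/test"


def debug_query_url(q, rows=10):
    """Build a /select URL that also asks Solr for query debug output."""
    params = {"q": q, "rows": rows, "debugQuery": "true", "wt": "json"}
    return f"{SOLR}/select?{urlencode(params)}"


if __name__ == "__main__":
    # A trailing wildcard (prefix query) instead of a leading one.
    print(debug_query_url("content:Solomo*"))
```

Pasting the printed URL into a browser, or the same parameters into the Admin UI Query tab, should show timing and parsed-query details alongside the results.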

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: update please

Posted by Mikhail Khludnev <mk...@apache.org>.
Sam,
I believe you can search for q=*olomo* and then request highlighting with
hl=true&hl.fl=content. It probably needs a tweak to return all fragments,
one per occurrence. A slightly different idea is to request /terms
(TermsComponent) for the given regexp, get all matching terms, and then
request the tf for each of these terms: fl=termfreq(content,'Solomon'),
termfreq(content,'Yolomonk'), etc.
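
A rough sketch of the two-step /terms idea above, in Python. It only builds the two request URLs and does not contact Solr. Assumptions, none of them confirmed in the thread: a /terms handler is enabled on the core, the unique key field is called id, and the JSON terms response is a flat [term, count, ...] list. The helper names are hypothetical. Note that terms.regex takes a Java regular expression, so the wildcard pattern *olomo* would be written .*olomo.*, and the terms that come back are the indexed forms, which may be lowercased depending on the field type's analysis chain.

```python
from urllib.parse import urlencode

# Core URL as quoted in this thread; adjust for your own installation.
SOLR = "http://localhost:8983/solr/test"


def terms_url(field, regex, limit=100):
    """Step 1: URL asking TermsComponent for indexed terms matching a regex."""
    params = {
        "terms.fl": field,
        "terms.regex": regex,  # Java regex, e.g. ".*olomo.*" rather than "*olomo*"
        "terms.limit": limit,
        "wt": "json",
    }
    return f"{SOLR}/terms?{urlencode(params)}"


def parse_terms(flat):
    """Pull the term strings out of a flat [term, count, term, count, ...] list."""
    return flat[0::2]


def termfreq_url(field, terms, query="*:*", id_field="id"):
    """Step 2: URL requesting termfreq() per document for each matched term."""
    fl = [id_field] + [f"termfreq({field},'{t}')" for t in terms]
    params = {"q": query, "fl": ",".join(fl), "wt": "json"}
    return f"{SOLR}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(terms_url("content", ".*olomo.*"))
    # Hypothetical step-1 result, used only to show how step 2 is built.
    matched = parse_terms(["solomon", 2])
    print(termfreq_url("content", matched))
```

The sketch does not escape quotes, so a term that itself contains an apostrophe would need extra handling before it is placed inside termfreq().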





-- 
Sincerely yours
Mikhail Khludnev

Re: update please

Posted by Saman Rasheed <sa...@hotmail.com>.
Hi Rick,


Thanks for coming back to me on this, btw it's 'Saman' but please call me sam like everyone else 😊


here we go:

~~~~~~~~~~~~~~~~~~~~~~

i have an english book whose contents i have indexed successfully into a field called 'content',
with the following properties:


<field name="content" type="text_general" indexed="true" stored="true" multiValued="true"
termVectors="true" termPositions="true" termOffsets="true"/>


so if i need to return the number of matches for a specific term pattern, e.g. '*olomo*', then my document
contains 2 and should give me 'Solomon' with a term frequency = 2.


I've tried going through the term vector section in the reference and various other posts
on the internet but still i haven't managed to figure out how.


the nearest i found is the following syntax/way:


http://localhost:8983/solr/test/tvrh?q=content:[*%20TO%20*]&indent=true&tv.tf=true&tv.df=true


which brings my pc to a near halt for a couple of minutes, and then it returns the term
frequency of every term! but i only need the term frequency of a particular pattern/regex.


is there a way to narrow it down to just one regex term, e.g. *thing*, so it will find the term frequency of 'soothing',
'something' and 'everything' ... etc each with their number of occurrences per document?


thanks,



Re: update please

Posted by Rick Leir <rl...@leirtech.com>.
Salman,
That is a week ago, which is a long while. And my Android does not display the archives link in a readable way. Would you mind repeating the question here? Be a bit verbose, sometimes it is better that way.
Cheers -- Rick



-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com