Posted to solr-user@lucene.apache.org by Roopesh P Raj <ro...@digitalglue.in> on 2007/09/24 11:04:29 UTC

How to get all the search results - python

Hi,

I am using Solr set up in Tomcat 5.5 with Python 2.4, using the Python client solr.py.

When I search, not all the results are returned.

The method call for searching is as follows (rows specifies the number of rows):
data = c.search(q='query', fl='id score unique_id Message-ID To From Subject',rows=50, wt='python')

I want to specify that I want all the rows. How can I do that?

Regards
Roopesh




------------------
DigitalGlue, India




Re: How to get all the search results - python

Posted by Jérôme Etévé <je...@eteve.net>.
By design, it's not very efficient to ask for a large number of
results with solr/lucene. I think you will face performance and memory
problems if you do that.


On 9/24/07, Thorsten Scherler <th...@juntadeandalucia.es> wrote:
> On Mon, 2007-09-24 at 16:29 +0530, Roopesh P Raj wrote:
> > > Hi Roopesh,
> >
> > > I am not sure whether I understand your problem.
> >
> > > Is it the limitation of rows/pagination?
> > > If so, why not use a really high number (like rows=10000000000)?
> >
> > > salu2
> >
> > Hi,
> >
> > Assigning a high number will solve my problem. (I thought there would be something like rows='all' to do it.)
> >
> > Can I do pagination using the python client?
>
> I am not a python expert but I think so.
>
> > How can I specify the starting position, offset etc for
> > pagination through the python client?
>
> http://wiki.apache.org/solr/CommonQueryParameters
>
> It should work as described in the above document (with the start
> parameter).
>
> e.g.
> data = c.search(q='query', fl='id score unique_id Message-ID To From Subject', rows=50, wt='python', start=50)
>
> HTH
> --
> Thorsten Scherler                                 thorsten.at.apache.org
> Open Source Java                      consulting, training and solutions
>
>


-- 
Jerome Eteve.
jerome@eteve.net
http://jerome.eteve.free.fr/

Re: How to get all the search results - python

Posted by Roopesh P Raj <ro...@digitalglue.in>.
Thanks a lot for your replies. I will follow the paginated search.

Thanks and Regards
Roopesh

------------------
DigitalGlue, India




Re: I can't delete, why?

Posted by Yonik Seeley <yo...@apache.org>.
On 9/25/07, Ben Shlomo, Yatir <yb...@shopping.com> wrote:
> I know I can delete multiple docs with the following:
> <delete><query>mediaId:(6720 OR 6721 OR .... )</query></delete>
>
> My question is can I do something like this?
> <delete><query>languageId:123 AND manufacturer:456 </query></delete>
> (It does not work for me and I didn't forget to commit....)

Do you get an error, or do you just not see this document deleted?
Does a query identical to this show matching documents after a commit?

Also keep in mind that delete by id is currently more efficient than
delete by query, so if mediaId is your uniqueKeyField, you would be
better served by using that.

-Yonik
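
For reference, the two kinds of delete message discussed above can be sketched as follows. This is only an illustration of the XML update messages, not code from the thread: the helper names delete_by_id and delete_by_query are hypothetical, and it assumes mediaId really is the uniqueKey field, which the thread does not confirm.

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id):
    # Build a delete message for one document, addressed by its uniqueKey value.
    # (Hypothetical helper; assumes the id is the schema's uniqueKey.)
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


def delete_by_query(query):
    # Build a delete message for every document matching a query.
    # escape() protects characters such as & and < inside the XML body.
    return "<delete><query>%s</query></delete>" % escape(query)


# One message per id, then the boolean query from the original question.
messages = [delete_by_id(i) for i in (6720, 6721)]
messages.append(delete_by_query("languageId:123 AND manufacturer:456"))
for message in messages:
    print(message)
```

Each message would be posted to the update handler and followed by a commit before the deletions become visible to searches.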

I can't delete, why?

Posted by "Ben Shlomo, Yatir" <yb...@shopping.com>.
Hi!
I know I can delete multiple docs with the following:
<delete><query>mediaId:(6720 OR 6721 OR .... )</query></delete>

My question is can I do something like this?
<delete><query>languageId:123 AND manufacturer:456 </query></delete>
(It does not work for me and I didn't forget to commit....)


How can I do it? With a copy field?
<delete><query>languageIdmanufacturer:123456</query></delete>
Thanks
yatir

Re: How to get all the search results - python

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Tue, 2007-09-25 at 10:03 +0530, Roopesh P Raj wrote:

DISCLAIMER:
Please, I am subscribed to the user list and there is no need to write
me directly or cc me in your response. Moreover, since we are an open
source project, off-list communication is suboptimal and harmful to the
community. The community has many eyes which can see possible problems
with some solution and propose better ones. Further, the mailing list has
an archive where proven solutions can be searched. If we all share
off-list mailings, no solutions go into the archive and we always have to
repeat the same mails.

PLEASE write to the ml!

> > http://wiki.apache.org/solr/CommonQueryParameters
> 
> > It should work as described in the above document (with the start
> > parameter).
> 
> > e.g.
> > data = c.search(q='query', fl='id score unique_id Message-ID To From Subject', rows=50, wt='python', start=50)
> 
> > HTH
> > --
> 
> Hi,
> 
> In my application there is a provision to copy the archive based on the date indexed.
> In this case the number of search results may exceed the high number I have
> assigned to rows, say rows=10000000. I want to avoid this situation, and in
> this case I don't want paginated queries.
> 
> Can you please tell me how to approach this particular situation?

I think the best way is to
1) get the first response document (rows=50, start=0)
2) parse the response to see how many results you have
3) loop over the remaining pages (rows=50, start=50*x) and call Solr
until you have all results.
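
The three steps above can be sketched roughly as below. This is a minimal sketch under assumptions, not the solr.py API: search_page is a hypothetical callable that takes (start, rows) and returns a dict shaped like Solr's response, with the total under response/numFound and the current page under response/docs. It is written for a current Python rather than the Python 2.4 mentioned in the thread.

```python
def fetch_all(search_page, rows=50):
    """Collect every matching document by paging through the result set.

    search_page(start, rows) is a hypothetical callable returning a dict like
    {'response': {'numFound': N, 'docs': [...]}}.
    """
    # Step 1: fetch the first page.
    first = search_page(0, rows)["response"]
    # Step 2: read how many documents match in total.
    total = first["numFound"]
    docs = list(first["docs"])
    # Step 3: request the remaining pages until everything is collected.
    start = rows
    while start < total:
        page = search_page(start, rows)["response"]
        if not page["docs"]:
            break  # the index shrank while paging; stop instead of looping
        docs.extend(page["docs"])
        start += rows
    return docs
```

With the client from this thread, search_page might wrap a call such as c.search(q='query', rows=rows, start=start, wt='python'); how the returned value is turned into a dict depends on the client version, so that part is left to the caller.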

Like Jérôme stated:
On Mon, 2007-09-24 at 12:45 +0100, Jérôme Etévé wrote:
> By design, it's not very efficient to ask for a large number of
> results with solr/lucene. I think you will face performance and memory
> problems if you do that. 

HTH

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


Re: How to get all the search results - python

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Mon, 2007-09-24 at 16:29 +0530, Roopesh P Raj wrote:
> > Hi Roopesh,
> 
> > I am not sure whether I understand your problem. 
> 
> > Is it the limitation of rows/pagination? 
> > If so, why not use a really high number (like rows=10000000000)?
> 
> > salu2
> 
> Hi,
> 
> Assigning a high number will solve my problem. (I thought there would be something like rows='all' to do it.)
> 
> Can I do pagination using the python client? 

I am not a python expert but I think so.

> How can I specify the starting position, offset etc for 
> pagination through the python client? 

http://wiki.apache.org/solr/CommonQueryParameters

It should work as described in the above document (with the start
parameter).

e.g.
data = c.search(q='query', fl='id score unique_id Message-ID To From Subject', rows=50, wt='python', start=50)
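
To make the relation between page number and the start parameter concrete, here is a small sketch that builds the query string directly. The helper name select_url and the base URL are assumptions made for illustration; only the q, fl, start, rows and wt parameters come from the thread and the wiki page above.

```python
from urllib.parse import urlencode


def select_url(base_url, query, page, rows=50, fields=None):
    # start is a zero-based offset, so page N begins at N * rows.
    params = {"q": query, "start": page * rows, "rows": rows, "wt": "python"}
    if fields:
        # fl is the list of fields to return.
        params["fl"] = " ".join(fields)
    return base_url + "?" + urlencode(params)


# Hypothetical base URL; adjust to wherever the select handler is deployed.
print(select_url("http://localhost:8983/solr/select", "query", page=1))
```

Page 0 gives start=0, page 1 gives start=50, and so on, which matches the start=50 in the example call above.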

HTH
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


Re: How to get all the search results - python

Posted by Roopesh P Raj <ro...@digitalglue.in>.
> Hi Roopesh,

> I am not sure whether I understand your problem. 

> Is it the limitation of rows/pagination? 
> If so, why not use a really high number (like rows=10000000000)?

> salu2

Hi,

Assigning a high number will solve my problem. (I thought there would be something like rows='all' to do it.)

Can I do pagination using the Python client? How can I specify the starting position, offset, etc. for
pagination through the Python client?

Regards
Roopesh


------------------
DigitalGlue, India




Re: How to get all the search results - python

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Mon, 2007-09-24 at 14:34 +0530, Roopesh P Raj wrote:
> Hi,
> 
> I am using Solr set up in Tomcat 5.5 with Python 2.4, using the Python client solr.py.
> 
> When I search, not all the results are returned.
> 
> The method call for searching is as follows (rows specifies the number of rows):
> data = c.search(q='query', fl='id score unique_id Message-ID To From Subject',rows=50, wt='python')
> 
> I want to specify that I want all the rows. How can I do that?

Hi Roopesh,

I am not sure whether I understand your problem. 

Is it the limitation of rows/pagination? 
If so, why not use a really high number (like rows=10000000000)?

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions