Posted to solr-user@lucene.apache.org by Kudrettin Güleryüz <ku...@gmail.com> on 2019/06/28 14:11:53 UTC

different numFound value /select vs. /export

Hi,

I'd like to give my website users the ability to export a field for the full
search result set. Specifying a very large pageSize seems to perform very
poorly for this purpose, so I am considering using the export requestHandler
for exporting search results.

When I played with a core, I noticed that the numFound value was different
between these two queries for the same core:
export?fl=id&q=*:*&sort=id%20desc
select?fl=id&q=*:*&sort=id%20desc

Can you please explain why this may be the case? Also, any suggestions on
alternatives would be nice.

Thank you

Re: different numFound value /select vs. /export

Posted by Kudrettin Güleryüz <ku...@gmail.com>.
Thank you, the issue was indeed a formatting error.

Re: different numFound value /select vs. /export

Posted by Colvin Cowie <co...@gmail.com>.
/stream?explain=true&expr=search(myCore,zkHost=”192.168.1.10:2181",qt=”/export”,q=”*:*”, fl=”id”,sort=”id asc”)
returns
'search(myCore,zkHost=”192.168.1.10:2181\",qt=”/export”,q=”*:*”, fl=”id”,sort=”id asc”)' is not a proper expression clause

If the above is exactly as you entered it and the response that came back,
then you've got invalid quote characters in there: ” vs " (e.g. zkHost=”
rather than zkHost="). That could happen if you've used a rich editor like
Word that autoformats text, or copied the example from this blog
https://medium.com/@sarkaramrit2/getting-started-with-streaming-expressions-in-apache-solr-b49111a417e3
which has them wrongly formatted.
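
A minimal sketch of the fix described above, not part of the original reply: it swaps the typographic double quotes for plain ASCII ones and URL-encodes the expression before it is appended to /stream. The helper names (normalize_quotes, stream_query_string) are hypothetical, and the expression is the one from the thread.

```python
from urllib.parse import urlencode

# Typographic quotes that rich editors substitute for the ASCII double quote.
CURLY_DOUBLE_QUOTES = {"\u201c": '"', "\u201d": '"'}


def normalize_quotes(expr: str) -> str:
    """Replace curly double quotes with plain ASCII double quotes."""
    return expr.translate(str.maketrans(CURLY_DOUBLE_QUOTES))


def stream_query_string(expr: str) -> str:
    """Build the URL-encoded query string for a /stream request."""
    return urlencode({"expr": normalize_quotes(expr)})


# The expression as pasted in the thread; \u201d is the right curly quote.
pasted = (
    "search(myCore,zkHost=\u201d192.168.1.10:2181\",qt=\u201d/export\u201d,"
    "q=\u201d*:*\u201d,fl=\u201did\u201d,sort=\u201did asc\u201d)"
)

cleaned = normalize_quotes(pasted)
print(cleaned)
print("/stream?" + stream_query_string(pasted))
```

URL-encoding the expression also avoids problems with the spaces and quotes when the request is sent from a script rather than typed into a browser.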

Re: different numFound value /select vs. /export

Posted by Kudrettin Güleryüz <ku...@gmail.com>.
Thank you for responding.

I didn't go through the parsers involved; I assume they'd be the defaults.

I did notice later, though, that /export is core-specific. In fact, we have a
SolrCloud with 6 shards. I also found out that /stream can be used for
this, but I couldn't get a solution that works so far:
/stream?explain=true&expr=search(myCore,zkHost=”192.168.1.10:2181",qt=”/export”,q=”*:*”,
fl=”id”,sort=”id asc”)
returns

'search(myCore,zkHost=”192.168.1.10:2181\",qt=”/export”,q=”*:*”,
fl=”id”,sort=”id asc”)' is not a proper expression clause

Is my syntax wrong, or do I need to make schema- or config-level
changes to get this to work?


Re: different numFound value /select vs. /export

Posted by Erick Erickson <er...@gmail.com>.
First I’d make sure that you were using the same query parser in both situations. 

Second, export is specific to a core; it is not cloud-aware, so if this is SolrCloud I’d expect major differences. You haven’t told us how large the difference is: off by 5? 10,000?

Third, there was a bug at one point where export would leave off the last packet, IIRC. What version of Solr are you using?

Best,
Erick
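
One way to check the second point, sketched under assumptions that are not from the thread: compare /export on a single core with a /select that is restricted to that same core via distrib=false, so both numFound values cover the same documents. The host and core name below are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Hypothetical values; substitute your own host and core name.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "myCore_shard1_replica_n1"


def export_url(core: str) -> str:
    """/export request against a single core, as in the original question."""
    params = {"q": "*:*", "fl": "id", "sort": "id desc"}
    return f"{SOLR_BASE}/{core}/export?{urlencode(params)}"


def local_select_url(core: str) -> str:
    """/select limited to the same core.

    distrib=false stops the request from fanning out to the other shards,
    and rows=0 asks only for the count (numFound), not the documents.
    """
    params = {
        "q": "*:*",
        "fl": "id",
        "sort": "id desc",
        "rows": 0,
        "distrib": "false",
    }
    return f"{SOLR_BASE}/{core}/select?{urlencode(params)}"


print(export_url(CORE))
print(local_select_url(CORE))
```

If the two counts match with distrib=false but differ without it, the gap is simply the documents held by the other shards.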
