Posted to solr-user@lucene.apache.org by joe_coder <co...@gmail.com> on 2009/07/17 15:23:15 UTC

Solr MultiCore query

I am new to Solr/Lucene, so this may come across as a very naive question :)

There are 3 datasets over which I would like to implement search
functionality. The 3 datasets (let's call them D1, D2 and D3) have some fields
in common (like name, displayname, desc) and some specific fields (for example,
D1 has some additional text fields, and so on). The requirement for the
search is that, upon searching on a term, I would like to show a hybrid
result list containing the best results from D1, D2 and D3, and also use
faceting to tell users how many results are present in each of D1..D3, with
the ability to narrow down to one of those sets.

Considering the above requirements, I would like to get some help from the
community on the best way to design this:

1) Should I opt for 3 separate cores or a single core (with some
mandatory fields and the rest optional, to make faceting easy)?

2) How can I get the faceting to work?

3) How can I get spellcheck/MoreLikeThis to work (whether I choose a
single core or multiple cores)?

PS: I am planning to use SolrJ.

-- 
View this message in context: http://www.nabble.com/Solr-MultiCore-query-tp24534383p24534383.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr MultiCore query

Posted by "Manepalli, Kalyan" <KA...@orbitz.com>.
The default schema (in the multicore example) defines all the fields as type string, without any tokenizers. A string field is indexed as a single, unmodified term, so all queries have to be case sensitive, and a single word such as ipod will not match a longer field value.

For example, the query below would give results:
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=iPo*

Take a look at this wiki page for analyzers and tokenizers:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
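
To make the difference concrete, here is a minimal sketch in plain Java (no Solr involved) of how an untokenized string field and a tokenized, lowercased field would treat the same value. The class and method names are hypothetical, and the "analyzer" is just a whitespace split plus lowercasing, which only approximates what a real Solr text field type does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this does not call Solr or Lucene.
final class FieldMatchSketch {

    // A string field indexes the whole value as one term: exact match only.
    static boolean matchesStringField(String indexedValue, String term) {
        return indexedValue.equals(term);
    }

    // A prefix query such as iPo* is compared, case-sensitively,
    // against the start of that single term.
    static boolean prefixMatchesStringField(String indexedValue, String prefix) {
        return indexedValue.startsWith(prefix);
    }

    // Crude stand-in for an analyzer: lowercase, then split on whitespace.
    static List<String> analyze(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // A tokenized field matches when any token equals the lowercased term.
    static boolean matchesTokenizedField(String indexedValue, String term) {
        return analyze(indexedValue).contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String name = "iPod & iPod Mini USB 2.0 Cable";
        System.out.println(matchesStringField(name, "ipod"));       // false
        System.out.println(prefixMatchesStringField(name, "iPo"));  // true
        System.out.println(matchesTokenizedField(name, "ipod"));    // true
    }
}
```

Under these assumptions, name:ipod finds nothing in a string field, while the same term matches once the field is tokenized and lowercased.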

Thanks,
Kalyan Manepalli

Re: Solr MultiCore query

Posted by Code Tester <co...@gmail.com>.
Both schema.xml files (in example/multicore/core0/conf and
example/multicore/core1/conf) already have

<defaultSearchField>name</defaultSearchField>

Here are the query responses:

1)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">254</int>
  </lst>
  <result name="response" numFound="3" start="0">
    <doc>
      <str name="id">MA147LL/A</str>
      <str name="name">Apple 60 GB iPod with Video Playback Black</str>
    </doc>
    <doc>
      <str name="id">F8V7067-APL-KIT</str>
      <str name="name">Belkin Mobile Power Cord for iPod w/ Dock</str>
    </doc>
    <doc>
      <str name="id">IW-02</str>
      <str name="name">iPod &amp; iPod Mini USB 2.0 Cable</str>
    </doc>
  </result>
</response>

2)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod

No result

3)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=name:ipod

No result

What may be happening?

Thanks!


Re: Solr MultiCore query

Posted by ahammad <ah...@gmail.com>.
Hello joe_coder,

Are you using the default example docs in your queries?

If so, then I see that the word "ipod" appears in a field called "name". By
default, the default search field (defined in schema.xml) is the field
called "text". This means that when you submit a query without specifying
which field to search (using the field:query notation), Solr automatically
assumes that you are looking in the field called "text".

If you change your query to q=name:ipod, you should get the results back.

One way to avoid this is to change your default search field to something
else. Alternatively, if you want to search on multiple fields, you can copy
all those fields to the "text" field and go from there. This can be useful
if, for example, you had a book library to search through. You may need to
search on title, short summary, description etc. simultaneously. You can copy
all those things to the text field and then search on the text field, which
then contains all the information that you wanted to search on.
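
As a rough illustration of the two ideas above, the following plain-Java sketch (not Solr code) shows how a bare term ends up being searched in the default field, and how copying several fields into one catch-all field gives a single place to search. The class, the method names and the sample book are all made up for this example; in Solr itself the copying is declared with copyField entries in schema.xml rather than written by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this does not call Solr or Lucene.
final class DefaultFieldSketch {

    // A bare term is searched in the default field; field:term is left alone.
    static String qualify(String query, String defaultField) {
        return query.contains(":") ? query : defaultField + ":" + query;
    }

    // Rough analogue of copying several source fields into one catch-all field.
    static String catchAll(Map<String, String> doc, String... sourceFields) {
        StringBuilder text = new StringBuilder();
        for (String field : sourceFields) {
            String value = doc.get(field);
            if (value != null) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(value);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(qualify("ipod", "text"));       // text:ipod
        System.out.println(qualify("name:ipod", "text"));  // name:ipod

        // Made-up sample document for the book library example.
        Map<String, String> book = new LinkedHashMap<String, String>();
        book.put("title", "Dune");
        book.put("summary", "Desert planet epic");
        System.out.println(catchAll(book, "title", "summary", "description"));
        // Dune Desert planet epic
    }
}
```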





Re: Solr MultiCore query

Posted by Code Tester <co...@gmail.com>.
Thanks ahammad for the quick reply.

As suggested, I am trying out the multi-core way of implementing the search. I
am trying out the multicore example and am stuck on an issue. Here is
what I did and the issue I am facing:

1) Downloaded 1.4 and started the multicore example using
java -Dsolr.solr.home=multicore -jar start.jar

2) There were 2 files present under example/multicore/exampledocs/, which I
added to the 2 cores respectively. (In total, 3 docs are present in those 2 files,
and all of them contain the word 'ipod'.)

3) When I query using
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*
I get all 3 results.

But when I query using
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod
I get no results :(

What could be the issue ?

Thanks!



Re: Solr MultiCore query

Posted by ahammad <ah...@gmail.com>.
Hello,

I'm not sure what the best way to do this is, but I have done something
identical.

I have the same requirements, i.e. several datasources. I also used SolrJ and
JSP for this. The way I ended up doing it was to create a multi-core
environment, one core per datasource. When I do a query across several
datasources, I use shards. Solr automatically returns a "hybrid" result set
that way, sorted by Solr's default scoring.

Faceting comes into the picture when you want to show the number of documents
per datasource and have the ability to narrow down the result set. The way I
did it was to add a field called "dataSource" to all the documents and
give it a default value of the data source name (in your case,
D1, D2 ...). You can do this by adding this to the schema of each core, with the default set to that core's data source:

<field name="dataSource" type="string" indexed="true" stored="true"
required="true" default="D1"/> 

When you perform a query across multiple datasources, you will use shards.
Here is an example:

http://localhost:8080/solr/core1/select?shards=localhost:8080/solr/core1,localhost:8080/solr/core2&q=some+query

That will search on both cores 1 and 2.
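
For anyone assembling such a URL in code, here is a minimal sketch that uses only the JDK. The class and method names are hypothetical, the host and core names are the ones from the example above, and the shards value is deliberately left unencoded to mirror the URLs in this thread; only the user's query text is URL-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; builds the same kind of URL as the example above.
final class ShardUrlSketch {

    // baseCore and each shard are "host:port/solr/coreName", without a scheme.
    static String selectUrl(String baseCore, List<String> shards, String query) {
        return "http://" + baseCore + "/select"
                + "?shards=" + String.join(",", shards)
                + "&q=" + encode(query);
    }

    // URL-encode the query text so spaces and colons survive the request.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList(
                "localhost:8080/solr/core1",
                "localhost:8080/solr/core2");
        System.out.println(selectUrl("localhost:8080/solr/core1", shards, "some query"));
    }
}
```

Any one of the cores can receive the request; the shards parameter is what lists the cores that are actually searched.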

To facet on the datasource in order to categorize the result set,
you can simply add this snippet to the query:

&facet=on&facet.field=dataSource

This will return the datasources that are defined, each with its number of
results for the query.

Making the facet results clickable in order to narrow down the results can
be achieved by adding a filter query (fq) that restricts the results to a specific
dataSource, e.g. &fq=dataSource:D1. I actually ended up creating a fairly intuitive front-end for my
system with faceting, filtering, paging etc., all using JSP and SolrJ. SolrJ
is powerful enough to handle all of the backend processing.
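
The facet and filter parameters described above can be put together roughly as follows. This is a sketch using only the JDK, with hypothetical class and method names; it assumes the "dataSource" field from the schema snippet earlier in this message. With SolrJ, the same parameters would be set on a query object instead of being concatenated by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; builds the facet and filter part of a Solr request.
final class FacetParamsSketch {

    // dataSource == null means "no narrowing": facet counts for every source.
    static Map<String, String> params(String query, String dataSource) {
        Map<String, String> p = new LinkedHashMap<String, String>();
        p.put("q", query);
        p.put("facet", "on");
        p.put("facet.field", "dataSource");
        if (dataSource != null) {
            // The filter applied when a user clicks one facet value.
            p.put("fq", "dataSource:" + dataSource);
        }
        return p;
    }

    // Join the parameters as name=value pairs, URL-encoding both sides.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always available", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(params("ipod", null)));
        // q=ipod&facet=on&facet.field=dataSource
        System.out.println(toQueryString(params("ipod", "D1")));
        // q=ipod&facet=on&facet.field=dataSource&fq=dataSource%3AD1
    }
}
```

The first call corresponds to the initial hybrid result page with per-datasource counts; the second is what a click on the D1 facet would send.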

Good luck!









Re: Solr MultiCore query

Posted by joe_coder <co...@gmail.com>.
I missed adding some size related information in the query above.

D1 and D2 would have close to 1 million records each
D3 would have ~10 million records.

Thanks!