Posted to solr-user@lucene.apache.org by rajh <ro...@trimm.nl> on 2013/05/23 14:36:15 UTC

Restaurant availability from database

Hi,

I am building a website that lists restaurant information and I would also
like to include the availability information.

I've created a custom ValueSourceParser and ValueSource that retrieve the
availability information from a MySQL database. An example query is as
follows.

http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23,
2, 1700, 2359)

This results in a pseudo (boolean) field "available" per document result and
this works as expected. But my problem is that I also need the total number
of available restaurants.

Is there a way to count the number of available restaurants over the whole
result set? I tried the stats component, but it doesn't seem to work with
pseudo fields.

Thanks in advance,

Ronald





--
View this message in context: http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Restaurant availability from database

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.

Use this reference:
http://wiki.apache.org/solr/SpatialForTimeDurations


Alexandre Rafalovitch wrote
> On Thu, May 23, 2013 at 6:47 PM, Amit Nithian <anithian@> wrote:
>> Hossman did a presentation on something similar to this using spatial
>> data
>> at a Solr meetup some months ago.
>>
>> http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/
> 
> This presentation rocks (I like examples too). I would love to see
> this expanded into even longer form. It was a bit mind-bending to
> stare at those rectangle constraints and understand why they actually
> do map to the problem space.
> 
> Regards,
>    Alex.
> 
> 
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)





-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

Re: Restaurant availability from database

Posted by Alexandre Rafalovitch <ar...@gmail.com>.

On Thu, May 23, 2013 at 6:47 PM, Amit Nithian <an...@gmail.com> wrote:
> Hossman did a presentation on something similar to this using spatial data
> at a Solr meetup some months ago.
>
> http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/

This presentation rocks (I like examples too). I would love to see
this expanded into even longer form. It was a bit mind-bending to
stare at those rectangle constraints and understand why they actually
do map to the problem space.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)

Re: Restaurant availability from database

Posted by Amit Nithian <an...@gmail.com>.

Hossman did a presentation on something similar to this using spatial data
at a Solr meetup some months ago.

http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/

May be helpful to you.


On Thu, May 23, 2013 at 9:40 AM, rajh <ro...@trimm.nl> wrote:

> Thank you for your answer.
>
> Do you mean I should index the availability data as a document in Solr?
> Because the availability data in our databases is around 6,509,972 records
> and contains the availability per number of seats and per 15 minutes. I
> also
> tried this method, and as far as I know it's only possible to join the
> availability documents and not to include that information per result
> document.
>
> An example API response (created from the Solr response):
> {
>         "restaurants": [
>                 {
>                         "id": "13906",
>                         "name": "Allerlei",
>                         "zipcode": "6511DP",
>                         "house_number": "59",
>                         "available": true
>                 },
>                 {
>                         "id": "13907",
>                         "name": "Voorbeeld",
>                         "zipcode": "6512DP",
>                         "house_number": "39",
>                         "available": false
>                 }
>         ],
>         "resultCount": 12156,
>         "resultCountAvailable": 55,
> }
>
> I'm currently hacking around the problem by executing the search again with
> a very high value for the rows parameter and counting the number of
> available restaurants on the backend, but this causes a big performance
> impact (as expected).

Re: Restaurant availability from database

Posted by rajh <ro...@trimm.nl>.

Thank you for your answer.

Do you mean I should index the availability data as a document in Solr?
Because the availability data in our databases is around 6,509,972 records
and contains the availability per number of seats and per 15 minutes. I also
tried this method, and as far as I know it's only possible to join the
availability documents and not to include that information per result
document.

An example API response (created from the Solr response):
{
	"restaurants": [
		{
			"id": "13906",
			"name": "Allerlei",
			"zipcode": "6511DP",
			"house_number": "59",
			"available": true
		},
		{
			"id": "13907",
			"name": "Voorbeeld",
			"zipcode": "6512DP",
			"house_number": "39",
			"available": false
		}
	],
	"resultCount": 12156,
	"resultCountAvailable": 55
}

I'm currently hacking around the problem by executing the search again with
a very high value for the rows parameter and counting the number of
available restaurants on the backend, but this causes a big performance
impact (as expected).





Re: Restaurant availability from database

Posted by Alexandre Rafalovitch <ar...@gmail.com>.

Check out Gilt's presentation. It might give you some ideas, including
possibly on refactoring your entities around 'availability' as a
document:
http://www.lucenerevolution.org/sites/default/files/Personalized%20Search%20on%20the%20Largest%20Flash%20Sale%20Site%20in%20America.pdf

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 23, 2013 at 8:36 AM, rajh <ro...@trimm.nl> wrote:
> Hi,
>
> I am building a website that lists restaurant information and I would also
> like to include the availability information.
>
> I've created a custom ValueSourceParser and ValueSource that retrieve the
> availability information from a MySQL database. An example query is as
> follows.
>
> http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23,
> 2, 1700, 2359)
>
> This results in a pseudo (boolean) field "available" per document result and
> this works as expected. But my problem is that I also need the total number
> of available restaurants.
>
> Is there a way to count the number of available restaurants over the whole
> result set? I tried the stats component, but it doesn't seem to work with
> pseudo fields.
>
> Thanks in advance,
>
> Ronald

Re: Restaurant availability from database

Posted by Chris Hostetter <ho...@fucit.org>.

: I've created a custom ValueSourceParser and ValueSource that retrieve the
: availability information from a MySQL database. An example query is as
: follows.
: 
: http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23,
: 2, 1700, 2359)
: 
: This results in a pseudo (boolean) field "available" per document result and
: this works as expected. But my problem is that I also need the total number
: of available restaurants.

1) "restaurant_id:*" is not doing what you think it is doing, use "*:*" or 
add an "is_restaurant" boolean field and query on that instead and you 
will probably discover that your queries for all docs (or all 
restaurants) get much much faster.

2) if you've already built a custom ValueSourceParser that you're really 
happy with, and you just want to filter your Solr results based on the 
output of that custom ValueSource, you can do so by leveraging the frange 
QParser.  If your custom value source returns a boolean, then you just 
have to be a bit tricky with the function range you ask for...

 fq={!frange cache=false cost=1000 l=1}if(availability(2013-05-23,2,1700,2359),5,0)

A few things to note in this example:

a) i'm using the if() function to map true to "5" (arbitrary) and false to 
"0" (also arbitrary) and then filtering to only match documents whose 
value is "1" (arbitrary) or higher ... you can pick any values you want

b) unlike using your custom value source in the "fl", when used in an fq 
your ValueSource function will be called for a *lot* of documents -- so you 
probably want to batch request the availability when the ValueSourceParser 
is called, for fast lookup on each individual document.

c) i've specified cache=false and a high cost param on the frange to 
ensure that the custom ValueSource is only ever asked about the 
availability of documents that already match our main query and any other 
filter queries.

3) if you don't want to filter or otherwise modify the result set by the 
results of your custom ValueSource, and you just need the count of available 
documents matching your main query (independent of the numFound count of 
docs matching your main query), you can use the same technique in a 
"facet.query".

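For what it's worth, here is a rough sketch of how point 3 might look from the client side. It is only an illustration under assumptions: availability() is the custom function from the original question (it exists only where that ValueSourceParser is registered), the host and core are copied from the example URL above, and the helper names count_params and extract_counts are made up for this sketch. The frange string is the same one used in the fq example.

```python
from urllib.parse import urlencode

# Custom function and arguments taken from the original question; this only
# works on a Solr instance where that ValueSourceParser is registered.
AVAILABILITY = "availability(2013-05-23,2,1700,2359)"

# Same frange as in the fq example: if() maps true to 5 and false to 0,
# and l=1 keeps only documents whose value is 1 or higher.
FRANGE = "{!frange cache=false cost=1000 l=1}if(" + AVAILABILITY + ",5,0)"


def count_params(main_query="*:*", rows=10):
    """Build request parameters: normal results plus one facet.query count.

    Hypothetical helper, not part of any Solr client library.
    """
    return {
        "q": main_query,
        "rows": rows,
        "wt": "json",
        "fl": "*,available:" + AVAILABILITY,
        "facet": "true",
        "facet.query": FRANGE,
    }


def extract_counts(solr_response):
    """Read both totals from a parsed JSON response.

    Facet query counts are keyed by the raw facet.query string.
    """
    facet_queries = solr_response["facet_counts"]["facet_queries"]
    return {
        "resultCount": solr_response["response"]["numFound"],
        "resultCountAvailable": facet_queries[FRANGE],
    }


if __name__ == "__main__":
    # Assumed host and core, copied from the URL in the question.
    base = "http://localhost:8983/solr/collection1/select"
    print(base + "?" + urlencode(count_params()))
```

The two values returned by extract_counts correspond to the "resultCount" and "resultCountAvailable" fields in the example API response earlier in the thread, so the second high-rows search should no longer be needed.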

-Hoss