Posted to solr-user@lucene.apache.org by Stephen Weiss <sw...@stylesight.com> on 2008/11/07 05:19:02 UTC

Solr with Wordpress - Anyone doing this?

Hi,

We recently implemented Solr for one major search component of our  
site, and now that this is complete we're turning to other areas of  
our site to see where Solr can help us improve results relevancy and  
performance.

One major area where I think Solr could do a lot of good is to replace  
Wordpress's search function.  Wordpress powers a solid 1/3 of our  
site, and moving this over could provide dramatic performance  
benefits.  I see there is a Lucene plugin for WP but I have not seen  
any plugin yet using Solr.  While I'm not terribly afraid of writing  
one (we've already completely replaced the built-in routine with our  
own plugin to optimize MySQL searching), it would of course be even  
better if there was some sort of plugin already out there (why  
reinvent the wheel)?  Somehow it just seems strange that no one would  
have tried this yet.

I figure if anyone knows, someone on this list knows.  Thanks for any  
info!

--
Steve

Re: Solr with Wordpress - Anyone doing this?

Posted by Michael Kimsal <mg...@gmail.com>.
Agree with Grant - this is likely because there's a PHP-based Lucene library
(maybe more, but definitely one from Zend in Zend Framework) that can be
used to read/write directly to Lucene files.  Adding Solr to the mix might
bring more benefit, as from what I've been told the Zend Lucene stuff isn't
terribly fast (haven't used it myself, this was the experience of a friend),
but you'd also need to mix in a Java application, whereas most WP systems
are traditionally hosted in shared LAMP setups, where Java apps aren't
possible.
A hosted Solr service might be useful for people in those situations, and
I considered building one last year, but I'm not sure the demand is
there.


On Mon, Nov 10, 2008 at 9:43 AM, Grant Ingersoll <gs...@apache.org> wrote:

> I don't know of anyone that has done this, but I would welcome it as well.
>  I suspect the main issue is that most WP users live in a shared hosting
> world, where Java doesn't play very nicely.
>
> That being said, it would be fairly easy to use the DataImportHandler's
> feed import for indexing (I think) and then it's just a matter of pointing
> the search box at the Solr instance, I suppose.


-- 
Michael Kimsal
http://www.groovymag.com
for groovy and grails developers

Re: Solr with Wordpress - Anyone doing this?

Posted by Stephen Weiss <sw...@stylesight.com>.
Unfortunately I don't think it's that sophisticated...  There was a  
request out in the Wordpress world for an extendable search interface  
(like Drupal) but I don't think it got much traction.  The plugin we  
use now for searching simply implements hooks that modify the query (a  
lot) before it goes to the DB.

Our use of Wordpress is not exactly typical... for us it's more of a  
CMS than a blog.  We'd have no problem integrating it with Java...   
Right now I'm still using Solr 1.2 (I'm waiting for the bug reports  
on 1.3 to settle down, especially in regard to DataImportHandler), so  
I'm probably not going to be using DataImportHandler.  My plan was  
just to have it set up to send add document commands whenever a post  
is published or modified (after being published).
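
As a rough illustration of the "send add document commands" approach, here is
a minimal Python sketch (not the actual plugin, which would be PHP) that
builds a Solr XML <add> message for one post.  The field names (id, title,
content) and the update URL in the comment are assumptions; they have to
match whatever is defined in your own schema.xml and Solr install.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(doc):
    """Serialize a dict of field name -> value (or list of values)
    as a Solr XML <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A list or tuple becomes one <field> per value (multi-valued field).
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


def post_to_solr(message, update_url):
    """POST an XML message to Solr's update handler.

    update_url is an assumption, e.g. http://localhost:8983/solr/update
    """
    req = urllib.request.Request(
        update_url,
        data=message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical post data; in WordPress this would come from the publish hook.
post = {"id": 42, "title": "Hello & welcome", "content": "First post"}
message = build_add_message(post)
print(message)
```

The add only becomes visible to searches after a <commit/> message is sent to
the same update URL, so a real plugin would need to send one (or rely on
autocommit) after each batch of changes.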

Right now I'm looking at a plugin that exists to integrate WP with  
Lucene (using ZendSearchLucene, I think).  I think it could be fairly  
easily modified to work with Solr instead (it just has to send  
commands over HTTP instead of using direct access to the index).  My  
only concern is that we also want to use faceting, which that plugin  
doesn't really implement at all.
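
For the query side, searching with faceting over HTTP comes down to adding a
few parameters to the select URL.  A minimal Python sketch, assuming a Solr
instance at http://localhost:8983/solr/select and hypothetical facet fields
named category and author (neither is from this thread):

```python
from urllib.parse import urlencode


def build_search_url(base_url, terms, facet_fields, rows=10):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [("q", terms), ("rows", rows), ("facet", "true")]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", f) for f in facet_fields]
    return base_url + "?" + urlencode(params)


url = build_search_url(
    "http://localhost:8983/solr/select",  # assumed location of the Solr instance
    "winter coats",
    ["category", "author"],               # hypothetical schema fields
)
print(url)
```

The plugin would fetch this URL instead of running the MySQL query, then
render both the matching documents and the facet counts from the response.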

Since I didn't find anything or get a response for a while, we've  
already started working on it...  Nothing is complete yet, but so far  
I can see everywhere I think things need to be modified (it's not the  
top priority, so we're not exactly moving quickly).  It's not so bad,  
just work.  Since I'm not using DataImportHandler, my  
plugin may not end up being very useful to others but perhaps I'll put  
it up somewhere anyway as a rough example.

Thanks for the replies.

--
Steve


On Nov 10, 2008, at 11:27 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:

> I'm not sure what kind of interfaces WordPress exposes. Does it have a
> DB/REST end point?
>
> If so, it would be very easy to write a sample data-config.xml for  
> WordPress.
>
> --Noble


Re: Solr with Wordpress - Anyone doing this?

Posted by Ryan McKinley <ry...@gmail.com>.
A simple data-config.xml hitting a SQL db is all we need.  Here is a  
WordPress DB layout:

mysql> show tables;
+-----------------------+
| Tables_in_india       |
+-----------------------+
| wp_comments           |
| wp_links              |
| wp_options            |
| wp_postmeta           |
| wp_posts              |
| wp_term_relationships |
| wp_term_taxonomy      |
| wp_terms              |
| wp_usermeta           |
| wp_users              |
+-----------------------+
10 rows in set (0.00 sec)

mysql> describe wp_posts;
+-----------------------+---------------------+------+-----+---------------------+----------------+
| Field                 | Type                | Null | Key | Default             | Extra          |
+-----------------------+---------------------+------+-----+---------------------+----------------+
| ID                    | bigint(20) unsigned | NO   | PRI | NULL                | auto_increment |
| post_author           | bigint(20)          | NO   |     | 0                   |                |
| post_date             | datetime            | NO   |     | 0000-00-00 00:00:00 |                |
| post_date_gmt         | datetime            | NO   |     | 0000-00-00 00:00:00 |                |
| post_content          | longtext            | NO   |     | NULL                |                |
| post_title            | text                | NO   |     | NULL                |                |
| post_category         | int(4)              | NO   |     | 0                   |                |
| post_excerpt          | text                | NO   |     | NULL                |                |
| post_status           | varchar(20)         | NO   |     | publish             |                |
| comment_status        | varchar(20)         | NO   |     | open                |                |
| ping_status           | varchar(20)         | NO   |     | open                |                |
| post_password         | varchar(20)         | NO   |     |                     |                |
| post_name             | varchar(200)        | NO   | MUL |                     |                |
| to_ping               | text                | NO   |     | NULL                |                |
| pinged                | text                | NO   |     | NULL                |                |
| post_modified         | datetime            | NO   |     | 0000-00-00 00:00:00 |                |
| post_modified_gmt     | datetime            | NO   |     | 0000-00-00 00:00:00 |                |
| post_content_filtered | text                | NO   |     | NULL                |                |
| post_parent           | bigint(20)          | NO   |     | 0                   |                |
| guid                  | varchar(255)        | NO   |     |                     |                |
| menu_order            | int(11)             | NO   |     | 0                   |                |
| post_type             | varchar(20)         | NO   | MUL | post                |                |
| post_mime_type        | varchar(100)        | NO   |     |                     |                |
| comment_count         | bigint(20)          | NO   |     | 0                   |                |
+-----------------------+---------------------+------+-----+---------------------+----------------+
24 rows in set (0.01 sec)
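
Whichever way the rows reach Solr (DataImportHandler or a plugin), each
wp_posts row has to be mapped to index fields.  A minimal Python sketch of
that mapping, using the column names from the table above.  The Solr field
names (id, title, content, modified) and the choice to index only published
posts and pages are assumptions, not something settled in this thread.

```python
from datetime import datetime


def post_row_to_doc(row, indexable_types=("post", "page")):
    """Map one wp_posts row (as a dict) to a Solr document dict.

    Returns None for rows that should not be indexed (drafts, revisions,
    attachments and so on).
    """
    if row["post_status"] != "publish" or row["post_type"] not in indexable_types:
        return None
    # MySQL datetime -> the ISO 8601 UTC form Solr date fields expect.
    modified = datetime.strptime(row["post_modified_gmt"], "%Y-%m-%d %H:%M:%S")
    return {
        "id": row["ID"],
        "title": row["post_title"],
        "content": row["post_content"],
        "modified": modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Hypothetical row, shaped like the result of SELECT * FROM wp_posts.
row = {
    "ID": 7,
    "post_title": "Hello",
    "post_content": "Body text",
    "post_status": "publish",
    "post_type": "post",
    "post_modified_gmt": "2008-11-07 05:19:02",
}
doc = post_row_to_doc(row)
print(doc)
```

In a data-config.xml the same filter would sit in the entity's SQL query as a
WHERE clause on post_status and post_type.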





On Nov 10, 2008, at 11:27 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:

> I'm not sure what kind of interfaces WordPress exposes. Does it have a
> DB/REST end point?
>
> If so, it would be very easy to write a sample data-config.xml for  
> WordPress.
>
> --Noble


Re: Solr with Wordpress - Anyone doing this?

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
I'm not sure what kind of interfaces WordPress exposes. Does it have a
DB/REST end point?

If so, it would be very easy to write a sample data-config.xml for WordPress.

--Noble

On Mon, Nov 10, 2008 at 8:13 PM, Grant Ingersoll <gs...@apache.org> wrote:
> I don't know of anyone that has done this, but I would welcome it as well.
>  I suspect the main issue is that most WP users live in a shared hosting
> world, where Java doesn't play very nicely.
>
> That being said, it would be fairly easy to use the DataImportHandler's feed
> import for indexing (I think) and then it's just a matter of pointing the
> search box at the Solr instance, I suppose.



-- 
--Noble Paul

Re: Solr with Wordpress - Anyone doing this?

Posted by Grant Ingersoll <gs...@apache.org>.
I don't know of anyone that has done this, but I would welcome it as  
well.  I suspect the main issue is that most WP users live in a shared  
hosting world, where Java doesn't play very nicely.

That being said, it would be fairly easy to use the  
DataImportHandler's feed import for indexing (I think) and then it's  
just a matter of pointing the search box at the Solr instance, I  
suppose.

On Nov 6, 2008, at 11:19 PM, Stephen Weiss wrote:

> Hi,
>
> We recently implemented Solr for one major search component of our  
> site, and now that this is complete we're turning to other areas of  
> our site to see where Solr can help us improve results relevancy and  
> performance.
>
> One major area where I think Solr could do a lot of good is to  
> replace Wordpress's search function.  Wordpress powers a solid 1/3  
> of our site, and moving this over could provide dramatic performance  
> benefits.  I see there is a Lucene plugin for WP but I have not seen  
> any plugin yet using Solr.  While I'm not terribly afraid of writing  
> one (we've already completely replaced the built-in routine with our  
> own plugin to optimize MySQL searching), it would of course be even  
> better if there was some sort of plugin already out there (why  
> reinvent the wheel)?  Somehow it just seems strange that no one  
> would have tried this yet.
>
> I figure if anyone knows, someone on this list knows.  Thanks for  
> any info!
>
> --
> Steve

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ