Posted to solr-user@lucene.apache.org by Todd Hunt <To...@nisc.coop> on 2013/05/20 21:02:51 UTC

Existing Project using Hibernate, Spring and Lucene and Looking to Add Solr

Hi,

We have an existing Java-based enterprise application that is bundled as a WAR file, runs on Tomcat, and uses Spring 3.0.5, Hibernate 3.6.2, and Lucene 3.0.3.  We are using annotations in Hibernate that nicely couple it to Lucene to index objects (documents, images, PDFs, etc.) based on key-value pairs.  We use Hibernate Search to retrieve the results we are looking for.

We want to extend our indexing capability to use Tika to extract text and metadata out of documents that are uploaded to the server and index that content.

When I initially read about Solr I saw that it would provide extra functionality on top of Lucene.  I was eager to get it integrated with our application.  But now that I have fully read "Apache Solr 3 Enterprise Search Server" I feel that my initial impressions of Solr were wrong.

I saw where Solr talked about using web services to upload files for indexing and also to perform searching and download content.  I thought that was just a nice feature that was available.  But I was not interested in that due to the fact that our application already has a web service interface that is used by our own home grown client application that communicates with the enterprise application above.

I've read about SolrJ / Solr Cell, EmbeddedSolrServer, BackendQueueProcessor, and DIH and researched them on the web.  But none of them has given me the information I need to take a Hibernate-managed object, inside of a transaction, persist the binary data in the database (which we are already doing), extract the text / contents from the binary file via Tika (which is a separate issue for a separate thread), and index that text with either Java API code or Java annotations.

It seems like Solr forces one to expose access to its "Cores" (indexes) via its own WAR file.  I don't want that.  I just want to be able to utilize the Solr Java API to integrate with our current web services and Hibernate framework to index text-based documents, and then allow our users to perform open text searching and utilize Solr's advanced features like highlighting, MLT, spell checking, suggester and faceting.  But I just don't see how to integrate what Solr has to offer with our existing web application.  I get the feeling that I have to create a new Solr-based web application and then have the current application delegate indexing and searching to the Solr application, which is not what I really want to do, if possible.

I've looked through the Solr Javadocs and haven't found anything substantial that would allow me to use Java code alone, instead of creating HTTP connections, to index and search for data.  Could someone tell me whether what I am looking for is outside the scope of Solr's functionality?  If there is a way, please provide an example of how I can accomplish this.

Thank you,

Todd


RE: Existing Project using Hibernate, Spring and Lucene and Looking to Add Solr

Posted by Todd Hunt <To...@nisc.coop>.
Shawn,

Thank you so much for your help.  I appreciate you taking the time to confirm what I had a hunch about with Solr.  Sometimes I can't see the forest for the trees.  I was hoping Solr would help me out, but like you said, since I have an understanding of what Lucene is doing and I already have Lucene integrated with Hibernate, Solr really doesn't add any value in this case.

Thanks again,

Todd

-----Original Message-----
From: Shawn Heisey [mailto:solr@elyograg.org] 
Sent: Monday, May 20, 2013 2:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Existing Project using Hibernate, Spring and Lucene and Looking to Add Solr

On 5/20/2013 1:02 PM, Todd Hunt wrote:
<snip>
> It seems like Solr forces one to expose access to its "Cores" (indexes) via its own WAR file.  I don't want that.  I just want to be able to utilize the Solr Java API to integrate with our current web services and Hibernate framework to index text based documents.  Then allow our users to perform open text searching and utilize Solr's advance features like highlighting, MLT, spell checking, suggester and faceting.  But I just don't see how to integrate what Solr has to offer with our existing web application.  I get the feeling that I have to create a new Solr based web application and then have the current application delegate indexing and searching to the Solr application, which is not what I really want to do, if possible.

I've removed most of your email and just quoted the one paragraph above.  You have pretty much described the right way to use Solr.  Solr is awesome for new projects, because the amount of user code required to interface with Solr is usually very small.  Most of the heavy lifting is done server-side, in the configuration.

People like yourself that are highly experienced with custom Lucene applications often find Solr too restrictive.  Solr does provide extra functionality on top of Lucene, but it does NOT expose all of Lucene's capability, especially in the newest versions.

Migrating from Lucene to Solr isn't for everyone.  If you have a deep understanding of Lucene and your existing application is intricately tied to it, you should probably stick with Lucene and just upgrade to the newest stable release, because chances are that the way Solr uses Lucene is not completely compatible with your existing methods.  From what I've been told, the upgrade from Lucene 3.x to 4.x does require a lot of refactoring work on user code.

If you do decide to implement Solr, the recommendation is to use the .war and make connections from client code with HttpSolrServer or CloudSolrServer.  Although you CAN use EmbeddedSolrServer to embed the entire Solr application in your program and avoid HTTP, this is not recommended, and it doesn't do anything to change the fact that your Lucene code may be fundamentally different than Solr.  To completely duplicate your Lucene application you might have to write custom Solr components ... and if you start doing that, you might as well simply maintain your existing code through version upgrades.  Lucene is not going away, and a given version of Lucene will likely always have functionality beyond the same version of Solr.

Thanks,
Shawn


Re: Existing Project using Hibernate, Spring and Lucene and Looking to Add Solr

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/20/2013 1:02 PM, Todd Hunt wrote:
<snip>
> It seems like Solr forces one to expose access to its "Cores" (indexes) via its own WAR file.  I don't want that.  I just want to be able to utilize the Solr Java API to integrate with our current web services and Hibernate framework to index text based documents.  Then allow our users to perform open text searching and utilize Solr's advance features like highlighting, MLT, spell checking, suggester and faceting.  But I just don't see how to integrate what Solr has to offer with our existing web application.  I get the feeling that I have to create a new Solr based web application and then have the current application delegate indexing and searching to the Solr application, which is not what I really want to do, if possible.

I've removed most of your email and just quoted the one paragraph above.
You have pretty much described the right way to use Solr.  Solr is
awesome for new projects, because the amount of user code required to 
interface with Solr is usually very small.  Most of the heavy lifting is 
done server-side, in the configuration.

People like yourself that are highly experienced with custom Lucene 
applications often find Solr too restrictive.  Solr does provide extra 
functionality on top of Lucene, but it does NOT expose all of Lucene's 
capability, especially in the newest versions.

Migrating from Lucene to Solr isn't for everyone.  If you have a deep 
understanding of Lucene and your existing application is intricately 
tied to it, you should probably stick with Lucene and just upgrade to 
the newest stable release, because chances are that the way Solr uses 
Lucene is not completely compatible with your existing methods.  From 
what I've been told, the upgrade from Lucene 3.x to 4.x does require a 
lot of refactoring work on user code.

If you do decide to implement Solr, the recommendation is to use the 
.war and make connections from client code with HttpSolrServer or 
CloudSolrServer.  Although you CAN use EmbeddedSolrServer to embed the 
entire Solr application in your program and avoid HTTP, this is not 
recommended, and it doesn't do anything to change the fact that your 
Lucene code may be fundamentally different than Solr.  To completely 
duplicate your Lucene application you might have to write custom Solr 
components ... and if you start doing that, you might as well simply 
maintain your existing code through version upgrades.  Lucene is not 
going away, and a given version of Lucene will likely always have 
functionality beyond the same version of Solr.

Thanks,
Shawn
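
For anyone who does go the separate-webapp route discussed in this thread, here is a rough sketch of what delegating a search to Solr over HTTP could look like, using only the JDK.  It builds a request URL for Solr's /select handler with highlighting and faceting switched on.  The host and port, the core name "documents", and the field names "content" and "contentType" are all made up for illustration; q, hl, hl.fl, facet, facet.field and wt are standard Solr request parameters.  A SolrJ client would normally send these same parameters for you, so treat this as an illustration of the request rather than a recommended client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class SolrSelectUrl {

    /**
     * Builds a /select URL for a user query with highlighting and faceting enabled.
     * baseUrl is the core's base URL; the field names are supplied by the caller
     * (hypothetical in this example, they depend on your schema).
     */
    static String build(String baseUrl, String userQuery,
                        String highlightField, String facetField) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", userQuery);
        params.put("hl", "true");             // turn highlighting on
        params.put("hl.fl", highlightField);  // field to highlight
        params.put("facet", "true");          // turn faceting on
        params.put("facet.field", facetField);
        params.put("wt", "json");             // ask for a JSON response

        StringBuilder url = new StringBuilder(baseUrl).append("/select?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical core and field names, for illustration only.
        System.out.println(build("http://localhost:8983/solr/documents",
                "annual report", "content", "contentType"));
    }
}
```

The existing web service layer would issue a GET against the resulting URL and parse the response, so the home-grown client never has to talk to Solr directly.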