Posted to dev@lucene.apache.org by archit mehta <pa...@gmail.com> on 2016/08/30 17:09:55 UTC

Lucene or Apache Solr : Project Decision making

Hi,

We need to decide whether to go with Lucene or Solr. There are a few
points I would like to mention.

1. If we use Lucene, we do not have to worry about security, as that is
already taken care of, but we need to build our own distributed indexer and
searcher. If we use Solr, we don't have to worry about the distributed
indexer and searcher, but since it runs as a separate process we have to put
some security controls in place.

In our case, getting permission for Solr is a bit difficult; Lucene is already
in production (without the distribution part).

2. Does Solr use Kafka, ZooKeeper, or other third-party libraries? Can I
get a list from somewhere?
Our server is heavily loaded, so a new process plus running Kafka/ZooKeeper
would be additional overhead for us.
In our current implementation we removed Kafka and wrote some of our own code.

How easy or difficult is it to build a distributed indexer and searcher with
core Lucene?

Kindly share your views based on the points I have mentioned here. If
any more clarification is required, write back to me.


Regards,
Archit

Re: Lucene or Apache Solr : Project Decision making

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Hi Archit, I would make a strong argument for using Solr unless you have
some exotic requirements.

- Solr has distributed indexing and search built in; building your own
distributed system is non-trivial, just ask Mark Miller :)
- Solr comes prebaked with an HTTP API for non-search experts to interact
with.
- For hiring, it's more likely you'll find a Solr expert than a Lucene
expert.
- Custom capabilities can be handled by Solr plugins that specialize bits
and pieces of Solr to your needs.
- You can pretty easily proxy Solr for security, with anything from a dumb
nginx proxy to a tad bit of custom code.
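
To give a rough sense of what "building your own" involves, here is a
minimal sketch in plain Java (no Lucene or Solr classes) of the two simplest
pieces of a hand-rolled distributed setup: routing a document to a shard at
index time, and merging per-shard results into a global top-k at query time.
Every name in it (ScatterGatherSketch, Hit, shardFor, mergeTopK) is
hypothetical and made up for illustration; none of them is a Lucene or Solr
API. It assumes each shard has already run the query locally and returned
its own best hits with scores, and that routing by a hash of the document id
is acceptable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these names are Lucene or Solr APIs.
public class ScatterGatherSketch {

    /** One hit as a shard might report it: document id, source shard, score. */
    public static final class Hit {
        public final String docId;
        public final int shard;
        public final float score;

        public Hit(String docId, int shard, float score) {
            this.docId = docId;
            this.shard = shard;
            this.score = score;
        }
    }

    /** Index side: pick a shard by hashing the document id (assumed routing rule). */
    public static int shardFor(String docId, int numShards) {
        // floorMod keeps the result in [0, numShards) even for negative hash codes.
        return Math.floorMod(docId.hashCode(), numShards);
    }

    /** Search side: merge each shard's top hits into one global top-k list. */
    public static List<Hit> mergeTopK(List<List<Hit>> perShardHits, int k) {
        Comparator<Hit> byScore = Comparator.comparingDouble(h -> h.score);
        // Min-heap on score: the weakest of the current best k is evicted first.
        PriorityQueue<Hit> best = new PriorityQueue<>(byScore);
        for (List<Hit> shardHits : perShardHits) {
            for (Hit hit : shardHits) {
                best.offer(hit);
                if (best.size() > k) {
                    best.poll();
                }
            }
        }
        List<Hit> merged = new ArrayList<>(best);
        merged.sort(byScore.reversed()); // highest score first
        return merged;
    }

    public static void main(String[] args) {
        List<List<Hit>> perShard = new ArrayList<>();
        perShard.add(List.of(new Hit("a", 0, 3.0f), new Hit("b", 0, 1.0f)));
        perShard.add(List.of(new Hit("c", 1, 2.5f), new Hit("d", 1, 0.5f)));
        for (Hit hit : mergeTopK(perShard, 3)) {
            System.out.println(hit.docId + " (shard " + hit.shard + ") score=" + hit.score);
        }
    }
}
```

This covers only the easy part. Scores computed against separate indexes are
not strictly comparable, because term statistics differ per index, and the
sketch says nothing about replication, node failure, or coordinating commits
across shards. Those are the pieces that tend to make a home-grown
distributed layer non-trivial.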

I might consider using just Lucene in cases like these:
- I really just want a Java library that does search-like operations under
the hood, and the consumers of my code neither know nor care that there's
"search" involved.
- I'm doing something data-sciency with Lucene, my problem doesn't resemble
search, and I want direct control (e.g. classification).

(Note: Elasticsearch would have similar capabilities and pros/cons vs Solr,
but Solr vs ES is a whole 'nother conversation and I don't want to
hijack your thread.)

-Doug



On Tue, Aug 30, 2016 at 1:24 PM Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> SolrCloud uses ZooKeeper. As to the rest, Solr is open source. It may be
> more efficient to strip out whatever you don't want than to reinvent it on
> top of Lucene again.
>
> Regards,
>     Alex
>

Re: Lucene or Apache Solr : Project Decision making

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
SolrCloud uses ZooKeeper. As to the rest, Solr is open source. It may be
more efficient to strip out whatever you don't want than to reinvent it on
top of Lucene again.

Regards,
    Alex
