Posted to solr-user@lucene.apache.org by Dc Tech <dc...@gmail.com> on 2020/01/15 04:02:17 UTC

Coming back to search after some time... SOLR or Elastic for text search?

I am a SOLR fan and implemented it at our company over 10 years ago.
I moved away from that role, and in the meantime the new search team
implemented a proprietary (and expensive) NoSQL-style search engine. That
project did not go well, and now I am back on the project and reviewing the
technology stack.

Some of the team think that ElasticSearch could be a good option,
especially since we can easily get hosted versions with AWS where we have
all the contractual stuff sorted out.

While SOLR definitely seems more advanced (LTR, streaming expressions,
graph, and all the knobs and dials for relevancy tuning), Elastic may be
sufficient for our needs. It does not seem to have LTR out of the box but
the relevancy tuning knobs and dials seem to be similar to what SOLR has.

The corpus size is not a challenge - we have about one million documents,
of which about half have full text, while the rest are simpler (e.g. company
directory etc.).
The query volumes are also quite low (max 5/second at peak).
We have implemented the content ingestion and processing pipelines already
in Python and Spark, so most of the data will be pushed in using APIs.
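
As a rough illustration of what "pushed in using APIs" could look like from Python, here is a minimal sketch that builds the request bodies for Solr's JSON update handler and for Elasticsearch's bulk API. The index name, field names and sample documents are made up for the example, and nothing is actually sent over the network.

```python
import json


def solr_update_body(docs):
    """Solr's JSON update handler accepts a plain JSON array of documents."""
    return json.dumps(docs)


def es_bulk_body(index, docs, id_field="id"):
    """Elasticsearch's _bulk API takes newline-delimited JSON:
    one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents: one directory entry, one full-text document.
docs = [
    {"id": "emp-1", "name": "Jane Doe", "department": "Finance"},
    {"id": "doc-1", "title": "Travel policy", "body": "Full text goes here."},
]

solr_body = solr_update_body(docs)
es_body = es_bulk_body("intranet", docs)
```

The Solr body would be POSTed to /solr/<collection>/update with Content-Type application/json, and the Elasticsearch body to /_bulk with Content-Type application/x-ndjson. Either way the pipeline code stays much the same.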

I would really appreciate any guidance from the community !!

Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Charlie Hull <ch...@flax.co.uk>.
On 15/01/2020 11:42, Dc Tech wrote:
> Thank you Jan and Charlie.
>
> I should say that in terms of posting to the community regarding Elastic vs Solr - this is probably the most civil and helpful community that I have been a part of - and your answers have only reinforced that  notion !!
>
> Thank you for your responses. I am glad to hear that both can do most of it, which was my gut feeling as well.
>
> Charlie, to your point - the team probably feels that Elastic  is easier to get started with hence the preference, as well as the hosting options (with the caveats you noted). Agree with you completely that tech is not the real issue.
>
> Jan,  agree with  the points you made on team skills.  On our previous proprietary engine - that was in fact the biggest issue - the engine was powerful enough and had good references.  However, we were not able to exploit it to good effect.

Hi again,

The dirty secret that few will voice is that...most search engines are 
basically the same. Once you've worked on a search project you can apply 
those skills to any future search engine. This is why I'm currently 
focused on supporting the search team, not the search tech - how do you 
learn and improve those relevance tuning skills, considering it's 
really, really hard to recruit people with existing high-level search 
skills (and if you can find them you probably can't afford them).

Cheers

Charlie

-- 
Charlie Hull
Flax - Open Source Enterprise Search, now part of OSC

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com


Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Walter Underwood <wu...@wunderwood.org>.
Elasticsearch is easier to set up the first time, but that should not be a deciding factor. Decide on features, not something you’ll do once.

ES has most of its configuration power at query time; Solr has most of its at index time. If every query is different, as in log search, ES will be better. If queries differ only in the text, keeping filters, fields, and weighting the same, Solr is probably best.
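
To make the "queries differ only in the text" case concrete, here is a small sketch, not a recommendation. The field names, boosts and filter are invented for the example. On the Solr side the tuned settings are fixed once (they could equally live as defaults on a request handler in solrconfig.xml) and only q changes per request. On the ES side the whole query body is typically assembled by the client on every request.

```python
from urllib.parse import urlencode

# Tuned once, reused for every request. Hypothetical fields and boosts.
SOLR_DEFAULTS = {
    "defType": "edismax",        # extended DisMax query parser
    "qf": "title^3 body",        # fields to search, with boosts
    "fq": "status:published",    # filter query, same for every search
}


def solr_query_string(user_text):
    """Only the q parameter varies between requests."""
    params = dict(SOLR_DEFAULTS, q=user_text)
    return urlencode(params)


def es_query_body(user_text):
    """The equivalent Elasticsearch query DSL body, built per request."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": user_text,
                        "fields": ["title^3", "body"],
                    }
                },
                "filter": {"term": {"status": "published"}},
            }
        }
    }


qs = solr_query_string("travel policy")
body = es_query_body("travel policy")
```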

For product or ecommerce search, Solr has richer features.

I would not base a choice solely on LTR. 90% of sites don’t need that and you’ll need a lot of production data (queries and clicks) before you can start using it.

Finally, ES has had some pretty embarrassing issues with clustering. You might find it works great with a single host, then throws away updates when clustered in prod. Getting distributed fault tolerance right is very, very hard, which is why Solr uses Zookeeper. 

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Dc Tech <dc...@gmail.com>.
Thank you Jan and Charlie. 

I should say that in terms of posting to the community regarding Elastic vs Solr - this is probably the most civil and helpful community that I have been a part of - and your answers have only reinforced that  notion !!

Thank you for your responses. I am glad to hear that both can do most of it, which was my gut feeling as well. 

Charlie, to your point - the team probably feels that Elastic  is easier to get started with hence the preference, as well as the hosting options (with the caveats you noted). Agree with you completely that tech is not the real issue. 

Jan,  agree with  the points you made on team skills.  On our previous proprietary engine - that was in fact the biggest issue - the engine was powerful enough and had good references.  However, we were not able to exploit it to good effect.  

Thank you again. 

Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

Choosing the Solr community mailing list to ask for advice on whether to choose ES - you already know what to expect, no?
More often than not the choice comes down to policy, standardization, what skills you have in the house etc. rather than ticking off feature checkboxes.
Sometimes company values may also drive a choice, e.g. Solr is 100% Apache and not open core, which may matter if you plan to get involved in the community and contribute features or patches.

However, if I were in your shoes as an architect evaluating the tech stack, and there was no clear choice based on the above, I’d do what projects normally do: ask what you really need from the engine. Maybe you have some features in your requirement list that make one a much better choice than the other. Or maybe after that exercise you are still wondering what to choose, in which case you just follow your gut feeling and make a choice :)

Jan


Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Charlie Hull <ch...@flax.co.uk>.
On 15/01/2020 04:02, Dc Tech wrote:
> I am a SOLR fan and implemented it at our company over 10 years ago.
> I moved away from that role, and in the meantime the new search team
> implemented a proprietary (and expensive) NoSQL-style search engine. That
> project did not go well, and now I am back on the project and reviewing the
> technology stack.
>
> Some of the team think that ElasticSearch could be a good option,
> especially since we can easily get hosted versions with AWS where we have
> all the contractual stuff sorted out.
You can, but you should be aware that:
1. Amazon's hosted Elasticsearch isn't great, often lags behind the 
current version, doesn't allow plugins etc.
2. Amazon and Elastic are currently engaged in legal battles over who
is the most open sourcey, who allegedly copied code that was 'open' but
commercially licensed, who would like to capture the hosted search
market...not sure how this will pan out (Google for details)
3. You can also buy fully hosted Solr from several places.
> While SOLR definitely seems more advanced (LTR, streaming expressions,
> graph, and all the knobs and dials for relevancy tuning), Elastic may be
> sufficient for our needs. It does not seem to have LTR out of the box but
> the relevancy tuning knobs and dials seem to be similar to what SOLR has.
Yes, they're basically the same under the hood (unsurprising as they're 
both based on Lucene). If you need LTR there's an ES plugin for that 
(disclaimer, my new employer built and maintains it: 
https://github.com/o19s/elasticsearch-learning-to-rank). I've lost track 
of the number of times I've been asked 'Elasticsearch or Solr, which
should I choose?' and my current thoughts are:

1. Don't switch from one to the other for the sake of it.  Switching 
search engines rarely addresses underlying issues (content quality, team 
skills, relevance tuning methodology)
2. Elasticsearch is easier to get started with, but at some point you'll 
need to learn how it all works
3. Solr is harder to get started with, but you'll know more about how it 
all works earlier
4. Both can be used for most search projects, most features are the 
same, both can scale.
5. Lots of Elasticsearch projects (and developers) are focused on logs, 
which is often not really a 'search' project.

>
> The corpus size is not a challenge - we have about one million documents,
> of which about half have full text, while the rest are simpler (e.g. company
> directory etc.).
> The query volumes are also quite low (max 5/second at peak).
> We have implemented the content ingestion and processing pipelines already
> in python and SPARK, so most of the data will be pushed in using APIs.
>
> I would really appreciate any guidance from the community !!
>
Sounds like a pretty small setup to be honest, but as ever the devil is 
in the details.

Cheers

Charlie

-- 
Charlie Hull
Flax - Open Source Enterprise Search (now part of OpenSourceConnections)

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com


Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Nicolas Paris <ni...@riseup.net>.
> We have implemented the content ingestion and processing pipelines already
> in python and SPARK, so most of the data will be pushed in using APIs.

I use the spark-solr library in production and have looked at the ES
equivalent; the Solr connector looks much more advanced for both
loading and fetching data. In particular, the fetching part uses the Solr
export handler, which makes things incredibly fast. Also, spark-solr uses
the DataFrame API while ES looks to be stuck with the RDD API, AFAIK.

A good Spark connector opens up a lot of possibilities in terms of data
transformation and advanced machine learning features within the search
engine.
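
For reference, a minimal sketch of reading a collection through spark-solr's DataFrame source from PySpark. It assumes the spark-solr jar is on the Spark classpath. The ZooKeeper address, collection name and field names are placeholders, and the option names are the ones I recall from the spark-solr README, so check them against the version you deploy.

```python
def solr_read_options(zkhost, collection, query="*:*", fields=None, use_export=True):
    """Build the option map passed to the spark-solr data source."""
    opts = {"zkhost": zkhost, "collection": collection, "query": query}
    if fields:
        # Comma-separated list of fields to return.
        opts["fields"] = ",".join(fields)
    if use_export:
        # Stream results through Solr's /export handler; the returned
        # fields need docValues enabled for this to work.
        opts["request_handler"] = "/export"
    return opts


def read_solr(spark, **kwargs):
    """Return a DataFrame backed by a Solr collection.

    `spark` is an existing SparkSession; it is passed in so that this
    sketch does not import pyspark itself.
    """
    return spark.read.format("solr").options(**solr_read_options(**kwargs)).load()


# Placeholder connection details, for illustration only.
opts = solr_read_options("zk1:2181/solr", "intranet", fields=["id", "title"])
```

With a live session this would be called as read_solr(spark, zkhost="zk1:2181/solr", collection="intranet", fields=["id", "title"]).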

-- 
nicolas

Re: Coming back to search after some time... SOLR or Elastic for text search?

Posted by Emir Arnautović <em...@sematext.com>.
Hi Jan,
Here is a blog post related to this topic: https://sematext.com/blog/solr-vs-elasticsearch-differences/
It also contains links to other resources that might help you make a decision.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


