
Posted to dev@lucene.apache.org by David Smiley <da...@gmail.com> on 2017/10/27 21:32:22 UTC

Solr DistributedUpdateProcessor complexity

DistributedUpdateProcessor (or what I call DURP for short) is very
complex.  One aspect of the complexity is that it appears it tries to
support SolrCloud and classic Solr.  Do we still need it to support classic
Solr?  When/why?  Forever?

If it needs to continue to operate in both modes, perhaps it could be
refactored into a base class and ZooKeeper subclass?  It's a code smell to
see the current code with "if (zkEnabled)" all over the place.
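A minimal sketch of the base-class/subclass split suggested above, in Java. Every name here (BaseUpdateSketch, ZkUpdateSketch, afterLocalAdd) is hypothetical and the string "steps" are placeholders; this is not Solr's actual API or update flow, only an illustration of moving mode-specific behavior behind an overridable hook instead of "if (zkEnabled)" checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Shared logic for both modes (hypothetical stand-in for the processor). */
class BaseUpdateSketch {
    /** Template method: the common flow, with no mode checks in it. */
    final List<String> processAdd(String docId) {
        List<String> steps = new ArrayList<>();
        steps.add("version:" + docId);       // placeholder for work done in every mode
        steps.add("applyLocally:" + docId);  // placeholder for the local index update
        steps.addAll(afterLocalAdd(docId));  // mode-specific hook
        return steps;
    }

    /** Hook: classic (non-cloud) mode has nothing to forward. */
    protected List<String> afterLocalAdd(String docId) {
        return new ArrayList<>();
    }
}

/** Cloud-only behavior lives in the subclass instead of in if-blocks. */
class ZkUpdateSketch extends BaseUpdateSketch {
    private final List<String> replicas;

    ZkUpdateSketch(List<String> replicas) {
        this.replicas = replicas;
    }

    @Override
    protected List<String> afterLocalAdd(String docId) {
        List<String> steps = new ArrayList<>();
        for (String replica : replicas) {
            steps.add("forward:" + docId + "->" + replica);
        }
        return steps;
    }
}

class RefactorSketchDemo {
    public static void main(String[] args) {
        System.out.println(new BaseUpdateSketch().processAdd("doc1"));
        System.out.println(
            new ZkUpdateSketch(Arrays.asList("replicaA", "replicaB")).processAdd("doc1"));
    }
}
```

Whichever factory builds the processor would pick the class once, so the mode decision is made in one place rather than at every branch.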

Any other ideas on making this code more maintainable?

~ David
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com

Re: Solr DistributedUpdateProcessor complexity

Posted by David Smiley <da...@gmail.com>.

Hoss: Thanks very much for your insight!  Hmmm; this seems tractable.  It
could probably be addressed with a "urp phase" enum of sorts... and then
splitting the DURP into some pieces that honor certain phases.  I'm just
thinking out loud here... I don't think I have time to undertake a
refactoring of this magnitude anyway.
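One way the "urp phase" enum idea might look, sketched in Java. The enum, the step names, and the assignment of steps to phases are all assumptions made for illustration (loosely following the point that atomic updates must be applied exactly once, on the leader in cloud mode); none of this exists in Solr under these names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical phases; Solr defines no such enum under this name. */
enum UrpPhase { STANDALONE, LEADER, REPLICA }

/** One piece of the split-up processor, tagged with the phases it honors. */
class PhasedStep {
    final String name;
    final Set<UrpPhase> phases;

    PhasedStep(String name, Set<UrpPhase> phases) {
        this.name = name;
        this.phases = phases;
    }
}

class PhasedChainSketch {
    // Assumed split: atomic updates are applied once (standalone or leader),
    // every node indexes locally, and only a leader forwards to replicas.
    static final List<PhasedStep> STEPS = Arrays.asList(
        new PhasedStep("applyAtomicUpdate", EnumSet.of(UrpPhase.STANDALONE, UrpPhase.LEADER)),
        new PhasedStep("indexLocally", EnumSet.allOf(UrpPhase.class)),
        new PhasedStep("forwardToReplicas", EnumSet.of(UrpPhase.LEADER)));

    /** Returns the names of the steps that run in the given phase, in order. */
    static List<String> stepsFor(UrpPhase phase) {
        List<String> names = new ArrayList<>();
        for (PhasedStep step : STEPS) {
            if (step.phases.contains(phase)) {
                names.add(step.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (UrpPhase phase : UrpPhase.values()) {
            System.out.println(phase + " -> " + stepsFor(phase));
        }
    }
}
```

Each piece declares where it applies, so the chain can be filtered once per request instead of each piece re-checking the mode internally.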

Do you know if DURP has functionality that can be used without SolrCloud
other than the atomic updates feature?

On Sun, Oct 29, 2017 at 4:53 PM Chris Hostetter <ho...@fucit.org>
wrote:

>
> : DistributedUpdateProcessor (or what I call DURP for short) is very
> : complex.  One aspect of the complexity is that it appears it tries to
> : support SolrCloud and classic Solr.  Do we still need it to support
> : classic Solr?  When/why?  Forever?
>
> Part of the issue here is that DUP is responsible for applying atomic
> update operations regardless of whether there is any actual "distributed"
> updating happening -- because it has to be in cloud mode in order to
> ensure the atomic updates happen exactly once on the leader.
>
> That's the fundamental reason the UpdateProcessorChain can't just "optimize
> away" the need for DUP in non-cloud use cases and let the DUP
> internals just assert "zkEnabled" in all cases.
>
> As far as the broader question of: "Do we still need it to support classic
> Solr?  When/why?  Forever?" ... that feels like a much more significant
> question that warrants its own thread/jira with a very clear subject
> since it's a much bigger/broader topic of conversation (with a much
> greater impact on user experience) than just a discussion of the
> (internal) complexity involved in DistributedUpdateProcessor.
>
>
> -Hoss
> http://www.lucidworks.com/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>

Re: Solr DistributedUpdateProcessor complexity

Posted by Chris Hostetter <ho...@fucit.org>.

: DistributedUpdateProcessor (or what I call DURP for short) is very
: complex.  One aspect of the complexity is that it appears it tries to
: support SolrCloud and classic Solr.  Do we still need it to support classic
: Solr?  When/why?  Forever?

Part of the issue here is that DUP is responsible for applying atomic
update operations regardless of whether there is any actual "distributed"
updating happening -- because it has to be in cloud mode in order to
ensure the atomic updates happen exactly once on the leader.

That's the fundamental reason the UpdateProcessorChain can't just "optimize
away" the need for DUP in non-cloud use cases and let the DUP
internals just assert "zkEnabled" in all cases.

As far as the broader question of: "Do we still need it to support classic
Solr?  When/why?  Forever?" ... that feels like a much more significant
question that warrants its own thread/jira with a very clear subject
since it's a much bigger/broader topic of conversation (with a much
greater impact on user experience) than just a discussion of the
(internal) complexity involved in DistributedUpdateProcessor.


-Hoss
http://www.lucidworks.com/

Re: Solr DistributedUpdateProcessor complexity

Posted by Eric Pugh <ep...@opensourceconnections.com>.

I wish that we only had one Solr mode as well.  We now have features that only work in SolrCloud mode, like the SQL handler.  For a very small 1000-document setup, I recently deployed a single-node, single-shard, no-replica SolrCloud using embedded ZK, just to take advantage of the SQL handler!


> On Oct 27, 2017, at 5:32 PM, David Smiley <da...@gmail.com> wrote:
> 
> DistributedUpdateProcessor (or what I call DURP for short) is very complex.  One aspect of the complexity is that it appears it tries to support SolrCloud and classic Solr.  Do we still need it to support classic Solr?  When/why?  Forever?
> 
> If it needs to continue to operate in both modes, perhaps it could be refactored into a base class and ZooKeeper subclass?  It's a code smell to see the current code with "if (zkEnabled)" all over the place.
> 
> Any other ideas on making this code more maintainable?
> 
> ~ David
> -- 
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com