Posted to dev@lucene.apache.org by Jan Høydahl <ja...@cominvent.com> on 2016/04/04 20:57:08 UTC

Re: Splitting Solr artifacts so the main download is smaller

A difference from ES is that they have a working plugin ecosystem, so
you can tell users to run "bin/plugin install kuromoji" or whatever.
Could we not continue working on SOLR-5103, so that the size issue
solves itself in a much more elegant way?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 19 Mar 2016, at 19:15, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> I'd like to see some motion on this, which probably means I need to do
> it myself.  I'd like to know who I can talk to about the build/packaging
> system so I can find what needs to change, and especially so I don't
> break it.
> 
> There's already a jira issue -- SOLR-6806, with some related bits in
> SOLR-5103.
> 
> The Solr download for 5.5.0 is 130 or 138 megabytes, depending on what
> OS you're going to install it on.  For the rest of this email, let's
> focus on the .zip version (138MB), since my client is Windows and I'd
> like to compare apples to apples.
> 
> We have a .zip download size of 138MB, which thankfully is down in size
> since we completely dropped the war file.  That *other* search engine
> based on Lucene has a .zip download size of 28MB.
> 
> I started fiddling with the download archive on my Windows machine,
> pulling out obvious pieces at the root of the extracted archive, and
> managed to get the .zip size down to 40MB.
> 
> If I dig further and remove the lucene-analyzers-kuromoji jar (over 4MB)
> and the hadoop jars (10MB), which the majority of Solr's users will
> *never* need, Solr 5.5's .zip file drops to 25MB.
> 
> I'm not suggesting that we just remove these pieces.  We would need to
> have a main artifact and several supporting artifacts.  The total size
> would be virtually the same, so the concerns in LUCENE-5589 and
> LUCENE-6247 will not get worse.  They also won't get better.
> 
> There's plenty of opportunity for bikeshedding here, but that should be
> done in Jira.  For this email, I'd like to know if anyone has strong
> opposition to this, and if not, who would be willing to provide guidance
> for how to do it right.
> 
> Thanks,
> Shawn
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 




Re: Splitting Solr artifacts so the main download is smaller

Posted by Steve Davids <sd...@gmail.com>.
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.


Even though you didn't want to get into it here, I did create a Solr
RPM/DEB builder here: https://github.com/sdavids13/solr-os-packager

It sure would be pretty sweet to get an official RPM distribution. I think
that would make a lot of admins' lives easier (primarily for upgrades).

-Steve


On Mon, Apr 4, 2016 at 6:56 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 4/4/2016 12:57 PM, Jan Høydahl wrote:
> > A difference from ES is that they have a working plugin ecosystem, so
> > you can tell users to run "bin/plugin install kuromoji" or whatever.
> > Could we not continue working on SOLR-5103, so that the size issue
> > solves itself in a much more elegant way?
>
> Sure.  I love the idea of a plugin system that can reach out and install
> functionality from the Internet.  Would that need something new from
> Infra?  That's something we can hammer out on the Jira issue.
>
> I think step one for SOLR-5103 is to split the artifacts in a manner
> similar to what I outlined in SOLR-6806.  Then the other artifacts can
> be further diced up into small pieces that can be handled by a plugin
> system.  We don't necessarily need to do these separately, though --
> SOLR-5103 could absorb and replace SOLR-6806.
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.
>
> Thanks,
> Shawn
>

Re: Splitting Solr artifacts so the main download is smaller

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/4/2016 12:57 PM, Jan Høydahl wrote:
> A difference from ES is that they have a working plugin ecosystem, so
> you can tell users to run "bin/plugin install kuromoji" or whatever.
> Could we not continue working on SOLR-5103, so that the size issue
> solves itself in a much more elegant way?

Sure.  I love the idea of a plugin system that can reach out and install
functionality from the Internet.  Would that need something new from
Infra?  That's something we can hammer out on the Jira issue.

I think step one for SOLR-5103 is to split the artifacts in a manner
similar to what I outlined in SOLR-6806.  Then the other artifacts can
be further diced up into small pieces that can be handled by a plugin
system.  We don't necessarily need to do these separately, though --
SOLR-5103 could absorb and replace SOLR-6806.

A tangent to think about later: RPM and DEB packaging.  That's a lot to
discuss, so I won't go into it here.

Thanks,
Shawn

