Posted to dev@lucene.apache.org by Shawn Heisey <ap...@elyograg.org> on 2016/03/19 19:15:04 UTC

Splitting Solr artifacts so the main download is smaller

I'd like to see some motion on this, which probably means I need to do
it myself.  I'd like to know who I can talk to about the build/packaging
system so I can find what needs to change, and especially so I don't
break it.

There's already a Jira issue -- SOLR-6806, with some related bits in
SOLR-5103.

The Solr download for 5.5.0 is 130 or 138 megabytes, depending on what
OS you're going to install it on.  For the rest of this email, let's
focus on the .zip version (138MB), since my client is Windows and I'd
like to compare apples to apples.

We have a .zip download size of 138MB, which thankfully is down in size
since we completely dropped the war file.  That *other* search engine
based on Lucene has a .zip download size of 28MB.

I started fiddling with the download archive on my Windows machine,
pulling out obvious pieces at the root of the extracted archive, and
managed to get the .zip size down to 40MB.

If I dig further and remove the lucene-analyzers-kuromoji jar (over 4MB)
and the hadoop jars (10MB), which the majority of Solr's users will
*never* need, Solr 5.5's .zip file drops to 25MB.
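
For anyone who wants to take a similar measurement on their own copy of
the archive, here is a minimal Python sketch, offered only as an
illustration and not as the procedure used for the numbers above. The
function name and the depth parameter are hypothetical, and it assumes
the archive unpacks into a single top-level directory (something like
solr-5.5.0/). It sums the compressed size of the entries under each
directory so the largest contributors stand out.

```python
import io
import zipfile
from collections import Counter


def compressed_size_by_component(archive, depth=2):
    """Sum compressed bytes per directory prefix of `depth` path components.

    `archive` may be a filename or a seekable file-like object.
    Hypothetical helper, written for illustration only.
    """
    totals = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            parts = info.filename.split("/")
            if len(parts) > depth:
                # Deep entries are grouped under their first `depth` components.
                key = "/".join(parts[:depth])
            else:
                # Shallow entries are grouped under their parent directory,
                # or "." when they sit at the archive root.
                key = "/".join(parts[:-1]) or "."
            totals[key] += info.compress_size
    return totals
```

Calling totals.most_common(10) on the result lists the ten heaviest
directories first, which gives a rough idea of what could move into a
supporting artifact.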

I'm not suggesting that we just remove these pieces.  We would need to
have a main artifact and several supporting artifacts.  The total size
would be virtually the same, so the concerns in LUCENE-5589 and
LUCENE-6247 will not get worse.  They also won't get better.

There's plenty of opportunity for bikeshedding here, but that should be
done in Jira.  For this email, I'd like to know if anyone has strong
opposition to this, and if not, who would be willing to provide guidance
for how to do it right.

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Splitting Solr artifacts so the main download is smaller

Posted by Steve Davids <sd...@gmail.com>.
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.


Even though you didn't want to get into it here, I did create a Solr
RPM/DEB builder: https://github.com/sdavids13/solr-os-packager

It sure would be pretty sweet to get an official RPM distribution; I think
that would make a lot of admins' lives easier (primarily for upgrades).

-Steve


On Mon, Apr 4, 2016 at 6:56 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 4/4/2016 12:57 PM, Jan Høydahl wrote:
> > A difference from ES is that they have a working plugin ecosystem, so
> > you can tell users to run "bin/plugin install kuromoji" or whatever.
> > Could we not continue working on SOLR-5103, and the size issue will
> > solve itself in a much more elegant way...
>
> Sure.  I love the idea of a plugin system that can reach out and install
> functionality from the Internet.  Would that need something new from
> Infra?  That's something we can hammer out on the Jira issue.
>
> I think step one for SOLR-5103 is to split the artifacts in a manner
> similar to what I outlined in SOLR-6806.  Then the other artifacts can
> be further diced up into small pieces that can be handled by a plugin
> system.  We don't necessarily need to do these separately, though --
> SOLR-5103 could absorb and replace SOLR-6806.
>
> A tangent to think about later: RPM and DEB packaging.  That's a lot to
> discuss, so I won't go into it here.
>
> Thanks,
> Shawn

Re: Splitting Solr artifacts so the main download is smaller

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/4/2016 12:57 PM, Jan Høydahl wrote:
> A difference from ES is that they have a working plugin ecosystem, so
> you can tell users to run "bin/plugin install kuromoji" or whatever.
> Could we not continue working on SOLR-5103, and the size issue will
> solve itself in a much more elegant way...

Sure.  I love the idea of a plugin system that can reach out and install
functionality from the Internet.  Would that need something new from
Infra?  That's something we can hammer out on the Jira issue.

I think step one for SOLR-5103 is to split the artifacts in a manner
similar to what I outlined in SOLR-6806.  Then the other artifacts can
be further diced up into small pieces that can be handled by a plugin
system.  We don't necessarily need to do these separately, though --
SOLR-5103 could absorb and replace SOLR-6806.

A tangent to think about later: RPM and DEB packaging.  That's a lot to
discuss, so I won't go into it here.

Thanks,
Shawn


Re: Splitting Solr artifacts so the main download is smaller

Posted by Jan Høydahl <ja...@cominvent.com>.
A difference from ES is that they have a working plugin ecosystem, so
you can tell users to run "bin/plugin install kuromoji" or whatever.
Could we not continue working on SOLR-5103, and the size issue will
solve itself in a much more elegant way...

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com





Re: Splitting Solr artifacts so the main download is smaller

Posted by David Smiley <da...@gmail.com>.
Apparently there are no strong opinions against this, so go for it!

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com