Posted to solr-user@lucene.apache.org by Demian Katz <de...@villanova.edu> on 2016/08/01 15:04:04 UTC

Installing Solr with Ivy

As a follow-up to last week's thread about loading Solr via dependency manager, I started experimenting with using Ivy to install Solr. Here's what I have (note that I'm trying to install Solr 5.5.0 as an arbitrary example, but that detail should not be important):

ivy.xml:

<ivy-module version="2.0">
    <info organisation="org.vufind" module="vufind"/>
    <dependencies>
        <dependency org="org.apache.solr" name="solr-parent" rev="5.5.0" />
    </dependencies>
</ivy-module>

build.xml:

<project xmlns:ivy="antlib:org.apache.ivy.ant" name="vufind" default="resolve">
    <target name="resolve" description="--> retrieve dependencies with ivy">
        <ivy:retrieve />
    </target>
</project>

My hope, based on a quick read of some Ivy tutorials, was that simply running "ant" with the above configs would give me a copy of Solr in my lib directory. When I use example libraries from the tutorials in my ivy.xml, I do indeed get files installed... but when I try to substitute the Solr package, no files are installed ("0 artifacts copied"). I'm not very experienced with any of these tools or repositories, so I'm not sure where I'm going wrong.

- Do I need to add some extra configuration somewhere to tell Ivy to download the constituent parts of the solr-parent package?
- Is the solr-parent package the wrong thing to be using? (I tried replacing solr-parent with solr-core and ended up with many .jar files in my lib directory, which was better than nothing, but the .jar files were not organized into a directory structure and were not accompanied by any of the non-.jar files like shell scripts that make Solr tick).
- Am I just completely on the wrong track? (I do realize that there may not be a way to pull a fully-functional Solr out of the core Maven repository... but it seemed worth a try!)

Any suggestions would be greatly appreciated!

thanks,
Demian

RE: Installing Solr with Ivy

Posted by "Davis, Daniel (NIH/NLM) [C]" <da...@nih.gov>.

I think the free versions of either Artifactory or Sonatype Nexus could serve as this cache in a very effective, cloud-ready way. This way, you would not be dependent on shared directories. You would just need some task to pull down Solr and its checksums and publish them into the repository.

I've done PHP, but I shudder. I know you do a lot - we've looked at VuFind for our discovery layer here at NLM.

-----Original Message-----
From: Demian Katz [mailto:demian.katz@villanova.edu] 
Sent: Wednesday, August 03, 2016 9:31 AM
To: solr-user@lucene.apache.org
Subject: RE: Installing Solr with Ivy

Dan,

In case you, or anyone else, is interested, let me share my current solution-in-progress:

https://github.com/vufind-org/vufind/pull/769

I've written a Phing task for my project (Phing is the PHP equivalent of Ant) which takes some loose inspiration from your Ant download task. The task uses a local directory to cache Solr distributions and only hits Apache servers if the cache lacks the requested version. This cache can be retained on my continuous integration and development servers, so I think this should get me the effect I desire without putting an unreasonable amount of load on the archive servers. I'd still love in theory to find a solution that's a little more future-proof than "build a URL and download from it," but for now, I think this will get me through.

Thanks again!

- Demian

-----Original Message-----
From: Davis, Daniel (NIH/NLM) [C] [mailto:daniel.davis@nih.gov] 
Sent: Tuesday, August 02, 2016 11:33 AM
To: solr-user@lucene.apache.org
Subject: RE: Installing Solr with Ivy

Demian,

I've long meant to upload my own "automated installation" - it is Ant without Ivy, but with checksums. I suppose GPG signatures could also be worked in.
It is only semi-automated, because our DevOps group does not have root, but here is a clean version - https://github.com/danizen/solr-ant-install

System administrators prepare the environment:
- create a directory for Solr (/opt/solr) and logs (/var/logs/solr), maybe a different volume for Solr data
- create an administrative user with a shell (owns the code)
- create an operational user who runs Solr (no shell, cannot modify the code)
- install the init scripts
- set up sudoers rules

The installation this supports is very, very small, and I do not intend to support the cleaned version of this going forward. I will update the README.md to make that clear.

I agree with your summary of the difference. One more aspect of maturity/fullness of solution - MySQL/PostgreSQL etc. support multiple projects on the same server, at least administratively. Solr is getting there, but until role-based access control (RBAC) is strong enough out-of-the-box, it is hard to set up a *shared* Solr server. Yet it is very common to do that with database servers, and in fact doing this is a common way to avoid siloed applications. Unfortunately, HTTP auth is not quite good enough for me; but it is only my own fault I haven't contributed something more.

Dan Davis, Systems/Applications Architect (Contractor), Office of Computer and Communications Systems, National Library of Medicine, NIH

-----Original Message-----
From: Demian Katz [mailto:demian.katz@villanova.edu]
Sent: Tuesday, August 02, 2016 8:37 AM
To: solr-user@lucene.apache.org
Subject: RE: Installing Solr with Ivy

Thanks, Shawn, for confirming my suspicions.

Regarding your question about how Solr differs from a database server, I agree with you in theory, but the problem is in the practice: there are very easy, familiar, well-established techniques for installing and maintaining database platforms, and these platforms are mature enough that they evolve slowly and most versions are closely functionally equivalent to one another. Solr is comparatively young (not immature, but young).

Solr still (as far as I can tell) lacks standard package support in the default repos of the major Linux distros, and frequently breaks backward compatibility between versions in large and small ways (particularly in the internal API, but sometimes also in the configuration files). Those are not intended as criticisms of Solr -- they're to a large extent positive signs of activity and growth -- but they are, as far as I can tell, the current realities of working with the software.

For a developer with the right experience and knowledge, it's no big deal to navigate these challenges. However, my package is designed to be friendly to a less experienced, more generalized non-technical audience, and bundling Solr in the package instead of trying to guide the user through a potentially confusing manual installation process greatly simplifies the task of getting things up and running, saving me from having to field support emails from people who can't figure out how to install Solr on their platform, or those who end up with a version that's incompatible with my project's configurations and custom handlers.

At this point, my main goal is to revise the bundling process so that instead of storing Solr in Git, I can install it on-demand with a simple automated process during continuous integration builds and packaging for release. In the longer term, if the environmental factors change, I'd certainly prefer to stop bundling it entirely... but I don't think that is practical for my audience at this stage.

In any case, sorry for the long-winded reply, but hopefully that helps clarify my situation.

- Demian

-----Original Message-----

[...snip...]

In a theoretical situation where your program talked to an SQL database, would you include a database server in your project? How much time would you invest in automating the download and install of MySQL, Postgres, or some other database? I think what you would do in that situation is include client code to talk to the database and expect the user to provide the server and prepare it for your program. In this respect, how is a Solr server any different than a database server?

Thanks,
Shawn

RE: Installing Solr with Ivy

Posted by Demian Katz <de...@villanova.edu>.

Dan,

In case you, or anyone else, is interested, let me share my current solution-in-progress:

https://github.com/vufind-org/vufind/pull/769

I've written a Phing task for my project (Phing is the PHP equivalent of Ant) which takes some loose inspiration from your Ant download task. The task uses a local directory to cache Solr distributions and only hits Apache servers if the cache lacks the requested version. This cache can be retained on my continuous integration and development servers, so I think this should get me the effect I desire without putting an unreasonable amount of load on the archive servers. I'd still love in theory to find a solution that's a little more future-proof than "build a URL and download from it," but for now, I think this will get me through.
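
For readers who would rather stay in Ant, the cache-then-download idea described above can be sketched roughly as follows. This is not the Phing task from the pull request; the property names, the cache location, the build directory and the archive.apache.org URL layout are all assumptions made for illustration.

```xml
<project name="solr-cache-demo" default="install-solr">
    <!-- Every property value here is an assumption for this sketch -->
    <property name="solr.version" value="5.5.0"/>
    <property name="solr.cache.dir" location="${user.home}/.solr-cache"/>
    <property name="solr.archive" value="solr-${solr.version}.tgz"/>
    <property name="solr.url"
              value="https://archive.apache.org/dist/lucene/solr/${solr.version}/${solr.archive}"/>

    <target name="download-solr">
        <mkdir dir="${solr.cache.dir}"/>
        <!-- skipexisting leaves an already cached file alone, so the
             Apache servers are only contacted on a cache miss -->
        <get src="${solr.url}"
             dest="${solr.cache.dir}/${solr.archive}"
             skipexisting="true"/>
    </target>

    <target name="install-solr" depends="download-solr">
        <mkdir dir="build"/>
        <!-- Unpack the cached archive into the working copy -->
        <untar src="${solr.cache.dir}/${solr.archive}"
               dest="build"
               compression="gzip"/>
    </target>
</project>
```

With skipexisting="true", a second run with the same solr.version should find the archive in the cache and skip the network request. The "build a URL and download from it" fragility remains: if the URL layout changes, solr.url has to change with it.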

Thanks again!

- Demian

RE: Installing Solr with Ivy

Posted by Demian Katz <de...@villanova.edu>.

Dan,

Thanks for taking the time to share this! I'll give it a test run in the near future and will happily share improvements if I come up with any (though I'll most likely be focusing on the download steps rather than the subsequent configuration).

- Demian

RE: Installing Solr with Ivy

Posted by "Davis, Daniel (NIH/NLM) [C]" <da...@nih.gov>.

Demian,

I've long meant to upload my own "automated installation" - it is Ant without Ivy, but with checksums. I suppose GPG signatures could also be worked in.
It is only semi-automated, because our DevOps group does not have root, but here is a clean version - https://github.com/danizen/solr-ant-install
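
As an illustration of the checksum step (a sketch only, not copied from the repository above), Ant's built-in checksum task can compare a downloaded archive against a published digest. The file name, the choice of SHA-1 and the placeholder digest value are assumptions.

```xml
<project name="solr-checksum-demo" default="verify-solr">
    <!-- Hypothetical archive location for this sketch -->
    <property name="solr.archive" location="solr-5.5.0.tgz"/>
    <!-- Placeholder: paste the digest published alongside the release -->
    <property name="solr.sha1" value="REPLACE_WITH_PUBLISHED_SHA1"/>

    <target name="verify-solr">
        <!-- With verifyProperty set, the task compares the computed digest
             to the value given in "property" and stores true or false -->
        <checksum file="${solr.archive}"
                  algorithm="SHA-1"
                  property="${solr.sha1}"
                  verifyProperty="solr.checksum.ok"/>
        <fail message="Checksum mismatch for ${solr.archive}">
            <condition>
                <isfalse value="${solr.checksum.ok}"/>
            </condition>
        </fail>
    </target>
</project>
```

The build stops at the fail task when the digests differ, so a corrupted or tampered download would not be unpacked.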

System administrators prepare the environment:
- create a directory for Solr (/opt/solr) and logs (/var/logs/solr), maybe a different volume for Solr data
- create an administrative user with a shell (owns the code)
- create an operational user who runs Solr (no shell, cannot modify the code)
- install the init scripts
- set up sudoers rules

The installation this supports is very, very small, and I do not intend to support the cleaned version of this going forward. I will update the README.md to make that clear.

I agree with your summary of the difference. One more aspect of maturity/fullness of solution - MySQL/PostgreSQL etc. support multiple projects on the same server, at least administratively. Solr is getting there, but until role-based access control (RBAC) is strong enough out-of-the-box, it is hard to set up a *shared* Solr server. Yet it is very common to do that with database servers, and in fact doing this is a common way to avoid siloed applications. Unfortunately, HTTP auth is not quite good enough for me; but it is only my own fault I haven't contributed something more.

Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH

RE: Installing Solr with Ivy

Posted by Demian Katz <de...@villanova.edu>.

Thanks, Shawn, for confirming my suspicions.

Regarding your question about how Solr differs from a database server, I agree with you in theory, but the problem is in the practice: there are very easy, familiar, well-established techniques for installing and maintaining database platforms, and these platforms are mature enough that they evolve slowly and most versions are closely functionally equivalent to one another. Solr is comparatively young (not immature, but young).

Solr still (as far as I can tell) lacks standard package support in the default repos of the major Linux distros, and frequently breaks backward compatibility between versions in large and small ways (particularly in the internal API, but sometimes also in the configuration files). Those are not intended as criticisms of Solr -- they're to a large extent positive signs of activity and growth -- but they are, as far as I can tell, the current realities of working with the software.

For a developer with the right experience and knowledge, it's no big deal to navigate these challenges. However, my package is designed to be friendly to a less experienced, more generalized non-technical audience, and bundling Solr in the package instead of trying to guide the user through a potentially confusing manual installation process greatly simplifies the task of getting things up and running, saving me from having to field support emails from people who can't figure out how to install Solr on their platform, or those who end up with a version that's incompatible with my project's configurations and custom handlers.

At this point, my main goal is to revise the bundling process so that instead of storing Solr in Git, I can install it on-demand with a simple automated process during continuous integration builds and packaging for release. In the longer term, if the environmental factors change, I'd certainly prefer to stop bundling it entirely... but I don't think that is practical for my audience at this stage.

In any case, sorry for the long-winded reply, but hopefully that helps clarify my situation.

- Demian

Re: Installing Solr with Ivy

Posted by Shawn Heisey <ap...@elyograg.org>.

On 8/1/2016 9:04 AM, Demian Katz wrote:
> As a follow-up to last week's thread about loading Solr via dependency manager, I started experimenting with using Ivy to install Solr. Here's what I have (note that I'm trying to install Solr 5.5.0 as an arbitrary example, but that detail should not be important):
<snip>
> My hope, based on a quick read of some Ivy tutorials, was that simply running "ant" with the above configs would give me a copy of Solr in my lib directory. When I use example libraries from the tutorials in my ivy.xml, I do indeed get files installed... but when I try to substitute the Solr package, no files are installed ("0 artifacts copied"). I'm not very experienced with any of these tools or repositories, so I'm not sure where I'm going wrong.
>
> - Do I need to add some extra configuration somewhere to tell Ivy to download the constituent parts of the solr-parent package?
> - Is the solr-parent package the wrong thing to be using? (I tried replacing solr-parent with solr-core and ended up with many .jar files in my lib directory, which was better than nothing, but the .jar files were not organized into a directory structure and were not accompanied by any of the non-.jar files like shell scripts that make Solr tick).
> - Am I just completely on the wrong track? (I do realize that there may not be a way to pull a fully-functional Solr out of the core Maven repository... but it seemed worth a try!)

The general use for ivy is to download development libraries as part of
the build process.  Downloading applications might be possible, but it's
a little outside what it was designed to do.

Looking into what's in solr-parent in maven central, it appears that the
only thing this contains is a Maven POM (an XML file) -- no binary
artifacts at all.  I doubt that's useful.  As you already noticed,
solr-core just gives you lots of jars that would let you embed a Solr
server into your own code -- it's not a full application.
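
To illustrate the solr-core case: a minimal sketch of a build.xml that, together with an ivy.xml whose dependency name is changed from solr-parent to solr-core, copies the resolved jars into one directory. The lib/solr-jars path is an arbitrary choice for this example.

```xml
<project xmlns:ivy="antlib:org.apache.ivy.ant" name="vufind" default="resolve">
    <target name="resolve" description="retrieve Solr jars with Ivy">
        <!-- [artifact], [revision] and [ext] are standard Ivy pattern tokens;
             the lib/solr-jars directory is an assumption for this sketch -->
        <ivy:retrieve pattern="lib/solr-jars/[artifact]-[revision].[ext]"/>
    </target>
</project>
```

This still produces only a flat set of jars; it does not provide the scripts and server layout that come with the binary distribution.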

In a theoretical situation where your program talked to an SQL database,
would you include a database server in your project?  How much time
would you invest in automating the download and install of MySQL,
Postgres, or some other database?  I think what you would do in that
situation is include client code to talk to the database and expect the
user to provide the server and prepare it for your program.  In this
respect, how is a Solr server any different than a database server?

Thanks,
Shawn