Posted to dev@s2graph.apache.org by Jong Wook Kim <jo...@nyu.edu> on 2016/04/26 02:33:37 UTC

Need to make a distribution package of s2graph

From a newcomer's perspective, there are a number of pitfalls that make it
difficult to run S2graph. For instance, the only way for an outsider to run
S2graph is via Vagrant, as the README says. However:

- The JDK 7 compiler (in the Vagrant image) fails because the codebase now
uses JDK 8 features (e.g. java.util.Base64).
- To properly run the database we have to run HDFS and HBase services,
which is inconvenient and not documented anywhere.
- "sbt run" takes a nontrivial amount of time to resolve all the
dependencies and build. Because sbt compiles the project files upon the
first HTTP request, the first page load is laggy, which is not an ideal
user experience.
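
A launch script could catch the first problem before anything else runs.
The following is only a sketch, not code from S2graph: the class name
Jdk8Check is made up, and it uses reflection instead of importing
java.util.Base64 so that it still compiles and runs on JDK 7, where it
reports that the class (added in Java 8) is missing.

```java
// Hypothetical pre-flight check; not part of the S2graph codebase.
public class Jdk8Check {
    // Returns true when the running JRE provides java.util.Base64,
    // a class that first appeared in Java 8.
    static boolean hasJavaUtilBase64() {
        try {
            Class.forName("java.util.Base64");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        if (hasJavaUtilBase64()) {
            System.out.println("Java " + version + ": java.util.Base64 is available");
        } else {
            System.err.println("Java " + version
                    + ": java.util.Base64 is missing, JDK 8 or newer is needed");
            System.exit(1);
        }
    }
}
```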

I suppose one way to lower the entry barrier and make this project
user-friendly is to make a release package, so that newcomers can just
download and unzip the package, and run the script in any Unix-like
environment. Tools like Vagrant (or Docker, Puppet, Chef, whatnot) are
convenient for developers, but not so much for end users.

One prerequisite for doing this is to make the package self-contained, so
that users are not required to set up any dependencies, of which MySQL and
HBase are currently the main hassle. I think we can make the S2graph
package self-contained by doing the following:

- Since the hbase-server project is already a dependency (why?), with a
few configuration scripts we should be able to run the HBase server using
the local filesystem without incurring an enormous space requirement.
- With a minimal modification to the scalikejdbc code, we should be able to
avoid using MySQL in favor of an embedded or in-memory database like
Derby or H2.
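
To make the two bullets above concrete, here is a rough sketch of the
defaults a self-contained package might ship with. It is an illustration
only, not S2graph code: the class StandaloneDefaults, the directory names
and the database name "s2graph" are assumptions. The HBase property names
and the H2 URL forms (including the MODE=MySQL compatibility setting) are
standard ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical defaults for a self-contained package; not S2graph code.
public class StandaloneDefaults {
    // HBase settings for standalone mode on the local filesystem.
    // dataDir is assumed to be an absolute path such as /opt/s2graph/var.
    static Map<String, String> hbaseSite(String dataDir) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hbase.rootdir", "file://" + dataDir + "/hbase");
        conf.put("hbase.zookeeper.property.dataDir", dataDir + "/zookeeper");
        conf.put("hbase.cluster.distributed", "false");
        return conf;
    }

    // JDBC URL for the metadata store: H2 instead of MySQL, either purely
    // in memory or as an embedded file database under dataDir.
    static String jdbcUrl(String dataDir, boolean inMemory) {
        String location = inMemory ? "mem:s2graph" : "file:" + dataDir + "/metastore";
        return "jdbc:h2:" + location + ";MODE=MySQL";
    }

    public static void main(String[] args) {
        String dataDir = args.length > 0 ? args[0] : "/tmp/s2graph";
        for (Map.Entry<String, String> e : hbaseSite(dataDir).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
        System.out.println("jdbc.url=" + jdbcUrl(dataDir, false));
    }
}
```

The printed HBase entries would still have to be written out as
hbase-site.xml properties, and the H2 driver jar would have to be on the
classpath for the JDBC URL to work.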

We could follow what the Play! framework does
<https://github.com/playframework/playframework/tree/master/framework/src/sbt-plugin/src/sbt-test/play-sbt-plugin/distribution>
and make an SBT plugin that builds the package, or write a custom script
that assembles all jars and creates the directory layout for the
distribution package, like Apache Spark does
<https://github.com/apache/spark/blob/master/dev/make-distribution.sh>.
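
The "custom script" option boils down to creating a fixed directory layout
and copying the built jars into it. A minimal sketch of that step, assuming
a bin/conf/lib layout (the layout and the class name MakeDistribution are
assumptions, not what Play or Spark actually do):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical packaging helper; directory names are assumptions.
public class MakeDistribution {
    // Creates distDir/{bin,conf,lib} and copies every *.jar found directly
    // in jarDir into distDir/lib. Returns the number of jars copied.
    static int assemble(Path jarDir, Path distDir) throws IOException {
        Files.createDirectories(distDir.resolve("bin"));
        Files.createDirectories(distDir.resolve("conf"));
        Path lib = Files.createDirectories(distDir.resolve("lib"));
        int copied = 0;
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(jarDir, "*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, lib.resolve(jar.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: MakeDistribution <jar-dir> <dist-dir>");
            System.exit(2);
        }
        int n = assemble(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("copied " + n + " jar(s) into " + args[1] + "/lib");
    }
}
```

A real script would also need to drop start/stop scripts into bin and the
default configuration into conf before archiving the directory.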

I'm willing to go ahead and start working on this distribution packaging.
Let me know what you guys think.



Jong Wook

Re: Need to make a distribution package of s2graph

Posted by DO YUNG YOON <sh...@gmail.com>.

Hi Jong Wook.

Thanks for the suggestion; I totally agree with your point.
+1 on making s2graph more user-friendly, and also on the suggested direction.

I am happy to help, especially with the scalikejdbc code modification part.
Can you create a JIRA issue? I would love to work on this together.





