Posted to dev@jena.apache.org by Paolo Castagna <ca...@googlemail.com> on 2012/03/12 10:37:59 UTC

Fuseki: any issue which need to be closed before a release?

Hi,
I had a look at the open issues related to Fuseki. Here is the list:

 - https://issues.apache.org/jira/browse/JENA-162 (major) (Rob, does this need to be included in the first Fuseki release?)
 - https://issues.apache.org/jira/browse/JENA-218 (major) (... a patch from Alexander might come for this)
 - https://issues.apache.org/jira/browse/JENA-205 (minor)
 - https://issues.apache.org/jira/browse/JENA-171 (minor)
 - https://issues.apache.org/jira/browse/JENA-172 (minor)
 - https://issues.apache.org/jira/browse/JENA-214 (minor)
 - https://issues.apache.org/jira/browse/JENA-201 (minor)
 - https://issues.apache.org/jira/browse/JENA-220 (minor)

Is any of this a blocker for a first release?

I am confused about JENA-220. Other than that, they all seem quite useful and reasonable.
Two are flagged as 'major', but I have not yet looked in detail at how much work is needed. They don't seem to me to be 'blockers' for a release. What do you think?

JENA-201, even if flagged as minor, is I think quite important if we want Fuseki to be successfully deployed and used in 'enterprisy' environments. In such places, you have an app server or
a servlet container and that's it, no discussion: you need to use that, and therefore deploy via a standard .war, which we know is well supported by all the app servers and servlet containers around
(fortunately!). OK, not all places are like this, but most are... correct me if I am wrong.
I had a look at JENA-201, and Andy's proposal of a "single dispatcher to work on URI patterns" seems, on paper, to be the best option. It allows us to take full control of request dispatching and it
minimizes, if not eliminates, any code specific to Jetty or other containers. We just have our config and configure the dispatcher via a standard web.xml. A lot of web frameworks and libraries which run
inside a web app work this way. It makes sense. Don't get me wrong, I like Jetty and the unzip-and-run approach. But we could (and should) support both scenarios: unzip and run, as well as deploying a war in a
servlet container.
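
To make the idea concrete, here is a minimal sketch of such a dispatcher in plain Java, with no servlet API involved. It is not Fuseki code and not necessarily what Andy has in mind: the class name UriDispatcher, the /{dataset}/{service} path layout and the handler signature are all assumptions made up for illustration. In a real web app, a single servlet or filter mapped in web.xml would presumably hand the request path to something like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/**
 * Hypothetical sketch: routes paths of the assumed form /{dataset}/{service}
 * to a handler registered for the service name. Not part of Fuseki.
 */
final class UriDispatcher {
    // service name (e.g. "query") -> handler that receives the dataset name
    private final Map<String, Function<String, String>> handlers = new LinkedHashMap<>();

    /** Registers a handler for a service name; returns this for chaining. */
    UriDispatcher register(String service, Function<String, String> handler) {
        handlers.put(service, handler);
        return this;
    }

    /** Returns the handler's result, or empty if the path matches nothing. */
    Optional<String> dispatch(String path) {
        if (path == null || !path.startsWith("/")) {
            return Optional.empty();
        }
        String[] parts = path.substring(1).split("/");
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            return Optional.empty();
        }
        Function<String, String> handler = handlers.get(parts[1]);
        return handler == null ? Optional.empty() : Optional.of(handler.apply(parts[0]));
    }
}
```

For example, after register("query", ds -> "query:" + ds), a call to dispatch("/ds/query") yields "query:ds", while an unregistered service such as "/ds/unknown" yields an empty result that the caller could translate into a 404. Nothing in the class refers to Jetty or any other container.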

I opened JENA-214 and had a look at how it could be implemented. However, considering JENA-201, a solution depending on Jetty isn't the best option. A single dispatcher would also allow us to easily
control read-only/read-write mode for datasets, making JENA-214 simple to implement and portable across different servlet containers. The rationale behind JENA-214 is that this simple
management function is the minimum needed to enable all sorts of things which are useful in a production environment. For example, one could easily deploy Fuseki on multiple machines in a master/slave
configuration (similar to what Apache Solr and many other solutions used to do): you have one master which receives all your updates and as many slaves as you want to serve your read requests. You
achieve high availability on the read path. The master is a single point of failure: if it goes down, you cannot update. There might be a time gap between an update and when the update is available on
all the slaves (but many use cases can live with that). Replication can be done externally to Fuseki, for example using rsync. Backup is another very useful piece of functionality, and it is now included in
Fuseki (kudos to Andy).
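
As a rough illustration of the read-only/read-write switch, the sketch below keeps a per-dataset flag that a dispatcher could consult before passing on an update. The class DatasetModes and its methods are hypothetical names, and the choice that unknown datasets default to read-write is an assumption of the sketch, not something decided in JENA-214.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of a per-dataset read-only switch. Not part of Fuseki.
 */
final class DatasetModes {
    // dataset name -> true if the dataset is currently read-only
    private final Map<String, Boolean> readOnly = new ConcurrentHashMap<>();

    /** Flips a dataset between read-only (true) and read-write (false). */
    void setReadOnly(String dataset, boolean flag) {
        readOnly.put(dataset, flag);
    }

    /**
     * Reads are always allowed; writes only when the dataset is not read-only.
     * Assumption: datasets never configured here are treated as read-write.
     */
    boolean allows(String dataset, boolean isWrite) {
        return !isWrite || !readOnly.getOrDefault(dataset, false);
    }
}
```

In the master/slave setup described above, each slave would call setReadOnly("ds", true) for its datasets, so allows("ds", true) returns false there and updates are refused, while the master leaves the flag unset.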

I'll have a look at JENA-201 and run some experiments to see how feasible the "single dispatcher" is. There should not be much difference or implication in terms of performance, right? Servlet requests are
served by threads from a pool, so as long as no expensive objects are created on a per-request basis, a single dispatcher should be OK, right? Anyway, I'll come back to this when I have looked (*) at it more.
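
The threading assumption can be sketched without a container: one handler object is created once and then called by several pool threads, with nothing expensive allocated per request. The class SharedHandler and the method simulate are hypothetical, and a fixed thread pool only stands in for whatever pool a real servlet container uses.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch: a single shared handler called from a thread pool.
 */
final class SharedHandler {
    // The only mutable state, kept thread-safe with an atomic counter.
    private final AtomicLong served = new AtomicLong();

    /** Handles one request; cheap, and allocates no expensive objects. */
    String handle(String path) {
        served.incrementAndGet();
        return "handled " + path;
    }

    long served() {
        return served.get();
    }

    /** Runs 'requests' calls on 'threads' pool threads; returns the count served. */
    static long simulate(int threads, int requests) {
        SharedHandler handler = new SharedHandler(); // created once, shared by all threads
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final String path = "/ds/query?i=" + i;
            pool.execute(() -> handler.handle(path));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handler.served();
    }
}
```

This says nothing about real throughput; it only shows that a single dispatcher instance is safe to share as long as its own state is thread-safe.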

Paolo

PS:
(*) 'looking' does not mean "assign the issue to me". I am not sure I have time for it, or that I am best positioned to fix it with the best solution. But I want to learn and understand how it can be
done, and I think it's important; therefore, I am looking into it. :-)

Re: Fuseki: any issue which need to be closed before a release?

Posted by Robert Vesse <rv...@yarcdata.com>.
JENA-162 is closable IMO. Trunk has allowed queries with FROM/FROM NAMED for some time and leaves the interpretation of that dataset up to the underlying query engine, which I think is a reasonable implementation.

If no one objects, I will close this out soon.

JENA-201 should not be a blocker for this release. It is definitely a nice-to-have longer term, but short term I'd prefer to have a new Fuseki release sooner rather than later.

Rob

On Mar 12, 2012, at 4:37 AM, Andy Seaborne wrote:

> On 12/03/12 10:07, Alexander Dutton wrote:
>> Hi Paolo,
>> 
>> On 12/03/12 09:37, Paolo Castagna wrote:
>>> https://issues.apache.org/jira/browse/JENA-218 (major) (... a patch
>>> from Alexander might come for this)
>> 
>> I'm happy to push this further up my todo list if people feel it would
>> be a useful thing to include in the release. It would make our lives
>> easier here were it to be in the release, but if there's not that much
>> demand for it then it needn't block. Also, I don't remember any conscious
>> decision to mark it as major, so I probably forgot to set that field
>> appropriately.
>> 
>>> Is any of this a blocker for a first release?
>> 
>> ^^
> 
> This would be a good feature to have; IMHO, no, it's not a blocker.
> 
> This isn't enterprise product release cycles with promised functionality on a once-in-X-years release basis.  It's more continuous.
> 
> I was going to build the release this week (subject to finding time).
> 
> It will help to have a versioned version.
> 
>> 
>>> JENA-201, even if flagged as minor, I think it's quite important if
>>> we want to successfully allow Fuseki to be deployed and used in
>>> 'enterprisy' environments.
>> 
>> Another thing that might be useful for 'enterprisy' environments would
>> be Debian and/or RPM packaging, though these probably want to wait until
>> after the release. (I've had a look at the Debian side, but I'm no
>> expert, and dealing with non-tarball upstreams (as things are at the
>> moment) is messy).
> 
> Also good to have.
> 
>> 
>> All the best,
>> 
>> Alex
> 
> 	Andy


Re: Fuseki: any issue which need to be closed before a release?

Posted by Andy Seaborne <an...@apache.org>.
On 12/03/12 10:07, Alexander Dutton wrote:
> Hi Paolo,
>
> On 12/03/12 09:37, Paolo Castagna wrote:
>> https://issues.apache.org/jira/browse/JENA-218 (major) (... a patch
>> from Alexander might come for this)
>
> I'm happy to push this further up my todo list if people feel it would
> be a useful thing to include in the release. It would make our lives
> easier here were it to be in the release, but if there's not that much
> demand for it then it needn't block. Also, I don't remember any conscious
> decision to mark it as major, so I probably forgot to set that field
> appropriately.
>
>> Is any of this a blocker for a first release?
>
> ^^

This would be a good feature to have; IMHO, no, it's not a blocker.

This isn't enterprise product release cycles with promised functionality 
on a once-in-X-years release basis.  It's more continuous.

I was going to build the release this week (subject to finding time).

It will help to have a versioned version.

>
>> JENA-201, even if flagged as minor, I think it's quite important if
>> we want to successfully allow Fuseki to be deployed and used in
>> 'enterprisy' environments.
>
> Another thing that might be useful for 'enterprisy' environments would
> be Debian and/or RPM packaging, though these probably want to wait until
> after the release. (I've had a look at the Debian side, but I'm no
> expert, and dealing with non-tarball upstreams (as things are at the
> moment) is messy).

Also good to have.

>
> All the best,
>
> Alex

	Andy

Re: Fuseki: any issue which need to be closed before a release?

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi

Leo Simons wrote:
> Hey folks,
> 
> On Mon, Mar 12, 2012 at 11:07 AM, Alexander Dutton
> <al...@oucs.ox.ac.uk> wrote:
>>> JENA-201, even if flagged as minor, I think it's quite important if
>>> we want to successfully allow Fuseki to be deployed and used in
>>> 'enterprisy' environments.
>> Another thing that might be useful for 'enterprisy' environments would
>> be Debian and/or RPM packaging, though these probably want to wait until
>> after the release. (I've had a look at the Debian side, but I'm no
>> expert, and dealing with non-tarball upstreams (as things are at the
>> moment) is messy).
> 
> For the record, as someone with some experience with the 'enterpricy'
> stuff, I personally wouldn't try to provide .deb/.rpm/.msi for java
> webapps, at least not for the enterprise.

Thanks Leo for sharing your expertise and advice.

I always follow informed advice which implies no work is needed. ;-)

> The issue is "where do we actually install stuff". I.e. a big java
> shop will have a standard web container setup and will want to
> integrate a fuseki deployment into that, and you won't be able to know
> the details of that setup. It tends to not be the standard tomcat that
> comes with the OS.
> 
> I think you need three things:
> * a standard .war that drag-and-drop installs in any servlet container

Yep.

> * an svn tag corresponding to a versioned release that is readily
> buildable using standard tools
> * a description of actual real life deployment environments
> (hardware+software+config+service dependencies), associated data sets
> and achieved performance/uptime, easily findable on your website

That would really be useful, alongside best practices for:

 - high availability
 - security
 - load balancing
 - monitoring
 - request throttling
 - ...

Even if you could get away with answering all of these by saying: "it's just a standard web app".

Just checking: by "your website" do you mean the Apache Jena website?

What other Apache projects do that? (Sorry for my ignorance.)
We use TDB, but not Fuseki (when we started, Fuseki wasn't there).
Some of us use Fuseki internally.

Having said that, things such as performance and uptime do not
depend only on one specific piece of software; they depend on a mix of different software,
architecture, processes and people, don't they?

> * links to available commercial support

Yep.

> The first will be used to evaluate how much work needs to be done to
> beat the app into submission, following the standard rules.
> 
> The second will get used to create a vendor branch and then customize
> the build to produce a custom-enterprise-style rpm containing a
> custom-enterprise-style war to go into a custom-enterprise-style
> tomcat.
> 
> The third will be used for capacity planning and to inform load
> testing as to whether the achieved performance is reasonable. Since
> the people doing those things may be different from the people that
> are wanting to use fuseki, and so they won't be familiar with jena or
> fuseki, it's pretty important to make it easy enough to find.
> 
> The commercial support may not get used, but it tends to at least be
> considered a lower risk to use something if you know you can go and
> pay for someone to come in and fix messed up systems a year later.
> 
> All the above *doesn't* mean providing a .deb/.rpm/.msi/.whatever is
> useless. But it'll mostly be because it's useful for 'normal' end
> users, not for the enterpricy folks.
> 
> Heh. I didn't mean to write *all* that :-). Hope it helps, anyway.

Thanks.

Paolo

> 
> 
> cheers!
> 
> 
> Leo


Re: Fuseki: any issue which need to be closed before a release?

Posted by Leo Simons <ma...@leosimons.com>.
Hey folks,

On Mon, Mar 12, 2012 at 11:07 AM, Alexander Dutton
<al...@oucs.ox.ac.uk> wrote:
>> JENA-201, even if flagged as minor, I think it's quite important if
>> we want to successfully allow Fuseki to be deployed and used in
>> 'enterprisy' environments.
>
> Another thing that might be useful for 'enterprisy' environments would
> be Debian and/or RPM packaging, though these probably want to wait until
> after the release. (I've had a look at the Debian side, but I'm no
> expert, and dealing with non-tarball upstreams (as things are at the
> moment) is messy).

For the record, as someone with some experience with the 'enterpricy'
stuff, I personally wouldn't try to provide .deb/.rpm/.msi for java
webapps, at least not for the enterprise.

The issue is "where do we actually install stuff". I.e. a big java
shop will have a standard web container setup and will want to
integrate a fuseki deployment into that, and you won't be able to know
the details of that setup. It tends to not be the standard tomcat that
comes with the OS.

I think you need three things:
* a standard .war that drag-and-drop installs in any servlet container
* an svn tag corresponding to a versioned release that is readily
buildable using standard tools
* a description of actual real life deployment environments
(hardware+software+config+service dependencies), associated data sets
and achieved performance/uptime, easily findable on your website
* links to available commercial support

The first will be used to evaluate how much work needs to be done to
beat the app into submission, following the standard rules.

The second will get used to create a vendor branch and then customize
the build to produce a custom-enterprise-style rpm containing a
custom-enterprise-style war to go into a custom-enterprise-style
tomcat.

The third will be used for capacity planning and to inform load
testing as to whether the achieved performance is reasonable. Since
the people doing those things may be different from the people that
are wanting to use fuseki, and so they won't be familiar with jena or
fuseki, it's pretty important to make it easy enough to find.

The commercial support may not get used, but it tends to at least be
considered a lower risk to use something if you know you can go and
pay for someone to come in and fix messed up systems a year later.

All the above *doesn't* mean providing a .deb/.rpm/.msi/.whatever is
useless. But it'll mostly be because it's useful for 'normal' end
users, not for the enterpricy folks.

Heh. I didn't mean to write *all* that :-). Hope it helps, anyway.


cheers!


Leo

Re: Fuseki: any issue which need to be closed before a release?

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi Alexander

Alexander Dutton wrote:
> Hi Paolo,
> 
> On 12/03/12 09:37, Paolo Castagna wrote:
>> https://issues.apache.org/jira/browse/JENA-218 (major) (... a patch
>> from Alexander might come for this)
> 
> I'm happy to push this further up my todo list if people feel it would
> be a useful thing to include in the release. It would make our lives
> easier here were it to be in the release, but if there's not that much
> demand for it then it needn't block. Also, I don't remember any conscious
> decision to mark it as major, so I probably forgot to set that field
> appropriately.

Do you have a patch ready for this?
Or, have you already started working on it? If so, what's missing from the patch?

With open source projects and Apache projects, if you really need something it's better to do it yourself. :-)
I can review a patch if you submit it, and help refine/test it if necessary.
But I know I'll have no time to do that myself before the first Fuseki release. Also, it is a feature which (although useful) I've not seen many people asking for.

>> Is any of this a blocker for a first release?
> 
> ^^
> 
>> JENA-201, even if flagged as minor, I think it's quite important if
>> we want to successfully allow Fuseki to be deployed and used in
>> 'enterprisy' environments.
> 
> Another thing that might be useful for 'enterprisy' environments would
> be Debian and/or RPM packaging, though these probably want to wait until
> after the release. (I've had a look at the Debian side, but I'm no
> expert, and dealing with non-tarball upstreams (as things are at the
> moment) is messy).

Oh, well... .deb and/or .rpm packages are useful.
But you need to be careful with Java, since Java software typically makes different assumptions from most C software and/or shared libraries.
I tend to avoid installing Java .deb and/or .rpm packages because I find having everything (i.e. binaries, configuration, etc.) in a single folder (or two) quite useful.
Different Linux distributions have different 'conventions' on where to put stuff, and you need to deal with yet another dependency management system, etc.
What do you find inconvenient with a download and unzip/untar approach? Other than not getting the new software when you type sudo apt-get update? Oh, yeah, and perhaps scripts in /etc/init.d/...
to start/stop things...
Having said that, if someone goes ahead and makes .deb and/or .rpm packages for Fuseki or other Apache Jena components, I am certainly not going to vote against that. :-)
... but I am not sure I am going to use those (as I do not use them for many other Java apps or the JVM itself).
(We use .deb packages for Java services at work).

Thanks Alex for your reply.

Paolo

> 
> All the best,
> 
> Alex

Re: Fuseki: any issue which need to be closed before a release?

Posted by Alexander Dutton <al...@oucs.ox.ac.uk>.
Hi Paolo,

On 12/03/12 09:37, Paolo Castagna wrote:
> https://issues.apache.org/jira/browse/JENA-218 (major) (... a patch
> from Alexander might come for this)

I'm happy to push this further up my todo list if people feel it would
be a useful thing to include in the release. It would make our lives
easier here were it to be in the release, but if there's not that much
demand for it then it needn't block. Also, I don't remember any conscious
decision to mark it as major, so I probably forgot to set that field
appropriately.

> Is any of this a blocker for a first release?

^^

> JENA-201, even if flagged as minor, I think it's quite important if
> we want to successfully allow Fuseki to be deployed and used in
> 'enterprisy' environments.

Another thing that might be useful for 'enterprisy' environments would
be Debian and/or RPM packaging, though these probably want to wait until
after the release. (I've had a look at the Debian side, but I'm no
expert, and dealing with non-tarball upstreams (as things are at the
moment) is messy).

All the best,

Alex