Posted to dev@marmotta.apache.org by Sergio Fernández <wi...@apache.org> on 2013/03/28 12:17:21 UTC

documentation sprint

Hi all,

in the meantime, while we wait for the vote on 3.0.0-incubating to pass,
we should take a look at the public documentation we provide through our
web site, staged at http://marmotta.staging.apache.org

That's why we have talked about doing a documentation sprint early next
week, to have it ready for the actual publication of the release,
whenever that happens...

For those of us who are deep in the code, it is not always easy to see
the deficiencies in the documentation. So I'd like to kindly ask all of
you (the farther from the code, the better) what you find missing in the
documentation: what is not so clear, what is missing, what you think
should be possible but can't find out how to do, what would need another
kind of documentation (a screencast or whatever), and so on.

Thanks!

Cheers,

-- 
Sergio Fernández

Re: documentation sprint

Posted by Sebastian Schaffert <ss...@apache.org>.

Hi Raffaele,

you can have a look now at

http://marmotta.incubator.apache.org/platform/introduction.html

where I started describing the platform architecture. More to do (after
Easter) but at least a start ...

Greetings,

Sebastian


2013/3/29 Sebastian Schaffert <se...@gmail.com>

> Hi Raffaele,
>
> thanks for summarizing. :-)
>
> I already started working on the architecture diagram. I'll include it
> once it is ready (but I cannot promise it will be today).
>
> The GeoNames import is currently still a module provided by the LMF,
> because we thought it would not be so useful for most people. You are
> however right that we should work more on how to import existing datasets
> easily. The whole GeoNames import (with 140 million triples) on a decent
> server with PostgreSQL (2x Quadcore = 8 cores, SSD disk, 24GB memory) took
> 3:40 hours (versioning turned on, which slows down the import by about 30%).
>
> I already have an issue open on further improving this in a similar way to
> Jena (i.e. by setting the triplestore into a maintenance mode and then
> doing a dedicated batch import). I'll work on this when I have time. I
> expect that under certain conditions the import time can be reduced by a
> factor of about 10, because many things that slow down the import
> performance currently are related to transactions and concurrency (I need
> to make sure data is always consistent even under concurrent access, so
> there are many checks and I cannot really batch SQL executions).
>
> OTOH, I think most datasets are not really so big anyways, so no high
> priority. Would just be nice to be able to offer a better and more reliable
> DBpedia through Marmotta ... ;-)
>
> Greetings,
>
> Sebastian

Re: documentation sprint

Posted by Sebastian Schaffert <se...@gmail.com>.

Hi Raffaele,

thanks for summarizing. :-)

I already started working on the architecture diagram. I'll include it once
it is ready (but I cannot promise it will be today).

The GeoNames import is currently still a module provided by the LMF,
because we thought it would not be so useful for most people. You are
however right that we should work more on how to import existing datasets
easily. The whole GeoNames import (with 140 million triples) on a decent
server with PostgreSQL (2x Quadcore = 8 cores, SSD disk, 24GB memory) took
3:40 hours (versioning turned on, which slows down the import by about 30%).
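
For scale, the figures above can be turned into a rough throughput number.
This is only a back-of-the-envelope sketch: the helper names are made up for
illustration, and reading "slows down the import by about 30%" as "the
versioned run takes about 1.3 times as long" is an assumption, not something
stated in the message.

```python
# Rough throughput for the GeoNames import figures quoted above.
TRIPLES = 140_000_000        # "140 million triples"
HOURS, MINUTES = 3, 40       # "3:40 hours"


def import_seconds(hours: int, minutes: int) -> int:
    """Convert an h:mm duration into seconds."""
    return hours * 3600 + minutes * 60


def throughput(triples: int, seconds: float) -> float:
    """Average number of triples imported per second."""
    return triples / seconds


seconds = import_seconds(HOURS, MINUTES)   # 13200 seconds
rate = throughput(TRIPLES, seconds)        # roughly 10,600 triples per second

# ASSUMPTION: a 30% slowdown means versioned_time = 1.3 * unversioned_time.
unversioned_seconds = seconds / 1.3

print(f"with versioning:    {rate:,.0f} triples/s over {seconds} s")
print(f"without versioning: about {unversioned_seconds / 3600:.1f} h (estimate)")
```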

I already have an issue open on further improving this in a similar way to
Jena (i.e. by setting the triplestore into a maintenance mode and then
doing a dedicated batch import). I'll work on this when I have time. I
expect that under certain conditions the import time can be reduced by a
factor of about 10, because many things that slow down the import
performance currently are related to transactions and concurrency (I need
to make sure data is always consistent even under concurrent access, so
there are many checks and I cannot really batch SQL executions).
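
The difference between committing every statement and running a dedicated
batch import can be sketched in a few lines. This is not Marmotta's code and
not its PostgreSQL schema: it uses Python's built-in sqlite3 module with an
in-memory database, and the `triples` table and all function names are
hypothetical, chosen only to show the two import styles side by side.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical triples table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.commit()
    return conn


def import_one_by_one(conn: sqlite3.Connection, triples: list) -> None:
    """One transaction per triple: always consistent, but many commits."""
    for triple in triples:
        conn.execute("INSERT INTO triples VALUES (?, ?, ?)", triple)
        conn.commit()


def import_batched(conn: sqlite3.Connection, triples: list,
                   batch_size: int = 1000) -> None:
    """One transaction per batch: far fewer commits for the same data."""
    for start in range(0, len(triples), batch_size):
        batch = triples[start:start + batch_size]
        conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", batch)
        conn.commit()


def count(conn: sqlite3.Connection) -> int:
    """Number of rows currently stored in the triples table."""
    return conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]


# Made-up sample data under example.org, not real GeoNames triples.
sample = [(f"http://example.org/s{i}", "http://example.org/p", str(i))
          for i in range(5000)]

a = make_db()
import_one_by_one(a, sample)
b = make_db()
import_batched(b, sample)
print(count(a), count(b))  # both databases end up with the same rows
```

Both variants store the same data; the batched one simply commits once per
1000 rows instead of once per row. The batched style is only safe when no
other client needs to see a consistent store in the middle of the import,
which is what a maintenance mode would provide.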

OTOH, I think most datasets are not really so big anyways, so no high
priority. Would just be nice to be able to offer a better and more reliable
DBpedia through Marmotta ... ;-)

Greetings,

Sebastian

2013/3/29 Raffaele Palmieri <ra...@gmail.com>

> Hi Sergio,
> I had a look at the site's documentation; the following pages are incomplete:
>
>    - Apache Marmotta->Download Marmotta
>    - Apache Marmotta->Development->Development practices
>    - Apache Marmotta->Acknowledgements
>    - Platform->Introduction
>    - Platform->Core module
>    - Platform->LDCache module
>    - Platform->LDPath module
>    - Platform->Reasoner module
>    - Platform->SPARQL module
>    - Platform->User module
>    - Platform->Client library (broken link)
>    - Platform->Sesame tools (broken link)
>    - LDCache->Wrappers
>    - LDPath->Backends
>    - LDPath->Functions
>    - Wiki->Dependencies protocol and various modules and libraries
>
> There are still scattered references to LMF in the documentation.
> I think a picture giving an architectural overview would be useful,
> also showing some possible applications of the platform, as was done in
> the past for LMF, with some screencasts where possible.
> For example, a couple of use cases could cover importing content from LOD
> using the new linked data client modules (YouTube, Vimeo, Facebook, etc.)
> and retrieving content, maybe using the lmf-search integration.
> Regarding performance, it would be useful to show how to import data into
> Marmotta in parallel, for example by showing how to perform the GeoNames
> import.
> Cheers,
> and Happy Easter to all of you!
> Raffaele.

Re: documentation sprint

Posted by Raffaele Palmieri <ra...@gmail.com>.

Hi Sergio,
I had a look at the site's documentation; the following pages are incomplete:

   - Apache Marmotta->Download Marmotta
   - Apache Marmotta->Development->Development practices
   - Apache Marmotta->Acknowledgements
   - Platform->Introduction
   - Platform->Core module
   - Platform->LDCache module
   - Platform->LDPath module
   - Platform->Reasoner module
   - Platform->SPARQL module
   - Platform->User module
   - Platform->Client library (broken link)
   - Platform->Sesame tools (broken link)
   - LDCache->Wrappers
   - LDPath->Backends
   - LDPath->Functions
   - Wiki->Dependencies protocol and various modules and libraries

There are still scattered references to LMF in the documentation.
I think a picture giving an architectural overview would be useful,
also showing some possible applications of the platform, as was done in
the past for LMF, with some screencasts where possible.
For example, a couple of use cases could cover importing content from LOD
using the new linked data client modules (YouTube, Vimeo, Facebook, etc.)
and retrieving content, maybe using the lmf-search integration.
Regarding performance, it would be useful to show how to import data into
Marmotta in parallel, for example by showing how to perform the GeoNames
import.
Cheers,
and Happy Easter to all of you!
Raffaele.
