Posted to dev@marmotta.apache.org by Junyue Wang <ju...@gmail.com> on 2015/05/16 14:19:42 UTC

Re: [GSoC2015] list of accepted projects

Hello Sergio, Peter,

It's an honor to be a GSoC student. Thank you for your help and comments
on the project proposal.
I read the proposed methodology you pointed out, but it seems my project is
only related to Sesame and RDF HDT, without touching the code base of
Marmotta. Should I fork Marmotta on GitHub, or start a new repository there?
Will my code be merged into Marmotta in the end? If so, into which module of
Marmotta?

yours,
junyue

On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org> wrote:

> Hi Peter,
>
> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <an...@gmail.com>
> wrote:
>>
>> Those guidelines look great to me, especially the suggestion about the
>> branch name including the Jira issue, which I have found very useful
>> in all of my git-based projects. In the RDF/HDT case, and possibly in
>> the GeoSPARQL case, the contributed code could be in the form of a new
>> module, so there won't be much interference with the rest of the
>> codebase during that time. However, it is still useful to regularly
>> merge the "develop" branch into each of the branches to keep up to
>> date and reduce the number of merge conflicts occurring near the end
>> when the students will be rushing to complete the project.
>
>
> Great you like it, Peter :-)
>
> I expect fewer merge conflicts in that case, as it's a more concrete library;
> with the GeoSPARQL project that workflow is much more important.
>
> I have just one concern about the documentation. Last year I had
> formatting issues bringing that documentation into the wiki (MoinMoin
> syntax is not Markdown, unfortunately). Do you think it is better to write
> it directly in the wiki?
>
> I'd love to hear comments from our students, after all you're the ones who
> need to follow that proposed methodology.
>
> Cheers,
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernandez@redlink.co
> w: http://redlink.co
>

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

On Mon, Jun 8, 2015 at 9:38 AM, Sergio Fernández <wi...@apache.org> wrote:
>
> Maybe we can start with that: collecting a set of files for the project
> testing.
>

Just pick some from http://www.rdfhdt.org/datasets/

I'd say http://gaia.infor.uva.es/hdt/swdf-2012-11-28.hdt.gz would be your
first candidate for early testing.

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello Rob,

Thanks a lot!
I just posted the question on legal-discuss@.

yours,
junyue




On Mon, Jun 15, 2015 at 4:27 PM, Rob Vesse <rv...@dotnetrdf.org> wrote:

> Note that Apache projects are allowed to rely on LGPL libraries provided
> they are optional dependencies
>
> http://www.apache.org/legal/resolved.html#optional
>
> So if most users won't want/need RDF/HDT (which I strongly suspect is the
> case) then it would be perfectly fine to have the RDF/HDT parser module
> rely on the Java HDT library provided that Marmotta users have to manually
> opt-in to enabling RDF/HDT support and it isn't included by default
>
> Btw the "don't look at it" argument has always seemed dumb to me since
> when you get down to it there are only so many ways to do things
> particularly when you are implementing a formal specification like this.
> This is especially true in the case of binary formats where the low level
> code is always going to boil down to something along the lines of "try to
> read next N bytes, interpret those bytes appropriately" so the only real
> differences will be in the higher level APIs you wrap over it
>
> However I would suggest posting a thread on legal-discuss@ to get
> clarification from people better versed in the legal issues involved
>
> Rob
>
> On 12/06/2015 08:46, "Sergio Fernández" <wi...@apache.org> wrote:
>
> >Hi,
> >
> >On Fri, Jun 12, 2015 at 9:39 AM, Andreas Kuckartz <a....@ping.de>
> >wrote:
> >>
> >> Is that statement ("don't look at it") about LGPL-licensed source-code
> >> (which implements a specification) a result of advice from the Apache
> >> legal team or stated anywhere on the apache.org website ?
> >>
> >
> >No, a personal statement. From my point of view, taking inspiration from
> >(looking at) GPL code could be considered a derived work, and therefore
> >affected by the copyleft clauses of the license.
> >
> >But if you think I could be wrong, we can involve ASF Legal for a proper
> >checking.
> >
> >Thanks for your feedback.
> >
> >Cheers,
> >
> >--
> >Sergio Fernández
> >Partner Technology Manager
> >Redlink GmbH
> >m: +43 6602747925
> >e: sergio.fernandez@redlink.co
> >w: http://redlink.co
>
>
>
>
>

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

Hi Peter,

On Wed, Jun 17, 2015 at 1:38 AM, Peter Ansell <an...@gmail.com>
wrote:
>
> I have not yet subscribed to the legal mailing list, but if there was
> a response since Junyue's query there, I will find it in an archive
> and respond to it.
>

Neither have I... but I think legal-discuss@ accepts mails from
non-subscribers for such occasional questions.

Junyue, any reply to your inquiry?

> I am no more of an expert than anyone else, but my intention was
> definitely to have a completely new implementation, but we may not
> have communicated that clearly enough.
>

I thought that was clear... the spec should be as good as the current code
for providing a reliable implementation in the available time frame. Javier
has even volunteered to improve the last details documented there and to
help if necessary.

Let's see...

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Peter Ansell <an...@gmail.com>.

On 16 June 2015 at 19:10, Sergio Fernández <wi...@apache.org> wrote:
> Hi Rob,
>
> thanks for jumping in the discussion, all points of view are welcomed.
>
> I'll answer you inline.
>
> On Mon, Jun 15, 2015 at 10:27 AM, Rob Vesse <rv...@dotnetrdf.org> wrote:
>
>> Note that Apache projects are allowed to rely on LGPL libraries provided
>> they are optional dependencies
>>
>> http://www.apache.org/legal/resolved.html#optional
>>
>> So if most users won't want/need RDF/HDT (which I strongly suspect is the
>> case) then it would be perfectly fine to have the RDF/HDT parser module
>> rely on the Java HDT library provided that Marmotta users have to manually
>> opt-in to enabling RDF/HDT support and it isn't included by default
>>
>
> That could be right if we wanted to use such a dependency. But knowing
> the licensing issue, the goal of MARMOTTA-593 was always "the implementation
> of RDF HDT from scratch". Since Junyue was asking about "inspiring"
> himself with that code, we are in a completely different scenario. In
> that context we are outside the clauses that the LGPL provides for usage
> as a library, and we fall directly under the GPL clauses on derived
> works. And that's the IP issue here now.
>
>> Btw the "don't look at it" argument has always seemed dumb to me since
>> when you get down to it there are only so many ways to do things
>> particularly when you are implementing a formal specification like this.
>> This is especially true in the case of binary formats where the low level
>> code is always going to boil down to something along the lines of "try to
>> read next N bytes, interpret those bytes appropriately" so the only real
>> differences will be in the higher level APIs you wrap over it
>>
>
> It could sound dumb, but legally it's the only safe way.
>
>> However I would suggest posting a thread on legal-discuss@ to get
>> clarification from people better versed in the legal issues involved
>>

Hi Sergio,

> I think that's the best idea. I'd prefer the mentor of the project (Peter)
> takes care about dealing with this issue. But if he is busy and can't do
> it, I'll ask tomorrow.

I have not yet subscribed to the legal mailing list, but if there was
a response since Junyue's query there, I will find it in an archive
and respond to it.

I am no more of an expert than anyone else, but my intention was
definitely to have a completely new implementation, but we may not
have communicated that clearly enough.

Thanks,

Peter

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

Hi Rob,

thanks for joining the discussion; all points of view are welcome.

I'll answer you inline.

On Mon, Jun 15, 2015 at 10:27 AM, Rob Vesse <rv...@dotnetrdf.org> wrote:

> Note that Apache projects are allowed to rely on LGPL libraries provided
> they are optional dependencies
>
> http://www.apache.org/legal/resolved.html#optional
>
> So if most users won't want/need RDF/HDT (which I strongly suspect is the
> case) then it would be perfectly fine to have the RDF/HDT parser module
> rely on the Java HDT library provided that Marmotta users have to manually
> opt-in to enabling RDF/HDT support and it isn't included by default
>

That could be right if we wanted to use such a dependency. But knowing
the licensing issue, the goal of MARMOTTA-593 was always "the implementation
of RDF HDT from scratch". Since Junyue was asking about "inspiring"
himself with that code, we are in a completely different scenario. In
that context we are outside the clauses that the LGPL provides for usage
as a library, and we fall directly under the GPL clauses on derived
works. And that's the IP issue here now.

> Btw the "don't look at it" argument has always seemed dumb to me since
> when you get down to it there are only so many ways to do things
> particularly when you are implementing a formal specification like this.
> This is especially true in the case of binary formats where the low level
> code is always going to boil down to something along the lines of "try to
> read next N bytes, interpret those bytes appropriately" so the only real
> differences will be in the higher level APIs you wrap over it
>

It could sound dumb, but legally it's the only safe way.

> However I would suggest posting a thread on legal-discuss@ to get
> clarification from people better versed in the legal issues involved
>

I think that's the best idea. I'd prefer that the mentor of the project
(Peter) take care of dealing with this issue. But if he is busy and can't do
it, I'll ask tomorrow.

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Rob Vesse <rv...@dotnetrdf.org>.

Note that Apache projects are allowed to rely on LGPL libraries provided
they are optional dependencies

http://www.apache.org/legal/resolved.html#optional

So if most users won't want/need RDF/HDT (which I strongly suspect is the
case) then it would be perfectly fine to have the RDF/HDT parser module
rely on the Java HDT library provided that Marmotta users have to manually
opt-in to enabling RDF/HDT support and it isn't included by default
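
One way to picture the opt-in described above is a runtime check that only
enables the optional parser when the user has asked for it and the optional
library is actually on the classpath. The following is a minimal Java sketch
under those assumptions, not Marmotta code: the class OptionalParserSupport,
its method names and the class name probed in the usage note are all
hypothetical.

```java
/**
 * Illustrative only: decides whether an optional, separately installed
 * parser may be enabled. Nothing here is part of Marmotta or Sesame.
 */
final class OptionalParserSupport {

    private OptionalParserSupport() {
    }

    /** Returns true if the named class can be found, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OptionalParserSupport.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Library absent or unusable: treat the optional feature as unavailable.
            return false;
        }
    }

    /** The feature is enabled only if the user opted in AND the library is present. */
    static boolean isEnabled(boolean userOptedIn, String requiredClassName) {
        return userOptedIn && isOnClasspath(requiredClassName);
    }

    public static void main(String[] args) {
        // "example.optional.HdtLibraryEntryPoint" is a made-up class name.
        System.out.println(isEnabled(true, "example.optional.HdtLibraryEntryPoint"));
    }
}
```

With this shape the default build never references the optional library
directly; a user who wants the feature adds the library and switches the flag
on. Whether that arrangement satisfies the policy linked above is a question
for legal-discuss@, not something the sketch can settle.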

Btw the "don't look at it" argument has always seemed dumb to me since
when you get down to it there are only so many ways to do things
particularly when you are implementing a formal specification like this.
This is especially true in the case of binary formats where the low level
code is always going to boil down to something along the lines of "try to
read next N bytes, interpret those bytes appropriately" so the only real
differences will be in the higher level APIs you wrap over it
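
The "read next N bytes, interpret those bytes appropriately" pattern can be
sketched in a few lines of plain Java. This is a generic illustration, not
code from any HDT implementation: the class and method names are invented
here, and big-endian byte order is an arbitrary choice for the example (the
real byte order has to come from the specification).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrative only: generic helpers for "read N bytes, then interpret them". */
final class ByteReaderSketch {

    private ByteReaderSketch() {
    }

    /** Reads exactly n bytes; fails with EOFException if the stream ends early. */
    static byte[] readExactly(InputStream in, int n) throws IOException {
        byte[] buffer = new byte[n];
        new DataInputStream(in).readFully(buffer);
        return buffer;
    }

    /** Interprets exactly 4 bytes as an unsigned 32-bit big-endian integer. */
    static long toUnsignedIntBigEndian(byte[] b) {
        if (b.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + b.length);
        }
        return ((b[0] & 0xFFL) << 24)
                | ((b[1] & 0xFFL) << 16)
                | ((b[2] & 0xFFL) << 8)
                | (b[3] & 0xFFL);
    }

    /** Convenience for in-memory data: reads the first 4 bytes as an unsigned int. */
    static long readUnsignedInt(byte[] data) {
        try {
            return toUnsignedIntBigEndian(readExactly(new ByteArrayInputStream(data), 4));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readUnsignedInt(new byte[] {0, 0, 1, 2}));
    }
}
```

Almost any binary parser ends up with helpers of this shape, which is the
point being made above: the low-level layer leaves little room for variation.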

However I would suggest posting a thread on legal-discuss@ to get
clarification from people better versed in the legal issues involved

Rob

On 12/06/2015 08:46, "Sergio Fernández" <wi...@apache.org> wrote:

>Hi,
>
>On Fri, Jun 12, 2015 at 9:39 AM, Andreas Kuckartz <a....@ping.de>
>wrote:
>>
>> Is that statement ("don't look at it") about LGPL-licensed source-code
>> (which implements a specification) a result of advice from the Apache
>> legal team or stated anywhere on the apache.org website ?
>>
>
>No, a personal statement. From my point of view, taking inspiration from
>(looking at) GPL code could be considered a derived work, and therefore
>affected by the copyleft clauses of the license.
>
>But if you think I could be wrong, we can involve ASF Legal for a proper
>checking.
>
>Thanks for your feedback.
>
>Cheers,
>
>-- 
>Sergio Fernández
>Partner Technology Manager
>Redlink GmbH
>m: +43 6602747925
>e: sergio.fernandez@redlink.co
>w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

Hi,

On Fri, Jun 12, 2015 at 9:39 AM, Andreas Kuckartz <a....@ping.de>
wrote:
>
> Is that statement ("don't look at it") about LGPL-licensed source-code
> (which implements a specification) a result of advice from the Apache
> legal team or stated anywhere on the apache.org website ?
>

No, a personal statement. From my point of view, taking inspiration from
(looking at) GPL code could be considered a derived work, and therefore
affected by the copyleft clauses of the license.

But if you think I could be wrong, we can involve ASF Legal for a proper
checking.

Thanks for your feedback.

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Andreas Kuckartz <a....@ping.de>.

Sergio Fernández wrote:
> So please, don't "get inspired" by that code base, don't even look at it,
> since we'll have an intellectual property issue.

Is that statement ("don't look at it") about LGPL-licensed source-code
(which implements a specification) a result of advice from the Apache
legal team or stated anywhere on the apache.org website ?

Cheers,
Andreas

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

Hi Junyue,

On Mon, Jun 8, 2015 at 9:09 AM, Junyue Wang <ju...@gmail.com> wrote:
>
> I went through the W3C document. I think coding from scratch is too
> difficult for me. In the project proposal I submitted, Java HDT library[1]
> is to be reused for parsing and writing hdt files. The jena integration is
> built on top of Java HDT library as well. I reviewed the source code of
> Java HDT library, which does not strictly conform to the W3C document. If
> we follow the specification precisely, the new sesame-rio-rdfhdt module may
> not be able to deal with the hdt files generated by Java HDT library.
>

The current Java implementation is GPL! That means that any derived work
would need to be licensed under the same terms. So please, don't "get
inspired" by that code base, don't even look at it, since we'd have an
intellectual property issue.

If you find any example HDT file that does not parse according to either the
W3C Member Submission or the current spec http://www.rdfhdt.org/hdt-internals/
we should contact the authors of the format.

Maybe we can start with that: collecting a set of files for the project
testing.

> I hope it's OK to stick to the original idea in the proposal. Or we may
> have problems to complete the project within the 3-month period.
>

We're still in the 2nd week of the program, so the sooner we figure out
such things, the better for the results.

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

On Thu, Jun 11, 2015 at 6:18 PM, Junyue Wang <ju...@gmail.com> wrote:
>
> It seems licence of the java implementation is LGPL:
>
>    - *The libraries are open source (LGPL)*. You can adapt the libraries to
>    your needs, and the community can spot and fix issues [1].
>
> Is that good news? Can we just use/link to the java library without
> modifying its code?
>

No Junyue, the LGPL just avoids the copyleft effect when the code is used as
a library; any derived work of the code itself falls under the same terms as
the GPL.

So you _must_ not use that code as a reference. That's a strong requirement
of the project. You must implement the parser from scratch based on the
available specification.

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello,

It seems the licence of the Java implementation is LGPL:

   - *The libraries are open source (LGPL)*. You can adapt the libraries to
   your needs, and the community can spot and fix issues [1].

Is that good news? Can we just use/link to the Java library without
modifying its code?

yours,
junyue

[1] http://www.rdfhdt.org/what-is-hdt/

On Tue, Jun 9, 2015 at 6:57 AM, Peter Ansell <an...@gmail.com> wrote:

> Hi Junyue,
>
> Sorry for any confusion that we may have caused you by not emphasising
> the licensing issue as the main factor in this project, and hence you
> not realising that it required an actual parser to be written (and
> that you can't look at the GPL/LGPL parser for inspiration).
>
> We are still early and I think you should try to follow the W3C
> submission to see how difficult parsing a binary format is to see
> whether you want to continue or not in a week or two after trying to
> write a binary parser from scratch. Don't focus on the writer at this
> point if you think the parser will be enough for you.
>
> Once the RDF-HDT people release a newer version of the specification,
> you can switch to using that, but it would be great to see if you can
> get a basic parser up and running based on the older W3C submission.
> To start off with you could try just parsing the header, and see how
> difficult that turns out to be before deciding about the rest of the
> time.
>
> Sorry in advance btw, this is my first time being a GSOC mentor and I
> may do things wrong.
>
> Cheers,
>
> Peter

Re: [GSoC2015] list of accepted projects

Posted by Peter Ansell <an...@gmail.com>.

Hi Junyue,

Sorry for any confusion that we may have caused you by not emphasising
the licensing issue as the main factor in this project, and hence you
not realising that it required an actual parser to be written (and
that you can't look at the GPL/LGPL parser for inspiration).

We are still early, and I think you should try following the W3C
submission to see how difficult parsing a binary format is. After a week
or two of trying to write a binary parser from scratch, you can decide
whether you want to continue. Don't focus on the writer at this
point if you think the parser will be enough for you.

Once the RDF-HDT people release a newer version of the specification,
you can switch to using that, but it would be great to see if you can
get a basic parser up and running based on the older W3C submission.
To start off with you could try just parsing the header, and see how
difficult that turns out to be before deciding about the rest of the
time.
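
To give a feel for what "just parsing the header" involves, here is a small
Java sketch of a header parser. Please note that the layout it reads (a
4-byte ASCII cookie "DEMO", one version byte, then a NUL-terminated ASCII
name) is invented purely for illustration and is NOT the HDT header layout;
the real fields and their order must be taken from the W3C submission. The
class names are hypothetical as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: parses a made-up header, not the real HDT one. */
final class DemoHeaderParser {

    /** Hypothetical magic cookie that starts every file in this toy format. */
    static final byte[] MAGIC = "DEMO".getBytes(StandardCharsets.US_ASCII);

    /** Parsed result: a version number and a short name. */
    static final class Header {
        final int version;
        final String name;

        Header(int version, String name) {
            this.version = version;
            this.name = name;
        }
    }

    private DemoHeaderParser() {
    }

    static Header parse(DataInputStream in) throws IOException {
        // 1. Read and verify the magic cookie.
        byte[] magic = new byte[MAGIC.length];
        in.readFully(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IOException("unexpected magic bytes");
        }
        // 2. Read one unsigned version byte.
        int version = in.readUnsignedByte();
        // 3. Read bytes up to the terminating NUL; EOFException if it never comes.
        ByteArrayOutputStream name = new ByteArrayOutputStream();
        int b;
        while ((b = in.readUnsignedByte()) != 0) {
            name.write(b);
        }
        return new Header(version, new String(name.toByteArray(), StandardCharsets.US_ASCII));
    }

    /** Convenience overload for in-memory data; wraps I/O failures unchecked. */
    static Header parse(byte[] data) {
        try {
            return parse(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Header h = parse(new byte[] {'D', 'E', 'M', 'O', 1, 'h', 'd', 'r', 0});
        System.out.println(h.version + " " + h.name);
    }
}
```

A parser for a real format follows the same three moves (check a cookie, read
fixed-size fields, read variable-length fields), so a sketch like this is a
reasonable first milestone before tackling the rest of the file.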

Sorry in advance btw, this is my first time being a GSOC mentor and I
may do things wrong.

Cheers,

Peter



On 8 June 2015 at 17:09, Junyue Wang <ju...@gmail.com> wrote:
> Hello Peter,
>
> I went through the W3C document. I think coding from scratch is too
> difficult for me. In the project proposal I submitted, Java HDT library[1]
> is to be reused for parsing and writing hdt files. The jena integration is
> built on top of Java HDT library as well. I reviewed the source code of
> Java HDT library, which does not strictly conform to the W3C document. If
> we follow the specification precisely, the new sesame-rio-rdfhdt module may
> not be able to deal with the hdt files generated by Java HDT library.
>
> I hope it's OK to stick to the original idea in the proposal. Or we may
> have problems to complete the project within the 3-month period.
>
> [1] http://www.rdfhdt.org/manual-of-the-java-hdt-library/
> [2] http://www.rdfhdt.org/manual-of-hdt-integration-with-jena/
>
> yours,
> junyue
>
>
> On Mon, Jun 8, 2015 at 8:48 AM, Peter Ansell <an...@gmail.com> wrote:
>
>> Hi Junyue,
>>
>> You are not going to be using or linking to the existing RDF/HDT
>> implementations so their use of TripleString internally should not be
>> an issue for you and you do not need to look at the RDF/HDT Java
>> source code for this project.
>>
>> The sole reference for your implementation is the following document
>> that the RDF/HDT team submitted to the W3C:
>>
>> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/
>>
>> Specifically, you need to implement a binary parser from scratch based
>> on the specification given in section 3:
>>
>> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/#syntax
>>
>> Cheers,
>>
>> Peter
>>
>> On 8 June 2015 at 01:43, Junyue Wang <ju...@gmail.com> wrote:
>> > Hello Peter,
>> >
>> > I'm done with creating the new module and the new format. Now I'm
>> > implementing the RDFHDTParser.
>> > One question: If I search RDF HDT, it provides TripleString for each
>> > triple. TripleString contains 3 Strings for subject, predicate and object
>> > respectively. I need to transform the Strings into Sesame Values, which
>> > may be URI, Resource, Literal or BlankNode. But I don't know beforehand
>> > which concrete types of Value they are. Is there a neat way to do this?
>> >
>> > I checked out ValueFactory in Sesame. It only does the transformation for
>> > the given concrete type.
>> >
>> > yours,
>> > junyue
>> >
>> > On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <an...@gmail.com>
>> > wrote:
>> >
>> >> Hi Junyue,
>> >>
>> >> It will be simplest to track if you fork the Marmotta repository at
>> >> GitHub and create a branch named "MARMOTTA-593".
>> >>
>> >> Add me as a collaborator to the GitHub repository. My GitHub id is
>> >> "ansell".
>> >>
>> >> The collaborators list for my fork is at:
>> >>
>> >> https://github.com/ansell/marmotta/settings/collaboration
>> >>
>> >> When you fork it, you can replace "ansell" with your GitHub id and use
>> >> that page to add me to the list of collaborators.
>> >>
>> >> Yes, the code will be merged to Marmotta in the end.
>> >>
>> >> You should create a new module inside of marmotta-sesame-tools named
>> >> "marmotta-rio-rdfht"
>> >>
>> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
>> >>
>> >> You will also need to add a format constant into marmotta-rio-api as a
>> >> new folder in the following directory, similar to the current 3
>> >> folders there:
>> >>
>> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
>> >>
>> >> Cheers,
>> >>
>> >> Peter
>> >>
>>

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello Peter,

I went through the W3C document. I think coding from scratch is too
difficult for me. In the project proposal I submitted, the Java HDT library [1]
is to be reused for parsing and writing HDT files. The Jena integration [2] is
built on top of the Java HDT library as well. I reviewed the source code of
the Java HDT library, and it does not strictly conform to the W3C document. If
we follow the specification precisely, the new sesame-rio-rdfhdt module may
not be able to deal with the HDT files generated by the Java HDT library.

I hope it's OK to stick to the original idea in the proposal. Otherwise we may
have trouble completing the project within the 3-month period.

[1] http://www.rdfhdt.org/manual-of-the-java-hdt-library/
[2] http://www.rdfhdt.org/manual-of-hdt-integration-with-jena/

yours,
junyue


On Mon, Jun 8, 2015 at 8:48 AM, Peter Ansell <an...@gmail.com> wrote:

> Hi Junyue,
>
> You are not going to be using or linking to the existing RDF/HDT
> implementations so their use of TripleString internally should not be
> an issue for you and you do not need to look at the RDF/HDT Java
> source code for this project.
>
> The sole reference for your implementation is the following document
> that the RDF/HDT team submitted to the W3C:
>
> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/
>
> Specifically, you need to implement a binary parser from scratch based
> on the specification given in section 3:
>
> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/#syntax
>
> Cheers,
>
> Peter
>
> On 8 June 2015 at 01:43, Junyue Wang <ju...@gmail.com> wrote:
> > Hello Peter,
> >
> > I've done with creating the new module and the new format. Now I'm
> > implementing the RDFHDTParser.
> > One question: If I search RDF HDT, it provides TripleString for each
> > triple. TripleString contains 3 Strings for subject, predicate and object
> > respectively. I need to transform the Strings into Sesame Values, which may
> > be URI, Resource, Literal or BlankNode. But I don't know before hand which
> > concrete types of Value they are. Is there a neat way to do this?
> >
> > I checked out ValueFactory in Sesame. It only does the transformation for
> > the given concrete type.
> >
> > yours,
> > junyue
> >
> > On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <an...@gmail.com>
> > wrote:
> >
> >> Hi Junjue,
> >>
> >> It will be simplest to track if you fork the Marmotta repository at
> >> GitHub and create a branch named "MARMOTTA-593".
> >>
> >> Add me as a collaborator to the GitHub repository. My GitHub id is
> >> "ansell".
> >>
> >> The collaborators list for my fork is at:
> >>
> >> https://github.com/ansell/marmotta/settings/collaboration
> >>
> >> When you fork it, you can replace "ansell" with your GitHub id and use
> >> that page to add me to the list of collaborators.
> >>
> >> Yes, the code will be merged to Marmotta in the end.
> >>
> >> You should create a new module inside of marmotta-sesame-tools named
> >> "marmotta-rio-rdfht"
> >>
> >>
> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
> >>
> >> You will also need to add a format constant into marmotta-rio-api as a
> >> new folder in the following directory, similar to the current 3
> >> folders there:
> >>
> >>
> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
> >>
> >> Cheers,
> >>
> >> Peter
> >>
> >>
> >> Cheers,
> >>
> >> Peter
> >>
> >> On 16 May 2015 at 22:19, Junyue Wang <ju...@gmail.com> wrote:
> >> > Hello Sergio, Peter,
> >> >
> >> > It's my honor to be a GSoC student. I appreciate your help for the
> >> comments
> >> > of the project proposal.
> >> > I read the proposed methodology you pointed out. But it seems my
> project
> >> is
> >> > only related to Sesame and RDF HDT, without touching the code base of
> >> > Marmotta. Should I fork Marmotta in github, or start a new repository
> >> there?
> >> > Will my code be merged into Marmotta in the end? If so, which module
> of
> >> > Marmotta?
> >> >
> >> > yours,
> >> > junyue
> >> >
> >> > On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org>
> >> wrote:
> >> >
> >> >> Hi Peter,
> >> >>
> >> >> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <
> ansell.peter@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Those guidelines look great to me, especially the suggestion about
> the
> >> >>> branch name including the Jira issue, which I have found very useful
> >> >>> in all of my git-based projects. In the RDF/HDT case, and possibly
> in
> >> >>> the GeoSPARQL case, the contributed code could be in the form of a
> new
> >> >>> module, so there won't be much interference with the rest of the
> >> >>> codebase during that time. However, it is still useful to regularly
> >> >>> merge the "develop" branch into each of the branches to keep up to
> >> >>> date and reduce the number of merge conflicts occurring near the end
> >> >>> when the students will be rushing to complete the project.
> >> >>
> >> >>
> >> >> Great you like it, Peter :-)
> >> >>
> >> >> I expect less merge conflicts, nevertheless it's a more concrete
> >> library;
> >> >> with the GeoSPARQL project that workflow is much more important.
> >> >>
> >> >> I've just have one concern about the documentation. Last year I had
> >> >> formatting issues bringing that documentation into the wiki (MoinMoin
> >> >> syntax is not markdown, unfortunately). Do you think is better to do
> it
> >> >> directly in the wiki?
> >> >>
> >> >> I'd love to hear comments from our students, after all you're the
> ones
> >> who
> >> >> need to follow that proposed methodology.
> >> >>
> >> >> Cheers,
> >> >>
> >> >> --
> >> >> Sergio Fernández
> >> >> Partner Technology Manager
> >> >> Redlink GmbH
> >> >> m: +43 6602747925
> >> >> e: sergio.fernandez@redlink.co
> >> >> w: http://redlink.co
> >> >>
> >>
>

Re: [GSoC2015] list of accepted projects

Posted by Peter Ansell <an...@gmail.com>.

Hi Junyue,

You are not going to be using or linking to the existing RDF/HDT
implementations so their use of TripleString internally should not be
an issue for you and you do not need to look at the RDF/HDT Java
source code for this project.

The sole reference for your implementation is the following document
that the RDF/HDT team submitted to the W3C:

http://www.w3.org/Submission/2011/SUBM-HDT-20110330/

Specifically, you need to implement a binary parser from scratch based
on the specification given in section 3:

http://www.w3.org/Submission/2011/SUBM-HDT-20110330/#syntax

Cheers,

Peter

On 8 June 2015 at 01:43, Junyue Wang <ju...@gmail.com> wrote:
> Hello Peter,
>
> I've done with creating the new module and the new format. Now I'm
> implementing the RDFHDTParser.
> One question: If I search RDF HDT, it provides TripleString for each
> triple. TripleString contains 3 Strings for subject, predicate and object
> respectively. I need to transform the Strings into Sesame Values, which may
> be URI, Resource, Literal or BlankNode. But I don't know before hand which
> concrete types of Value they are. Is there a neat way to do this?
>
> I checked out ValueFactory in Sesame. It only does the transformation for
> the given concrete type.
>
> yours,
> junyue
>
> On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <an...@gmail.com>
> wrote:
>
>> Hi Junjue,
>>
>> It will be simplest to track if you fork the Marmotta repository at
>> GitHub and create a branch named "MARMOTTA-593".
>>
>> Add me as a collaborator to the GitHub repository. My GitHub id is
>> "ansell".
>>
>> The collaborators list for my fork is at:
>>
>> https://github.com/ansell/marmotta/settings/collaboration
>>
>> When you fork it, you can replace "ansell" with your GitHub id and use
>> that page to add me to the list of collaborators.
>>
>> Yes, the code will be merged to Marmotta in the end.
>>
>> You should create a new module inside of marmotta-sesame-tools named
>> "marmotta-rio-rdfht"
>>
>>
>> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
>>
>> You will also need to add a format constant into marmotta-rio-api as a
>> new folder in the following directory, similar to the current 3
>> folders there:
>>
>>
>> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
>>
>> Cheers,
>>
>> Peter
>>
>>
>> Cheers,
>>
>> Peter
>>
>> On 16 May 2015 at 22:19, Junyue Wang <ju...@gmail.com> wrote:
>> > Hello Sergio, Peter,
>> >
>> > It's my honor to be a GSoC student. I appreciate your help for the
>> comments
>> > of the project proposal.
>> > I read the proposed methodology you pointed out. But it seems my project
>> is
>> > only related to Sesame and RDF HDT, without touching the code base of
>> > Marmotta. Should I fork Marmotta in github, or start a new repository
>> there?
>> > Will my code be merged into Marmotta in the end? If so, which module of
>> > Marmotta?
>> >
>> > yours,
>> > junyue
>> >
>> > On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org>
>> wrote:
>> >
>> >> Hi Peter,
>> >>
>> >> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <an...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Those guidelines look great to me, especially the suggestion about the
>> >>> branch name including the Jira issue, which I have found very useful
>> >>> in all of my git-based projects. In the RDF/HDT case, and possibly in
>> >>> the GeoSPARQL case, the contributed code could be in the form of a new
>> >>> module, so there won't be much interference with the rest of the
>> >>> codebase during that time. However, it is still useful to regularly
>> >>> merge the "develop" branch into each of the branches to keep up to
>> >>> date and reduce the number of merge conflicts occurring near the end
>> >>> when the students will be rushing to complete the project.
>> >>
>> >>
>> >> Great you like it, Peter :-)
>> >>
>> >> I expect less merge conflicts, nevertheless it's a more concrete
>> library;
>> >> with the GeoSPARQL project that workflow is much more important.
>> >>
>> >> I've just have one concern about the documentation. Last year I had
>> >> formatting issues bringing that documentation into the wiki (MoinMoin
>> >> syntax is not markdown, unfortunately). Do you think is better to do it
>> >> directly in the wiki?
>> >>
>> >> I'd love to hear comments from our students, after all you're the ones
>> who
>> >> need to follow that proposed methodology.
>> >>
>> >> Cheers,
>> >>
>> >> --
>> >> Sergio Fernández
>> >> Partner Technology Manager
>> >> Redlink GmbH
>> >> m: +43 6602747925
>> >> e: sergio.fernandez@redlink.co
>> >> w: http://redlink.co
>> >>
>>

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello Peter,

I'm done creating the new module and the new format. Now I'm
implementing the RDFHDTParser.
One question: If I search RDF HDT, it provides a TripleString for each
triple. A TripleString contains 3 Strings for subject, predicate and object
respectively. I need to transform the Strings into Sesame Values, which may
be URI, Resource, Literal or BlankNode. But I don't know beforehand which
concrete type of Value each one is. Is there a neat way to do this?

I checked out ValueFactory in Sesame. It only does the transformation for
the given concrete type.
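
One possible way to approach the question above, as a minimal sketch only:
if the Strings follow N-Triples-style conventions (an assumption, not
something confirmed in this thread), the leading characters are enough to
tell the kinds apart. The names TermClassifier, TermKind, classify and
literalLabel are hypothetical; the ValueFactory methods named in the
comments (createURI, createBNode, createLiteral) are from Sesame 2.x and
are only mentioned, not called, so the sketch runs without Sesame.

```java
import java.util.Objects;

/**
 * Sketch: decide which kind of RDF term a plain string represents.
 * Assumption: blank nodes start with "_:", literals start with a
 * double quote, and anything else is treated as an IRI.
 */
public class TermClassifier {

    /** The kinds of RDF term a subject, predicate or object can be. */
    public enum TermKind { IRI, BLANK_NODE, LITERAL }

    /** Classifies a term by inspecting its leading characters. */
    public static TermKind classify(String term) {
        Objects.requireNonNull(term, "term must not be null");
        if (term.isEmpty()) {
            throw new IllegalArgumentException("term must not be empty");
        }
        if (term.startsWith("_:")) {
            return TermKind.BLANK_NODE; // candidate for ValueFactory.createBNode
        }
        if (term.charAt(0) == '"') {
            return TermKind.LITERAL;    // candidate for ValueFactory.createLiteral
        }
        return TermKind.IRI;            // candidate for ValueFactory.createURI
    }

    /**
     * Returns the text between the opening quote and the last quote of a
     * literal. Any language tag or datatype suffix is dropped, and escape
     * sequences are left untouched.
     */
    public static String literalLabel(String term) {
        if (classify(term) != TermKind.LITERAL) {
            throw new IllegalArgumentException("not a literal: " + term);
        }
        int end = term.lastIndexOf('"');
        if (end == 0) {
            throw new IllegalArgumentException("unterminated literal: " + term);
        }
        return term.substring(1, end);
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.org/alice",
            "_:b0",
            "\"Alice\"@en",
            "\"42\"^^<http://www.w3.org/2001/XMLSchema#integer>"
        };
        for (String s : samples) {
            System.out.println(classify(s) + "  " + s);
        }
    }
}
```

The position in the triple narrows things further: a predicate is always
a URI, and a subject is a URI or a blank node, so only the object needs
the full three-way check.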

yours,
junyue

On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <an...@gmail.com>
wrote:

> Hi Junjue,
>
> It will be simplest to track if you fork the Marmotta repository at
> GitHub and create a branch named "MARMOTTA-593".
>
> Add me as a collaborator to the GitHub repository. My GitHub id is
> "ansell".
>
> The collaborators list for my fork is at:
>
> https://github.com/ansell/marmotta/settings/collaboration
>
> When you fork it, you can replace "ansell" with your GitHub id and use
> that page to add me to the list of collaborators.
>
> Yes, the code will be merged to Marmotta in the end.
>
> You should create a new module inside of marmotta-sesame-tools named
> "marmotta-rio-rdfht"
>
>
> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
>
> You will also need to add a format constant into marmotta-rio-api as a
> new folder in the following directory, similar to the current 3
> folders there:
>
>
> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
>
> Cheers,
>
> Peter
>
>
> Cheers,
>
> Peter
>
> On 16 May 2015 at 22:19, Junyue Wang <ju...@gmail.com> wrote:
> > Hello Sergio, Peter,
> >
> > It's my honor to be a GSoC student. I appreciate your help for the
> comments
> > of the project proposal.
> > I read the proposed methodology you pointed out. But it seems my project
> is
> > only related to Sesame and RDF HDT, without touching the code base of
> > Marmotta. Should I fork Marmotta in github, or start a new repository
> there?
> > Will my code be merged into Marmotta in the end? If so, which module of
> > Marmotta?
> >
> > yours,
> > junyue
> >
> > On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org>
> wrote:
> >
> >> Hi Peter,
> >>
> >> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <an...@gmail.com>
> >> wrote:
> >>>
> >>> Those guidelines look great to me, especially the suggestion about the
> >>> branch name including the Jira issue, which I have found very useful
> >>> in all of my git-based projects. In the RDF/HDT case, and possibly in
> >>> the GeoSPARQL case, the contributed code could be in the form of a new
> >>> module, so there won't be much interference with the rest of the
> >>> codebase during that time. However, it is still useful to regularly
> >>> merge the "develop" branch into each of the branches to keep up to
> >>> date and reduce the number of merge conflicts occurring near the end
> >>> when the students will be rushing to complete the project.
> >>
> >>
> >> Great you like it, Peter :-)
> >>
> >> I expect less merge conflicts, nevertheless it's a more concrete
> library;
> >> with the GeoSPARQL project that workflow is much more important.
> >>
> >> I've just have one concern about the documentation. Last year I had
> >> formatting issues bringing that documentation into the wiki (MoinMoin
> >> syntax is not markdown, unfortunately). Do you think is better to do it
> >> directly in the wiki?
> >>
> >> I'd love to hear comments from our students, after all you're the ones
> who
> >> need to follow that proposed methodology.
> >>
> >> Cheers,
> >>
> >> --
> >> Sergio Fernández
> >> Partner Technology Manager
> >> Redlink GmbH
> >> m: +43 6602747925
> >> e: sergio.fernandez@redlink.co
> >> w: http://redlink.co
> >>
>

Re: [GSoC2015] list of accepted projects

Posted by Peter Ansell <an...@gmail.com>.

Hi Junyue,

In future, make sure that you can replicate the Eclipse errors by
trying to compile it with command line maven using "mvn clean
install". You also need to put the text of any fatal errors you see
from mvn clean install into the email.

Thanks,

Peter

On 6 June 2015 at 21:15, Junyue Wang <ju...@gmail.com> wrote:
> Hello Sergio,
>
> I'm using Eclipse with m2e plugin in Windows 7. The problems are resolved
> by installing some m2e connectors for maven plugins. Thanks!
>
> yours,
> junyue
>
> On Mon, Jun 1, 2015 at 4:58 AM, Sergio Fernández <wi...@apache.org> wrote:
>
>> Hi Junyue
>>
>> On Sat, May 30, 2015 at 10:57 AM, Junyue Wang <ju...@gmail.com> wrote:
>> >
>> > For example, marmotta-rio-rss lacks of dependency jars for importing
>> > packages of org.rometools and com.sun.syndication. Is there something
>> wrong
>> > with the pom of marmotta-rio-rss?
>> >
>>
>> As described in the NOTICE file, the source code of both libraries is
>> distributed with marmotta under src/ext:
>>
>>
>> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-rss/src/ext/java
>>
>> If you take a look in the pom of that module, we're using
>> the build-helper-maven-plugin for allowing add external sources. Nobody had
>> issues with that plugin before. So if you could give some more information
>> about your system (operative system, maven version, jdk version, etc),
>> maybe we could come with a solution.
>>
>>
>> > Another example, KWRLProgramParserBase[1] does not compile, because
>> there's
>> > no code for ParseExceptioin and KWRLProgramParser.
>> >
>>
>> In the reasoner, as well as for ldpath, some code is dynamically generated
>> by the javacc-maven-plugin. Maybe you have issues with that plugin too?
>>
>> I guess having the stack traces  would help to debug those issues...
>>
>> Cheers,
>>
>>
>> --
>> Sergio Fernández
>> Partner Technology Manager
>> Redlink GmbH
>> m: +43 6602747925
>> e: sergio.fernandez@redlink.co
>> w: http://redlink.co
>>

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello Sergio,

I'm using Eclipse with the m2e plugin on Windows 7. The problems were resolved
by installing some m2e connectors for the Maven plugins. Thanks!

yours,
junyue

On Mon, Jun 1, 2015 at 4:58 AM, Sergio Fernández <wi...@apache.org> wrote:

> Hi Junyue
>
> On Sat, May 30, 2015 at 10:57 AM, Junyue Wang <ju...@gmail.com> wrote:
> >
> > For example, marmotta-rio-rss lacks of dependency jars for importing
> > packages of org.rometools and com.sun.syndication. Is there something
> wrong
> > with the pom of marmotta-rio-rss?
> >
>
> As described in the NOTICE file, the source code of both libraries is
> distributed with marmotta under src/ext:
>
>
> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-rss/src/ext/java
>
> If you take a look in the pom of that module, we're using
> the build-helper-maven-plugin for allowing add external sources. Nobody had
> issues with that plugin before. So if you could give some more information
> about your system (operative system, maven version, jdk version, etc),
> maybe we could come with a solution.
>
>
> > Another example, KWRLProgramParserBase[1] does not compile, because
> there's
> > no code for ParseExceptioin and KWRLProgramParser.
> >
>
> In the reasoner, as well as for ldpath, some code is dynamically generated
> by the javacc-maven-plugin. Maybe you have issues with that plugin too?
>
> I guess having the stack traces  would help to debug those issues...
>
> Cheers,
>
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernandez@redlink.co
> w: http://redlink.co
>

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

Hi Junyue

On Sat, May 30, 2015 at 10:57 AM, Junyue Wang <ju...@gmail.com> wrote:
>
> For example, marmotta-rio-rss lacks of dependency jars for importing
> packages of org.rometools and com.sun.syndication. Is there something wrong
> with the pom of marmotta-rio-rss?
>

As described in the NOTICE file, the source code of both libraries is
distributed with marmotta under src/ext:

https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-rss/src/ext/java

If you take a look at the pom of that module, we're using
the build-helper-maven-plugin to allow adding external sources. Nobody has had
issues with that plugin before. So if you could give some more information
about your system (operating system, Maven version, JDK version, etc.),
maybe we could come up with a solution.

> Another example, KWRLProgramParserBase[1] does not compile, because there's
> no code for ParseException and KWRLProgramParser.
>

In the reasoner, as well as for ldpath, some code is dynamically generated
by the javacc-maven-plugin. Maybe you have issues with that plugin too?

I guess having the stack traces would help to debug those issues...

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello,

For example, marmotta-rio-rss lacks the dependency jars for importing
the org.rometools and com.sun.syndication packages. Is there something wrong
with the pom of marmotta-rio-rss?

Another example: KWRLProgramParserBase [1] does not compile, because there's
no code for ParseException and KWRLProgramParser.

yours,
junyue

[1]
https://github.com/apache/marmotta/blob/develop/libraries/kiwi/kiwi-reasoner/src/main/java/org/apache/marmotta/kiwi/reasoner/parser/KWRLProgramParserBase.java

On Fri, May 29, 2015 at 2:14 PM, Sergio Fernández <wi...@apache.org> wrote:

> On Fri, May 29, 2015 at 5:10 AM, Junyue Wang <ju...@gmail.com> wrote:
> >
> > I've forked apache/marmotta here [1] and added you in my collaborators
> > list.
> > I also cloned the code from github. But it seems some of the maven
> modules
> > in the develop branch do not compile, e.g. marmotta-commons and
> > marmotta-rio-rss. Is that the right situation now?
>
>
> If you could provide some more details maybe we can support you on getting
> the build work...
>
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernandez@redlink.co
> w: http://redlink.co
>

Re: [GSoC2015] list of accepted projects

Posted by Sergio Fernández <wi...@apache.org>.

On Fri, May 29, 2015 at 5:10 AM, Junyue Wang <ju...@gmail.com> wrote:
>
> I've forked apache/marmotta here [1] and added you in my collaborators
> list.
> I also cloned the code from github. But it seems some of the maven modules
> in the develop branch do not compile, e.g. marmotta-commons and
> marmotta-rio-rss. Is that the right situation now?


If you could provide some more details, maybe we can help you get
the build working...


-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernandez@redlink.co
w: http://redlink.co

Re: [GSoC2015] list of accepted projects

Posted by Junyue Wang <ju...@gmail.com>.

Hello Peter,

I've forked apache/marmotta here [1] and added you to my collaborators list.
I also cloned the code from GitHub. But it seems some of the Maven modules
in the develop branch do not compile, e.g. marmotta-commons and
marmotta-rio-rss. Is that expected at the moment?

yours,
hello

[1] https://github.com/junyuew/marmotta



On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <an...@gmail.com>
wrote:

> Hi Junjue,
>
> It will be simplest to track if you fork the Marmotta repository at
> GitHub and create a branch named "MARMOTTA-593".
>
> Add me as a collaborator to the GitHub repository. My GitHub id is
> "ansell".
>
> The collaborators list for my fork is at:
>
> https://github.com/ansell/marmotta/settings/collaboration
>
> When you fork it, you can replace "ansell" with your GitHub id and use
> that page to add me to the list of collaborators.
>
> Yes, the code will be merged to Marmotta in the end.
>
> You should create a new module inside of marmotta-sesame-tools named
> "marmotta-rio-rdfht"
>
>
> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
>
> You will also need to add a format constant into marmotta-rio-api as a
> new folder in the following directory, similar to the current 3
> folders there:
>
>
> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
>
> Cheers,
>
> Peter
>
>
> Cheers,
>
> Peter
>
> On 16 May 2015 at 22:19, Junyue Wang <ju...@gmail.com> wrote:
> > Hello Sergio, Peter,
> >
> > It's my honor to be a GSoC student. I appreciate your help for the
> comments
> > of the project proposal.
> > I read the proposed methodology you pointed out. But it seems my project
> is
> > only related to Sesame and RDF HDT, without touching the code base of
> > Marmotta. Should I fork Marmotta in github, or start a new repository
> there?
> > Will my code be merged into Marmotta in the end? If so, which module of
> > Marmotta?
> >
> > yours,
> > junyue
> >
> > On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org>
> wrote:
> >
> >> Hi Peter,
> >>
> >> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <an...@gmail.com>
> >> wrote:
> >>>
> >>> Those guidelines look great to me, especially the suggestion about the
> >>> branch name including the Jira issue, which I have found very useful
> >>> in all of my git-based projects. In the RDF/HDT case, and possibly in
> >>> the GeoSPARQL case, the contributed code could be in the form of a new
> >>> module, so there won't be much interference with the rest of the
> >>> codebase during that time. However, it is still useful to regularly
> >>> merge the "develop" branch into each of the branches to keep up to
> >>> date and reduce the number of merge conflicts occurring near the end
> >>> when the students will be rushing to complete the project.
> >>
> >>
> >> Great you like it, Peter :-)
> >>
> >> I expect less merge conflicts, nevertheless it's a more concrete
> library;
> >> with the GeoSPARQL project that workflow is much more important.
> >>
> >> I've just have one concern about the documentation. Last year I had
> >> formatting issues bringing that documentation into the wiki (MoinMoin
> >> syntax is not markdown, unfortunately). Do you think is better to do it
> >> directly in the wiki?
> >>
> >> I'd love to hear comments from our students, after all you're the ones
> who
> >> need to follow that proposed methodology.
> >>
> >> Cheers,
> >>
> >> --
> >> Sergio Fernández
> >> Partner Technology Manager
> >> Redlink GmbH
> >> m: +43 6602747925
> >> e: sergio.fernandez@redlink.co
> >> w: http://redlink.co
> >>
>

Re: [GSoC2015] list of accepted projects

Posted by Peter Ansell <an...@gmail.com>.

Hi Junyue,

It will be simplest to track if you fork the Marmotta repository at
GitHub and create a branch named "MARMOTTA-593".

Add me as a collaborator to the GitHub repository. My GitHub id is "ansell".

The collaborators list for my fork is at:

https://github.com/ansell/marmotta/settings/collaboration

When you fork it, you can replace "ansell" with your GitHub id and use
that page to add me to the list of collaborators.

Yes, the code will be merged to Marmotta in the end.

You should create a new module inside of marmotta-sesame-tools named
"marmotta-rio-rdfht"

https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools

You will also need to add a format constant into marmotta-rio-api as a
new folder in the following directory, similar to the current 3
folders there:

https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio

Cheers,

Peter


On 16 May 2015 at 22:19, Junyue Wang <ju...@gmail.com> wrote:
> Hello Sergio, Peter,
>
> It's my honor to be a GSoC student. I appreciate your help for the comments
> of the project proposal.
> I read the proposed methodology you pointed out. But it seems my project is
> only related to Sesame and RDF HDT, without touching the code base of
> Marmotta. Should I fork Marmotta in github, or start a new repository there?
> Will my code be merged into Marmotta in the end? If so, which module of
> Marmotta?
>
> yours,
> junyue
>
> On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <wi...@apache.org> wrote:
>
>> Hi Peter,
>>
>> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <an...@gmail.com>
>> wrote:
>>>
>>> Those guidelines look great to me, especially the suggestion about the
>>> branch name including the Jira issue, which I have found very useful
>>> in all of my git-based projects. In the RDF/HDT case, and possibly in
>>> the GeoSPARQL case, the contributed code could be in the form of a new
>>> module, so there won't be much interference with the rest of the
>>> codebase during that time. However, it is still useful to regularly
>>> merge the "develop" branch into each of the branches to keep up to
>>> date and reduce the number of merge conflicts occurring near the end
>>> when the students will be rushing to complete the project.
>>
>>
>> Great you like it, Peter :-)
>>
>> I expect less merge conflicts, nevertheless it's a more concrete library;
>> with the GeoSPARQL project that workflow is much more important.
>>
>> I've just have one concern about the documentation. Last year I had
>> formatting issues bringing that documentation into the wiki (MoinMoin
>> syntax is not markdown, unfortunately). Do you think is better to do it
>> directly in the wiki?
>>
>> I'd love to hear comments from our students, after all you're the ones who
>> need to follow that proposed methodology.
>>
>> Cheers,
>>
>> --
>> Sergio Fernández
>> Partner Technology Manager
>> Redlink GmbH
>> m: +43 6602747925
>> e: sergio.fernandez@redlink.co
>> w: http://redlink.co
>>