Posted to dev@marmotta.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/06/08 02:57:00 UTC

[jira] [Commented] (MARMOTTA-593) RDF HDT implementation for Sesame RIO

    [ https://issues.apache.org/jira/browse/MARMOTTA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576526#comment-14576526 ] 

ASF GitHub Bot commented on MARMOTTA-593:
-----------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/marmotta/pull/12#discussion_r31883142
  
    --- Diff: commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio/rdfhdt/RDFHDTFormat.java ---
    @@ -0,0 +1,20 @@
    +package org.apache.marmotta.commons.sesame.rio.rdfhdt;
    +
    +import java.util.Arrays;
    +
    +import org.openrdf.rio.RDFFormat;
    +
    +/**
    + * HDT (Header, Dictionary, Triples) is a compact data structure and binary
    + * serialization format for RDF that keeps big datasets compressed to save space
    + * while maintaining search and browse operations without prior decompression.
    + * <p/>
    + * Author: Junyue Wang
    + */
    +public class RDFHDTFormat {
    +
    +	public static final RDFFormat FORMAT = new RDFFormat("RDFHDT",
    +			Arrays.asList("application/rdf+hdt"), null, Arrays.asList("hdt"),
    --- End diff --
    
    The application/rdf+hdt media type is not standardised, but feel free to keep using it for now.
    
    The W3C submission has the strange idea that a media type can be a combination of other media types, which leaves no single top-level type to reference, so this one is as good as any.


> RDF HDT implementation for Sesame RIO
> -------------------------------------
>
>                 Key: MARMOTTA-593
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-593
>             Project: Marmotta
>          Issue Type: Task
>          Components: KiWi Triple Store
>            Reporter: Sergio Fernández
>              Labels: gsoc, gsoc2015, hdt, java, linkeddata, rdf, sesame
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> [RDF HDT|http://www.rdfhdt.org] is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression. This makes it an ideal format for storing and sharing RDF datasets on the Web.
> Currently the [Java implementation|http://www.rdfhdt.org/manual-of-the-java-hdt-library/] only provides bindings for Jena RIOT, under a license that does not allow it to be integrated into the main Sesame codebase or any Apache codebase.
> The idea is to write an Apache-licensed implementation of RDF HDT from scratch that supports the [Sesame RIO|http://rdf4j.org/sesame/2.8/apidocs/org/openrdf/rio/Rio.html] infrastructure (RDFParser/RDFWriter/RDFHandler).
> The implementation requires good knowledge of Java programming, plus a basic understanding of parser concepts and of the RDF and HDT data models.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)