Posted to users@jackrabbit.apache.org by Francisco Carriedo Scher <fc...@gmail.com> on 2012/03/20 21:59:25 UTC

JR Future idea

Hi there,

I wrote two posts last week:

http://mail-archives.apache.org/mod_mbox/jackrabbit-users/201203.mbox/ajax/%3CCAFWtOcMMPZC6zn9AGX3PBUM2%2BmAiEcibay%3DxfL2v6QDtzCoGYw%40mail.gmail.com%3E

and

http://mail-archives.apache.org/mod_mbox/jackrabbit-users/201203.mbox/ajax/%3CCAFWtOcO3dpR0AQd3gpiuasa2d7MO45fxGn3YVXR3YMZt2iEyyw%40mail.gmail.com%3E

asking how to avoid transferring big binary values when they are already
present in the repository. I have been inspecting the source code, and I
must confess that tracing the path the binary data follows is a bit beyond
me... So I tried to approach it from an abstract point of view:

***************************************************************************************************
C: client side
S: server side

C: i want to publish the (large) file with hash 1234, already present?
S: no
THEN TRANSFER TAKES PLACE

C: i want to publish the (large) file with hash 1234, already present?
S: yes
THEN CREATE A LINK
**************************************************************************************************
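
To make the exchange above concrete, here is a minimal in-memory sketch in
Java. DedupStore and all of its methods are hypothetical names made up for
illustration; nothing here is a Jackrabbit API, and a real server would
persist the content rather than keep it in a map:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the server side; NOT part of Jackrabbit. */
final class DedupStore {
    private final Map<String, byte[]> blobs = new HashMap<>(); // hash -> content
    private final Map<String, String> links = new HashMap<>(); // path -> hash
    private int transfers = 0;

    /** "Already present?" question from the client. */
    boolean contains(String hash) {
        return blobs.containsKey(hash);
    }

    /** The expensive step: the content actually crosses the wire. */
    void upload(String hash, byte[] content) {
        blobs.put(hash, content.clone());
        transfers++;
    }

    /** The cheap step: point a path at content that is already stored. */
    void link(String path, String hash) {
        if (!blobs.containsKey(hash)) {
            throw new IllegalArgumentException("unknown hash: " + hash);
        }
        links.put(path, hash);
    }

    /** Client flow: ask first, transfer only if absent, then link. */
    boolean publish(String path, String hash, byte[] content) {
        boolean transferred = false;
        if (!contains(hash)) {
            upload(hash, content);
            transferred = true;
        }
        link(path, hash);
        return transferred;
    }

    /** Returns the content linked at the path, or null if nothing is linked. */
    byte[] read(String path) {
        String hash = links.get(path);
        return hash == null ? null : blobs.get(hash);
    }

    int transferCount() {
        return transfers;
    }
}
```

Publishing the same hash under two different paths should then trigger only
one upload; the second call only creates the link.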

I can imagine lots of open questions about integrating this properly
(efficient methods for asking about the existence, for instance), but I
really need some expert tips to do this... Let's say, an entry point...

By the way, just to clarify: would it be a good (and efficient) solution to
define a custom node type (with some additional properties, one of them
being the hash of the content) and then query the repository normally?
Depending on the query result, I would create an nt:file or an
nt:linkedFile... Would this be a solution?

Thanks for your attention!

Re: JR Future idea

Posted by Francisco Carriedo Scher <fc...@gmail.com>.
Thanks Jukka, great news, it will help for sure!

I will check the extension you described and the code snippet to give it a
try, and I will report back!

Thank you so much for your attention!

2012/3/21 Jukka Zitting <ju...@gmail.com>

> Hi Francisco,
>
> On Tue, Mar 20, 2012 at 9:59 PM, Francisco Carriedo Scher
> <fc...@gmail.com> wrote:
> > I can imagine lots of questions to properly integrate this (efficient
> > methods for asking about the existence, for instance), but i really need
> > some (expertise) tips to do this... Let's say, an entry point...
>
> For now one of the main goals of the data store design was to avoid
> duplicating binaries that already exist in the repository, for example
> when copying or versioning existing content. We didn't put much effort
> into thinking how an external client could also leverage this feature
> like you're suggesting. But I agree that it's a good use case, so
> thanks for following up on this!
>
> As for how to implement this on top of the JCR API, the main thing
> you'd need is a Value reference that matches a given content hash. To
> start with this identifier was completely internal, but in JCR-1892
> [1] we already started exposing it through the
> JackrabbitValue.getContentIdentity() [2] extension. What you'd need
> then is another extension method, for example
> JackrabbitValueFactory.createBinary(String) or something similar, that
> turns a given content hash to a matching binary Value (or returns null
> if a matching value is not found).
>
> With such a method, your client could work roughly like this:
>
>    File file = ...;
>    String hash = computeContentHash(file);
>
>    // Check if the binary already exists in the repository
>    JackrabbitValueFactory factory = ...;
>    Binary binary = factory.createBinary(hash);
>    if (binary == null) {
>        // Doesn't exist yet, so stream it to the repository
>        binary = factory.createBinary(new FileInputStream(file));
>    }
>
>    Node node = ...;
>    node.setProperty(Property.JCR_DATA, binary);
>
> Does this help you forward?
>
> [1] https://issues.apache.org/jira/browse/JCR-1892
> [2] http://jackrabbit.apache.org/api/2.4/org/apache/jackrabbit/api/JackrabbitValue.html#getContentIdentity()
>
> BR,
>
> Jukka Zitting
>

Re: JR Future idea

Posted by Jukka Zitting <ju...@gmail.com>.
Hi Francisco,

On Tue, Mar 20, 2012 at 9:59 PM, Francisco Carriedo Scher
<fc...@gmail.com> wrote:
> I can imagine lots of questions to properly integrate this (efficient
> methods for asking about the existence, for instance), but i really need
> some (expertise) tips to do this... Let's say, an entry point...

So far, one of the main goals of the data store design was to avoid
duplicating binaries that already exist in the repository, for example
when copying or versioning existing content. We didn't put much effort
into thinking how an external client could also leverage this feature
like you're suggesting. But I agree that it's a good use case, so
thanks for following up on this!

As for how to implement this on top of the JCR API, the main thing
you'd need is a Value reference that matches a given content hash. To
start with this identifier was completely internal, but in JCR-1892
[1] we already started exposing it through the
JackrabbitValue.getContentIdentity() [2] extension. What you'd need
then is another extension method, for example
JackrabbitValueFactory.createBinary(String) or something similar, that
turns a given content hash to a matching binary Value (or returns null
if a matching value is not found).

With such a method, your client could work roughly like this:

    File file = ...;
    String hash = computeContentHash(file);

    // Check if the binary already exists in the repository
    JackrabbitValueFactory factory = ...;
    Binary binary = factory.createBinary(hash);
    if (binary == null) {
        // Doesn't exist yet, so stream it to the repository
        binary = factory.createBinary(new FileInputStream(file));
    }

    Node node = ...;
    node.setProperty(Property.JCR_DATA, binary);
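
The computeContentHash helper used above is left undefined. As a rough
sketch, it could stream the file through a message digest and hex-encode the
result. The choice of SHA-1 and of lowercase hex is an assumption here: the
hash has to match whatever the data store actually uses for its content
identity, so check that before relying on it. The class name ContentHash and
the sha1Hex helper are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; the digest algorithm is an assumption (see above). */
final class ContentHash {

    /** Hashes a file without loading it into memory as a whole. */
    static String computeContentHash(File file) {
        try (InputStream in = new FileInputStream(file)) {
            return sha1Hex(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the stream in 8 KB chunks and returns the SHA-1 as lowercase hex. */
    static String sha1Hex(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is unexpected
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Streaming matters for this use case because the files in question are large;
only the fixed-size buffer is held in memory while hashing.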

Does this help you forward?

[1] https://issues.apache.org/jira/browse/JCR-1892
[2] http://jackrabbit.apache.org/api/2.4/org/apache/jackrabbit/api/JackrabbitValue.html#getContentIdentity()

BR,

Jukka Zitting