You are viewing a plain text version of this content. The canonical link for it is here.

Posted to oak-dev@jackrabbit.apache.org by Alex Parvulescu <al...@gmail.com> on 2014/08/22 16:31:10 UTC

NodeStore#checkpoint api reevaluation

Hi,

Following OAK-2039 there was a discussion around the current design of the
#checkpoint apis. [0]

It looks a bit confusing that you can call the apis to create a checkpoint
and get back a reference but when retrieving it, it might not exist, even
if the calls are back to back.
With OAK-2039 I've added some warning logs when a checkpoint cannot be
created but a ref is still returned, to understand if this is a system load
problem, or something more profound.

I believe that nobody has any issues with the #retrieve method, all the
confusion is really about the #checkpoint parts, currently marked as
'@Nonnull'.

Alternatives mentioned are
 - return null if the checkpoint was not created
 - throw en exception

I vote -0 for the change, I believe that making this more complicated than
it needs to be (more null checks, or a try/catch) doesn't really benefit
anybody.

If there are thoughts around how this should change, please feel free to
join in.

best,
alex


[0]
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/state/NodeStore.java#L124

Re: NodeStore#checkpoint api reevaluation

Posted by Davide Giannella <da...@apache.org>.

On 25/08/2014 08:35, Michael Dürig wrote:
>
>
> On 22.8.14 4:31 , Alex Parvulescu wrote:
>> Hi,
>>
>> Following OAK-2039 there was a discussion around the current design
>> of the
>> #checkpoint apis. [0]
>>
>> It looks a bit confusing that you can call the apis to create a
>> checkpoint
>> and get back a reference but when retrieving it, it might not exist,
>> even
>> if the calls are back to back.
>
> Reading the Javadoc carefully this is to be expected. However I think
> this could be improved. Either by making the Javadoc for #checkpoint
> more explicit about it or by reflecting it in the return value.
>
> Instead of returning null for the later option we could also return a
> constant value representing a not available checkpoint. With that
> client code wouldn't need to change but could check the returned value
> if desired.
>
Don't know if I'm understanding you correctly but I think that either we
return null or a proper object with error codes/messages. Returning an
error as String and having the client coding something like `if
(NodeStore.CHECKPOINT_ERROR.equals(checkpoint))` doesn't add any real
value versus `if (checkpoint == null)`.

Davide

Re: NodeStore#checkpoint api reevaluation

Posted by Michael Dürig <md...@apache.org>.


On 22.8.14 4:31 , Alex Parvulescu wrote:
> Hi,
>
> Following OAK-2039 there was a discussion around the current design of the
> #checkpoint apis. [0]
>
> It looks a bit confusing that you can call the apis to create a checkpoint
> and get back a reference but when retrieving it, it might not exist, even
> if the calls are back to back.

Reading the Javadoc carefully this is to be expected. However I think 
this could be improved. Either by making the Javadoc for #checkpoint 
more explicit about it or by reflecting it in the return value.

Instead of returning null for the later option we could also return a 
constant value representing a not available checkpoint. With that client 
code wouldn't need to change but could check the returned value if desired.

Michael

> With OAK-2039 I've added some warning logs when a checkpoint cannot be
> created but a ref is still returned, to understand if this is a system load
> problem, or something more profound.
>
> I believe that nobody has any issues with the #retrieve method, all the
> confusion is really about the #checkpoint parts, currently marked as
> '@Nonnull'.
>
> Alternatives mentioned are
>   - return null if the checkpoint was not created
>   - throw en exception
>
> I vote -0 for the change, I believe that making this more complicated than
> it needs to be (more null checks, or a try/catch) doesn't really benefit
> anybody.
>
> If there are thoughts around how this should change, please feel free to
> join in.
>
> best,
> alex
>
>
> [0]
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/state/NodeStore.java#L124
>

Re: NodeStore#checkpoint api reevaluation

Posted by Davide Giannella <da...@apache.org>.

On 28/08/2014 09:02, Thomas Mueller wrote:
> Hi,
>
> I think we need to decide whether "not beeing able to create checkpoint"
> is an important and common case or not.
>
> * If it's an common case (if it uses tryAcquire which may or may not
> work), then I would prefer "tryCheckpoint" with a possible return value of
> null.
>
> * If it's not a common case, then throwing an exception would be fine.
>
> Or maybe we need both "tryCheckpoint" and "checkpoint"?
>
> But using "checkpoint" as the method name, and using a "special return
> value" if it didn't work, that's dangerous, as it's so easy (specially for
> new developers) to forget to check the return value. The Java API has some
> similarly dangerous methods, for example InputStream.read(byte[] b...)
> which may not read fully, that's an example on what to avoid I think.
>
+1 on everything for me. :)

D.

Re: NodeStore#checkpoint api reevaluation

Posted by Thomas Mueller <mu...@adobe.com>.

Hi,

I think we need to decide whether "not beeing able to create checkpoint"
is an important and common case or not.

* If it's an common case (if it uses tryAcquire which may or may not
work), then I would prefer "tryCheckpoint" with a possible return value of
null.

* If it's not a common case, then throwing an exception would be fine.

Or maybe we need both "tryCheckpoint" and "checkpoint"?

But using "checkpoint" as the method name, and using a "special return
value" if it didn't work, that's dangerous, as it's so easy (specially for
new developers) to forget to check the return value. The Java API has some
similarly dangerous methods, for example InputStream.read(byte[] b...)
which may not read fully, that's an example on what to avoid I think.

Regards,
Thomas

On 27/08/14 18:09, "Jukka Zitting" <ju...@zitting.name> wrote:

>Hi,
>
>2014-08-22 10:31 GMT-04:00 Alex Parvulescu <al...@gmail.com>:
>> It looks a bit confusing that you can call the apis to create a
>>checkpoint
>> and get back a reference but when retrieving it, it might not exist,
>>even
>> if the calls are back to back.
>
>I'd call that a bug, as the behavior clearly breaks the documented
>contract of the method.
>
>> Alternatives mentioned are
>>  - return null if the checkpoint was not created
>>  - throw en exception
>
>Throwing an exception seems like the correct solution here, as not
>being able to create a checkpoint is an exceptional situation, much
>like when a commit fails for similar reasons.
>
>> I vote -0 for the change, I believe that making this more complicated
>>than
>> it needs to be (more null checks, or a try/catch) doesn't really benefit
>> anybody.
>
>I think there's more extra complication in having to deal with a
>potentially non-existent return value.
>
>BR.
>
>Jukka Zitting

Re: NodeStore#checkpoint api reevaluation

Posted by Jukka Zitting <ju...@zitting.name>.

Hi,

2014-08-22 10:31 GMT-04:00 Alex Parvulescu <al...@gmail.com>:
> It looks a bit confusing that you can call the apis to create a checkpoint
> and get back a reference but when retrieving it, it might not exist, even
> if the calls are back to back.

I'd call that a bug, as the behavior clearly breaks the documented
contract of the method.

> Alternatives mentioned are
>  - return null if the checkpoint was not created
>  - throw en exception

Throwing an exception seems like the correct solution here, as not
being able to create a checkpoint is an exceptional situation, much
like when a commit fails for similar reasons.

> I vote -0 for the change, I believe that making this more complicated than
> it needs to be (more null checks, or a try/catch) doesn't really benefit
> anybody.

I think there's more extra complication in having to deal with a
potentially non-existent return value.

BR.

Jukka Zitting

Re: NodeStore#checkpoint api reevaluation

Posted by Thomas Mueller <mu...@adobe.com>.

Hi,

I would prefer if the "checkpoint" method is renamed to "tryCheckpoint" or
similar, if it is a "best effort" method (which seems to be the case, as
it uses commitSemaphore.tryAcquire()).
 
If a checkpoint can't be created, then I would return null. This is the
same as FileChannel.tryLock (returns null if the lock can't be acquired).
>From a user point of view, it would be strange if FileChannel.tryLock
returns a "special case" non-null value, or returns a lock that can't be
released...

Regards,
Thomas





On 25/08/14 10:17, "Marcel Reutegger" <mr...@adobe.com> wrote:

>Hi,
>
>On 22/08/14 16:31, "Alex Parvulescu" <al...@gmail.com> wrote:
>>Following OAK-2039 there was a discussion around the current design of
>>the
>>#checkpoint apis. [0]
>>
>>It looks a bit confusing that you can call the apis to create a
>>checkpoint
>>and get back a reference but when retrieving it, it might not exist, even
>>if the calls are back to back.
>>With OAK-2039 I've added some warning logs when a checkpoint cannot be
>>created but a ref is still returned, to understand if this is a system
>>load
>>problem, or something more profound.
>
>what is the reason the SegmentNodeStore does a
>commitSemaphore.tryAcquire()
>instead of a commitSemaphore.acquire() like in SegmentNodeStore.merge()?
>
>>I believe that nobody has any issues with the #retrieve method, all the
>>confusion is really about the #checkpoint parts, currently marked as
>>'@Nonnull'.
>>
>>Alternatives mentioned are
>> - return null if the checkpoint was not created
>> - throw en exception
>>
>>I vote -0 for the change, I believe that making this more complicated
>>than
>>it needs to be (more null checks, or a try/catch) doesn't really benefit
>>anybody.
>
>I think we should improve it somehow because I find the current behaviour
>quite confusing. The current implementation of
>SegmentNodeStore.checkpoint()
>IMO violates the contract. It may return a string reference to a
>checkpoint
>which was never created and obviously won't be valid for the requested
>lifetime.
>
>In my view, a client should be able to detect this in a simple way. Right
>now you would have to call retrieve() to find out if checkpoint() actually
>worked.
>
>Returning a null value works better if we specify under what conditions
>no checkpoint can be created. After all a client would have to implement
>some code in response to a null value. E.g. should it retry later, because
>the checkpoint cannot be created when the system is under load? This would
>be a good fit if we keep the current implementation in SegmentNodeStore.
>
>An exception works better if we say an implementation should always be
>able to create a checkpoint and only fail if it cannot perform the
>operation because of e.g. an underlying IOException.
>
>Regards
> Marcel
>

Re: NodeStore#checkpoint api reevaluation

Posted by Marcel Reutegger <mr...@adobe.com>.

Hi,

On 22/08/14 16:31, "Alex Parvulescu" <al...@gmail.com> wrote:
>Following OAK-2039 there was a discussion around the current design of the
>#checkpoint apis. [0]
>
>It looks a bit confusing that you can call the apis to create a checkpoint
>and get back a reference but when retrieving it, it might not exist, even
>if the calls are back to back.
>With OAK-2039 I've added some warning logs when a checkpoint cannot be
>created but a ref is still returned, to understand if this is a system
>load
>problem, or something more profound.

what is the reason the SegmentNodeStore does a commitSemaphore.tryAcquire()
instead of a commitSemaphore.acquire() like in SegmentNodeStore.merge()?

>I believe that nobody has any issues with the #retrieve method, all the
>confusion is really about the #checkpoint parts, currently marked as
>'@Nonnull'.
>
>Alternatives mentioned are
> - return null if the checkpoint was not created
> - throw en exception
>
>I vote -0 for the change, I believe that making this more complicated than
>it needs to be (more null checks, or a try/catch) doesn't really benefit
>anybody.

I think we should improve it somehow because I find the current behaviour
quite confusing. The current implementation of
SegmentNodeStore.checkpoint()
IMO violates the contract. It may return a string reference to a checkpoint
which was never created and obviously won't be valid for the requested
lifetime.

In my view, a client should be able to detect this in a simple way. Right
now you would have to call retrieve() to find out if checkpoint() actually
worked.

Returning a null value works better if we specify under what conditions
no checkpoint can be created. After all a client would have to implement
some code in response to a null value. E.g. should it retry later, because
the checkpoint cannot be created when the system is under load? This would
be a good fit if we keep the current implementation in SegmentNodeStore.

An exception works better if we say an implementation should always be
able to create a checkpoint and only fail if it cannot perform the
operation because of e.g. an underlying IOException.

Regards
 Marcel