Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2014/08/18 08:31:19 UTC
[jira] [Updated] (OAK-2039) SegmentNodeStore might not create a checkpoint
[ https://issues.apache.org/jira/browse/OAK-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chetan Mehrotra updated OAK-2039:
---------------------------------
Attachment: OAK-2039-alex.patch
Attaching a [patch|^OAK-2039-alex.patch] by [~alexparvulescu] that adds debug/warn logging when this issue occurs.
> SegmentNodeStore might not create a checkpoint
> ----------------------------------------------
>
> Key: OAK-2039
> URL: https://issues.apache.org/jira/browse/OAK-2039
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segmentmk
> Reporter: Chetan Mehrotra
> Priority: Minor
> Fix For: 1.1
>
> Attachments: OAK-2039-alex.patch
>
>
> As per [~edivad], an invocation of {{SegmentNodeStore.checkpoint(long)}} might return a checkpoint name even though the checkpoint has not been created.
> Starting from
> {code:java|title=AsyncIndexUpdate.java#235}
> // there are some recent changes, so let's create a new checkpoint
> String afterCheckpoint = store.checkpoint(lifetime);
> NodeState after = store.retrieve(afterCheckpoint);
> if (after == null) {
>     log.warn("Unable to retrieve newly created checkpoint {},"
>             + " skipping the {} index update", afterCheckpoint, name);
>     return;
> }
> String checkpointToRelease = afterCheckpoint;
> try {
>     updateIndex(before, beforeCheckpoint, after, afterCheckpoint);
>     // the update succeeded, i.e. it no longer fails
> {code}
> and then
> {code:java|title=SegmentNodeStore.java#205}
> public synchronized String checkpoint(long lifetime) {
>     checkArgument(lifetime > 0);
>     String name = UUID.randomUUID().toString();
>     long now = System.currentTimeMillis();
>     // try 5 times
>     for (int i = 0; i < 5; i++) {
>         if (commitSemaphore.tryAcquire()) {
>             try {
>                 refreshHead();
>                 SegmentNodeState state = head.get();
>                 SegmentNodeBuilder builder = state.builder();
>                 NodeBuilder checkpoints = builder.child("checkpoints");
>                 for (String n : checkpoints.getChildNodeNames()) {
>                     NodeBuilder cp = checkpoints.getChildNode(n);
>                     PropertyState ts = cp.getProperty("timestamp");
>                     if (ts == null
>                             || ts.getType() != Type.LONG
>                             || now > ts.getValue(Type.LONG)) {
>                         cp.remove();
>                     }
>                 }
>                 NodeBuilder cp = checkpoints.child(name);
>                 cp.setProperty("timestamp", now + lifetime);
>                 cp.setChildNode(ROOT, state.getChildNode(ROOT));
>                 SegmentNodeState newState = builder.getNodeState();
>                 if (store.setHead(state, newState)) {
>                     refreshHead();
>                     return name;
>                 }
>             } finally {
>                 commitSemaphore.release();
>             }
>         }
>     }
>     return name;
> }
> {code}
> We can see that it always returns a checkpoint name even if it fails to create the checkpoint (because of the {{@Nonnull}} contract, I would say). But if it fails to acquire the lock 5 times (with no sleep in between?), it fails silently and thus returns a checkpoint that is not valid.
> This might cause indexing to not work properly, as it relies on being able to access a previous version of the content through the returned checkpoint.
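The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not Oak code: {{CheckpointSketch}}, {{holdCommitPermit()}} and {{releaseCommitPermit()}} are hypothetical names, and a plain map stands in for the "checkpoints" node. It keeps the control flow of the quoted method (five immediate {{tryAcquire()}} attempts, then an unconditional {{return name}}) to show that the caller receives a name that cannot be retrieved while another commit holds the semaphore.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a node store; this is NOT Oak's SegmentNodeStore.
class CheckpointSketch {
    private final Semaphore commitSemaphore = new Semaphore(1);
    // Assumption: a map of name -> expiry time replaces the "checkpoints" node.
    private final Map<String, Long> checkpoints = new ConcurrentHashMap<>();

    // Same shape as the quoted method: five immediate attempts, then the
    // name is returned whether or not anything was stored.
    String checkpoint(long lifetime) {
        String name = UUID.randomUUID().toString();
        long now = System.currentTimeMillis();
        for (int i = 0; i < 5; i++) {
            if (commitSemaphore.tryAcquire()) {
                try {
                    checkpoints.put(name, now + lifetime);
                    return name;
                } finally {
                    commitSemaphore.release();
                }
            }
        }
        return name; // reached only when no attempt obtained the permit
    }

    // Returns the expiry timestamp, or null if no such checkpoint exists.
    Long retrieve(String name) {
        return checkpoints.get(name);
    }

    // Simulate a long-running commit that holds the only permit.
    void holdCommitPermit() {
        commitSemaphore.acquireUninterruptibly();
    }

    void releaseCommitPermit() {
        commitSemaphore.release();
    }

    public static void main(String[] args) {
        CheckpointSketch store = new CheckpointSketch();

        store.holdCommitPermit();
        String dangling = store.checkpoint(1000);
        // prints "while busy: false"
        System.out.println("while busy: " + (store.retrieve(dangling) != null));

        store.releaseCommitPermit();
        String valid = store.checkpoint(1000);
        // prints "when idle: true"
        System.out.println("when idle: " + (store.retrieve(valid) != null));
    }
}
```

In the sketch the five attempts run back to back, so they all fail for as long as the permit is held, and the only way the caller can notice is a later {{retrieve}} returning null, which is the situation the {{log.warn}} in {{AsyncIndexUpdate}} reports.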
--
This message was sent by Atlassian JIRA
(v6.2#6252)