Posted to oak-issues@jackrabbit.apache.org by "GianMaria Romanato (JIRA)" <ji...@apache.org> on 2017/05/30 16:12:04 UTC

[jira] [Created] (OAK-6279) Recovery Agent fails on PostgreSQL 9.5+ on Linux

GianMaria Romanato created OAK-6279:
---------------------------------------

             Summary: Recovery Agent fails on PostgreSQL 9.5+ on Linux
                 Key: OAK-6279
                 URL: https://issues.apache.org/jira/browse/OAK-6279
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core
    Affects Versions: 1.6.1
         Environment: RDB store on PostgreSQL 9.5 and 9.6 running on Linux
            Reporter: GianMaria Romanato


When Jackrabbit Oak crashes, it uses the LastRevRecoveryAgent to perform recovery at the next restart.

The recovery procedure does not work when RDB is used and the database is PostgreSQL running on a Linux machine. However, recovery works as expected when RDB is used and the database is PostgreSQL running on Windows or MacOS.

After some investigation I noticed that on Windows and MacOS newly created PostgreSQL databases default to "C" collation, while on Linux "UTF-8" collation is the default. 

Based on my tests, the recovery agent also works on Linux if the database is created with its collation set to "C". In other words, the recovery agent fails unless the database collation is "C".

According to the official PostgreSQL documentation:
'The C and POSIX collations both specify "traditional C" behavior, in which only the ASCII letters "A" through "Z" are treated as letters, and sorting is done strictly by character code byte values.'
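
To illustrate why the collation can change the order of keys, here is a minimal Java sketch. It is not Oak code: the class name CollationDemo and both helper methods are made up for this example, and java.text.Collator for Locale.US only stands in for a locale-aware database collation (it is not identical to the collation PostgreSQL gets from the operating system). It only shows that ordering by character code and ordering by locale rules can disagree on the same strings.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of Jackrabbit Oak.
class CollationDemo {

    // "C"-style ordering: compare strictly by character code values.
    // For pure-ASCII strings this matches byte-wise comparison.
    static List<String> sortByCharacterCode(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(String::compareTo);
        return sorted;
    }

    // Locale-aware ordering: letters are compared alphabetically first,
    // so case is only a secondary difference.
    static List<String> sortByLocale(List<String> keys, Locale locale) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "Zebra");

        // 'Z' (code 90) sorts before 'a' (code 97) by character code.
        System.out.println("character code: " + sortByCharacterCode(keys));

        // Alphabetically, "apple" sorts before "Zebra".
        System.out.println("locale (en_US): " + sortByLocale(keys, Locale.US));
    }
}
```

Any code that assumes one of these orderings (for example, a range condition on a key column) can return different rows when the database uses the other one. Whether this is what happens in the recovery agent is an assumption that I have not verified.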

On Linux, if the collation is not "C", the recovery fails with the error message:

OakMerge0004: Following exceptions occurred during the bulk update operations: [org.apache.jackrabbit.oak.plugins.document.ConflictException:

After stepping through the code a bit, I believe the problem is related to the error:

"both setting and then reading of clusterId failed"

and it originates in method:
org.apache.jackrabbit.oak.plugins.identifier.ClusterRepositoryInfo.getOrCreateId(NodeStore)

where a CommitFailedException is caught and the flow proceeds to the lines of code preceded by the comment "// this should not happen".


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)