Posted to users@jackrabbit.apache.org by Manuel López Blasi <lo...@conicet.gov.ar> on 2018/03/16 19:12:07 UTC

Restore Xml Binary File Base64 Encoded BUG (???)

Hi everyone,

Some time ago I wrote about a problem I had with the backup and restore 
functionality.

Context of the problem/Artifact versions:

OS: Linux Mint 18 - 64 BIT

jvm: java-8-openjdk-amd64 (java 1.8 - 64 BIT)

Container: Payara 41 (Glassfish 4)

jackrabbit-core 2.16.1

jackrabbit-ocm 2.0.0

jcr-2.0

jackrabbit-jca-2.16.1 (Application deployed in Payara)

(Same happens with JR core and jca 2.14.4, java 1.7 and Glassfish 3.1)

The export/import methods in Session/Workspace seem to corrupt binary PDF 
files (any kind of file, for that matter):

Backup:
void javax.jcr.Session.exportDocumentView(String absPath, OutputStream 
out, boolean skipBinary, boolean noRecurse) throws IOException, 
PathNotFoundException, RepositoryException

FileOutputStream output = new FileOutputStream(xmlBackupPath);
session.exportDocumentView("/TEST_BINARY", output, ignoreBinary, noRecurse);


Restore:
void javax.jcr.Workspace.importXML(String parentAbsPath, InputStream in, 
int uuidBehavior) throws IOException, VersionException, 
PathNotFoundException, ItemExistsException,ConstraintViolationException, 
InvalidSerializedDataException, LockException,AccessDeniedException, 
RepositoryException

E.g.:
session.getWorkspace().importXML("/",fInput,ImportUUIDBehavior.IMPORT_UUID_COLLISION_REPLACE_EXISTING);

I can export to an XML file with no problems. Binary PDF files are 
encoded in Base64 and put into the XML file.
The problem is that when I import the same XML file, the binary is not 
"decoded" before being put into the Jackrabbit repository again.

After backing up a node, restoring it to the repository and retrieving 
it, it's corrupted
(well, not really: it has just been stored still serialized in Base64 
encoding):

Node previouslySaved = session.getNode("/TEST_BINARY");

Property pdfFile = previouslySaved.getProperty("file");

ByteArrayInputStream in = 
(ByteArrayInputStream)pdfFile.getBinary().getStream();

int cuantos = in.available();
byte[] fileBytes = new byte[cuantos];
in.read(fileBytes);

saveStuffOnDisk(fileBytes,testFolder+"/retrievedFromJRandSavedOnDisk-AFTER-RESTORE.pdf");

This produces a corrupted PDF file.
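
A side note on the retrieval snippet above: it casts the stream to ByteArrayInputStream and sizes the buffer with available(), neither of which is guaranteed for an arbitrary InputStream. To rule that out as a cause, the bytes can be read until EOF instead. This is only a minimal sketch using plain java.io; the class and method names (BinaryStreams, readAll) are made up for illustration and are not part of Jackrabbit or JCR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Jackrabbit or the JCR API.
public class BinaryStreams {

    /** Reads every byte from the stream until EOF, then closes the stream. */
    public static byte[] readAll(InputStream in) {
        try (InputStream src = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            // read() may return fewer bytes than requested, so loop until -1
            while ((n = src.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // wrapped so callers do not have to declare a checked exception
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, the retrieval would become something like byte[] fileBytes = BinaryStreams.readAll(pdfFile.getBinary().getStream()); with no cast needed.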

If you do this before saving it, the PDF is fine:
Base64.decode(fileBytes);
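
To make that workaround concrete, here is a rough sketch of a "decode only if it is still encoded" step. It is an assumption-laden illustration, not a fix for the import itself: it uses java.util.Base64 from the JDK (Java 8+) rather than Jackrabbit's own Base64 class, the names RestoredBinaryFix and decodeIfStillEncoded are hypothetical, and it only recognizes PDF content by its "%PDF" header, so other file types would need their own check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical workaround helper, not part of Jackrabbit or the JCR API.
public class RestoredBinaryFix {

    private static final byte[] PDF_MAGIC = "%PDF".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the PDF header bytes "%PDF". */
    static boolean startsWithPdfMagic(byte[] data) {
        if (data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * If the bytes already look like a PDF, returns them unchanged.
     * Otherwise tries to Base64-decode them and returns the decoded bytes
     * only when the result looks like a PDF; in every other case the
     * original bytes are returned untouched.
     */
    public static byte[] decodeIfStillEncoded(byte[] data) {
        if (startsWithPdfMagic(data)) {
            return data;
        }
        try {
            // The MIME decoder skips line breaks inside the encoded text
            byte[] decoded = Base64.getMimeDecoder().decode(data);
            return startsWithPdfMagic(decoded) ? decoded : data;
        } catch (IllegalArgumentException e) {
            // not valid Base64, so leave the data as it is
            return data;
        }
    }
}
```

In the snippet above this would be called as fileBytes = RestoredBinaryFix.decodeIfStillEncoded(fileBytes); right before saveStuffOnDisk(...).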

So I guess that somewhere inside the import code nothing is decoding what 
was previously saved encoded (I guess decoding should be the expected 
behaviour?).
I tried to debug the code to see where the magic happens for the import 
but had no luck; I guess there must be some callback stuff I couldn't find.
I did find where the encoding happens in the *exportSystemView* method,

in class org.apache.jackrabbit.value.ValueHelper, at line 729:

     public static void serialize(Value value, boolean encodeBlanks, boolean enforceBase64,
                                  Writer writer)
             throws IllegalStateException, IOException, RepositoryException {
         if (value.getType() == PropertyType.BINARY) {
             // binary data, base64 encoding required;
             // the encodeBlanks flag can be ignored since base64-encoded
             // data cannot contain space characters
             InputStream in = value.getStream();
             try {
                 Base64.encode(in, writer); // <-- the encoding happens here
                 // no need to close StringWriter
                 //writer.close();
             } finally {
                 try {
                     in.close();
                 } catch (IOException e) {
                     // ignore
                 }
             }
         } else {
             String textVal = value.getString();
             if (enforceBase64) {
                 byte bytes[] = textVal.getBytes(StandardCharsets.UTF_8);
                 Base64.encode(bytes, 0, bytes.length, writer); // <-- and here
             }
             else {
                 if (encodeBlanks) {
                     // enocde blanks in string
                     textVal = Text.replace(textVal, " ", "_x0020_");
                 }
                 writer.write(textVal);
             }
         }
     }


In summary: export/backup is OK, import/restore is not (it almost works; it 
just needs the decoding before restoring nodes to the repository).
So... am I missing something? Is there any configuration that fixes this? 
Any thoughts?

Thanks in advance,
regards,
Manuel.