Posted to dev@jackrabbit.apache.org by "Tobias Bocanegra (JIRA)" <ji...@apache.org> on 2006/02/21 14:45:24 UTC

[jira] Created: (JCR-325) docview roundtripping does not work with multivalue non-string properties

docview roundtripping does not work with multivalue non-string properties
-------------------------------------------------------------------------

         Key: JCR-325
         URL: http://issues.apache.org/jira/browse/JCR-325
     Project: Jackrabbit
        Type: Bug
  Components: core  
    Versions: 0.9    
 Environment: jackrabbit r379292
    Reporter: Tobias Bocanegra


when exporting a multivalue property with docview, the property values are serialized to a space delimited list in the xml attributes:
for example:

<?xml version="1.0" encoding="UTF-8"?>
.
.
<testNode 
    jcr:primaryType="refTest" 
    refs="b5c12524-5446-4c1a-b024-77f623680271 7b4d4e6f-9515-47d8-a77c-b4beeaf469bc"
/>

the refTest nodetype was:

[refTest] 
- refs (reference) multiple 

importing this docview fails with: javax.jcr.ValueFormatException: not a valid UUID format
this is because the space delimited list is no longer exploded on import. the code that used to do this is commented out:

org.apache.jackrabbit.core.xml.DocViewImportHandler, lines 191 - 200:
/*
                // @todo should attribute value be interpreted as LIST type (i.e. multi-valued property)?
                String[] strings = Text.explode(attrValue, ' ', true);
                propValues = new Value[strings.length];
                for (int j = 0; j < strings.length; j++) {
                    // decode encoded blanks in value
                    strings[j] = Text.replace(strings[j], "_x0020_", " ");
                    propValues[j] = InternalValue.create(strings[j]);
                }
*/

i haven't tested, but i assume this also fails for all other non-string property types.
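
The failure can be illustrated outside Jackrabbit. The sketch below is not the importer code: java.util.UUID stands in for the reference value parsing (an assumption), and RefAttributeDemo is a hypothetical class name. It only shows that the unsplit attribute value is rejected as a UUID, while the whitespace-separated tokens parse individually.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Illustration only: java.util.UUID is a stand-in for Jackrabbit's own
// reference value parsing; RefAttributeDemo is a hypothetical class.
final class RefAttributeDemo {

    /** Returns true if the whole string parses as a single UUID. */
    static boolean isSingleUuid(String value) {
        try {
            UUID.fromString(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Splits on whitespace first, then parses each token as a UUID. */
    static List<UUID> parseList(String value) {
        List<UUID> result = new ArrayList<UUID>();
        for (String token : value.trim().split("\\s+")) {
            result.add(UUID.fromString(token));
        }
        return result;
    }

    public static void main(String[] args) {
        String refs = "b5c12524-5446-4c1a-b024-77f623680271 "
                + "7b4d4e6f-9515-47d8-a77c-b4beeaf469bc";
        System.out.println("parses as one value: " + isSingleUuid(refs));
        System.out.println("values after splitting: " + parseList(refs).size());
    }
}
```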

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12369617 ] 

Stefan Guggisberg commented on JCR-325:
---------------------------------------

the multi-value attribute is one of the differentiators being used to identify applicable property definitions.
nt:unstructured for example defines 2 residual properties that only differ in the multi-value attribute.
your approach would not be able to unambiguously determine the 'correct' definition because the 
'multi-valued' information is lost in document view.

btw: i just realized that some related test-cases fail after my changes. i'll have to find out if they 
do so legitimately or whether they have to be adapted.

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg



[jira] Updated: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Jukka Zitting updated JCR-325:
------------------------------

    Attachment: xml-refactoring.patch

I've gone through the importer code and come up with a plan on how to implement the interpretation described above. The current importer design makes it quite hard to postpone the value parsing decisions until the applicable property definitions are known, which effectively prevents us from implementing the proposed heuristics. To make this easier and to simplify the overall code I'd like to encapsulate the value parsing rules in the PropInfo instances so that the SessionImporter or WorkspaceImporter classes wouldn't need to know anything about where the values came from (sys view or doc view) and how they should be parsed. I could then create separate SysViewPropInfo and DocViewPropInfo subclasses to encapsulate the different value parsing rules of the system view and document view XML encodings. Doing this however requires quite extensive refactoring of the current code, so I'd like to get a confirmation on this plan before I proceed.

The attached patch represents the first steps of this refactoring. Unless I've made a mistake, the patch doesn't yet change any of the existing behaviour (except for the location where some exceptions and log messages are generated), it just refactors the code structure so that it is easier to implement the SysViewPropInfo and DocViewPropInfo classes. The reason for publishing the patch at this point is to separate the structural changes from the behavioural changes and thus make the changes easier to review. I will continue with the behavioural changes if this structural change is approved.

Below is a breakdown of the refactoring steps included in this patch. I did almost all of these changes using the automatic refactoring tools in Eclipse to minimize the chance of accidentally introducing errors in the code. The patched source also passes all unit tests and seems to import both system view and document view files just as before, which makes me quite confident in the quality of the refactoring.

1) Move the following internal classes and interfaces to new files to make the class structure easier to manage:

   * Importer.TextValue -> TextValue
   * Importer.PropInfo -> PropInfo
   * Importer.NodeInfo -> NodeInfo
   * TargetImportHandler.AppendableValue -> AppendableValue
   * TargetImportHandler.BufferedStringValue -> BufferedStringValue
   * TargetImportHandler.StringValue -> StringValue

2) Remove the NodeInfo and PropInfo setters and make the member fields "private final" to enforce their immutability

3) Refactor the TargetImportHandler.disposePropertyValues(PropInfo) method into PropInfo.dispose() to improve encapsulation

4) Move the AppendableValue.dispose() method up to TextValue and implement it as a null method in StringValue to avoid type casts in PropInfo.dispose()

5) Remove the AppendableValue interface to make the class structure simpler, as the SysViewImportHandler can just as well use the BufferedStringValue class directly

6) Add a TextValue.getNamespaceContext() method and corresponding "private final NamespaceResolver nsContext;" fields in StringValue and BufferedStringValue to associate the value instances with the namespace context in which they should be parsed. Remove the "nsContext" argument from Importer.startNode() and use the TextValue.getNamespaceContext() method in SessionImporter and WorkspaceImporter to access the correct namespace context.

7) Refactor the contents of the property iterator loop in WorkspaceImporter.startNode() into PropInfo.apply(NodeState, BatchedItemOperations, NodeTypeRegistry, ReferenceChangeTracker) and the contents of the similar loop in SessionImporter.startNode() into PropInfo.apply(NodeImpl, NamespaceResolver, ReferenceChangeTracker) to simplify the huge startNode() methods and to encapsulate the value parsing logic into PropInfo. This also allows the methods to easily share code through extracted getTargetType() and getApplicablePropertyDef() methods and enables other structural simplifications.

8) Remove the now unneeded PropInfo getters.

9) Refactor the contents of the value iterator loops in the PropInfo.apply() methods to TextValue.getValue(int type, NamespaceResolver resolver) and TextValue.getInternalValue(int type) methods and copy the implementation to StringValue and BufferedStringValue to increase encapsulation. In both cases the implementation can be simplified thanks to local access to the value details.

10) Remove the length(), retrieve(), reader(), and getNamespaceContext() methods from the TextValue interface to ensure proper encapsulation. Those methods can also be fully removed from StringValue, but in BufferedStringValue they still have a purpose due to the more complex storage model. Refactor the JCR_MIXINTYPES branch in DocViewImportContentHandler to use TextValue.getValue() instead of StringValue.retrieve() to get the mixin type names.

Overall these structural changes improve a number of quality metrics. The encapsulation level is higher, the method and class bodies are shorter, and even the total amount of code is slightly lower.
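
As a rough illustration of steps 2 to 4, the sketch below shows how moving dispose() up to the common value interface lets PropInfo release its values without type casts. It is not taken from the patch: the "Sketch" class names, the method bodies, and the assumption that the buffered value holds an in-memory buffer are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for TextValue, StringValue, BufferedStringValue
// and PropInfo; the signatures are assumptions, not the patched classes.
interface TextValueSketch {
    String retrieve();
    void dispose();
}

final class StringValueSketch implements TextValueSketch {
    private final String value;
    StringValueSketch(String value) { this.value = value; }
    public String retrieve() { return value; }
    public void dispose() { /* null method: nothing to release */ }
}

final class BufferedStringValueSketch implements TextValueSketch {
    // Assumed storage model: a growable buffer that is dropped on dispose.
    private StringBuilder buffer = new StringBuilder();

    void append(String chunk) {
        if (buffer == null) throw new IllegalStateException("disposed");
        buffer.append(chunk);
    }
    public String retrieve() {
        if (buffer == null) throw new IllegalStateException("disposed");
        return buffer.toString();
    }
    public void dispose() { buffer = null; }
    boolean isDisposed() { return buffer == null; }
}

final class PropInfoSketch {
    // "private final" fields and no setters, as in step 2.
    private final String name;
    private final List<TextValueSketch> values;

    PropInfoSketch(String name, List<? extends TextValueSketch> values) {
        this.name = name;
        this.values = Collections.unmodifiableList(
                new ArrayList<TextValueSketch>(values));
    }

    /** Releases all values; no instanceof checks or casts are needed. */
    void dispose() {
        for (TextValueSketch value : values) {
            value.dispose();
        }
    }

    public static void main(String[] args) {
        BufferedStringValueSketch buffered = new BufferedStringValueSketch();
        buffered.append("some long value");
        PropInfoSketch info = new PropInfoSketch("example",
                Arrays.asList(new StringValueSketch("short"), buffered));
        info.dispose();
        System.out.println("buffered value disposed: " + buffered.isDisposed());
    }
}
```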

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting
>  Attachments: xml-refactoring.patch


[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12369541 ] 

Jukka Zitting commented on JCR-325:
-----------------------------------

Couldn't this issue be handled more gracefully by postponing the space-splitting of the attribute value to when we have mapped the attribute name to a matching property definition? If the defined property is multivalued, then we'd split the attribute value and decode the _xFFFF_ encodings. Otherwise we'd just use the value as-is. Using a special marker character is a bit kludgy.
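
A minimal sketch of this idea, assuming the importer keeps the raw attribute string until it knows whether the matched definition is multi-valued. DeferredDocViewValue and interpret() are hypothetical names, and for brevity only the _x0020_ escape is decoded rather than the general _xFFFF_ form.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: holds the raw attribute value and interprets it
// only once the caller knows the property definition.
final class DeferredDocViewValue {
    private final String raw;

    DeferredDocViewValue(String raw) {
        this.raw = raw;
    }

    /**
     * Splits and decodes only for multi-valued definitions; a
     * single-valued definition gets the attribute value as-is.
     */
    List<String> interpret(boolean multiValued) {
        List<String> values = new ArrayList<String>();
        if (!multiValued) {
            values.add(raw);
            return values;
        }
        for (String token : raw.trim().split(" +")) {
            if (!token.isEmpty()) {
                // Simplification: only the encoded blank is decoded here.
                values.add(token.replace("_x0020_", " "));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        DeferredDocViewValue value =
                new DeferredDocViewValue("john_x0020_doe donald_x0020_duck");
        System.out.println(value.interpret(true));
        System.out.println(value.interpret(false));
    }
}
```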

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg



[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12371104 ] 

Jukka Zitting commented on JCR-325:
-----------------------------------

> in your patch you just pass a reference to the ns context in the constructor.
> the ns context however is mutable and could, theoretically at least, change
> after a *StringValue has been created and before getValue has been called. 

Yes, I thought of this issue as well. The use of the ns context in the patch is temporally identical to the current code in svn, as the *StringValue instances are no longer used after the Importer.startNode() call. This temporal binding is however much less obvious in the patch, so you are right in that there is a threat of further changes breaking the inherent assumption of the ns context not changing during the lifetime of a *StringValue instance.

This issue could be quite cleanly solved by explicitly managing the ns context as a stack of immutable prefix mappings instead of using the mutable SAX NamespaceSupport class. A namespace context would be an object with a reference to the parent namespace context and a collection of prefix mappings declared for a given XML element. Namespace resolution would then follow the parent references until a match is found. (This is actually what the NamespaceSupport class does internally, although it only presents a mutable facade that hides this data structure behind the push/pop methods.)

I'll make a new version of the patch with modified namespace context handling.
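
The data structure described above could look roughly like the following. This is a sketch of the idea only, not the revised patch; ImmutableNamespaceContext and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable namespace context: one instance per XML element,
// holding that element's prefix declarations and a link to the parent.
final class ImmutableNamespaceContext {
    private final ImmutableNamespaceContext parent; // null for the root
    private final Map<String, String> mappings;

    ImmutableNamespaceContext(ImmutableNamespaceContext parent,
                              Map<String, String> declared) {
        this.parent = parent;
        // Defensive copy so later changes to the caller's map have no effect.
        this.mappings = Collections.unmodifiableMap(
                new HashMap<String, String>(declared));
    }

    /** Follows the parent references until the prefix is found. */
    String getURI(String prefix) {
        for (ImmutableNamespaceContext c = this; c != null; c = c.parent) {
            String uri = c.mappings.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ImmutableNamespaceContext root = new ImmutableNamespaceContext(null,
                Collections.singletonMap("jcr", "http://www.jcp.org/jcr/1.0"));
        ImmutableNamespaceContext child = new ImmutableNamespaceContext(root,
                Collections.singletonMap("ex", "http://example.com/a"));
        System.out.println(child.getURI("jcr"));
        System.out.println(child.getURI("ex"));
        System.out.println(root.getURI("ex"));
    }
}
```

A value that keeps a reference to such a context sees the same mappings no matter when it is parsed, and sibling elements share their parent's context instead of copying it.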

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting
>  Attachments: xml-refactoring.patch


[jira] Resolved: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]
     
Stefan Guggisberg resolved JCR-325:
-----------------------------------

    Resolution: Fixed

document view xml serialization is *not* guaranteed to be roundtrippable
as some of the information, such as property type and multi-value flag, is lost.

the only failsafe way of handling multi-valued properties is to skip them 
on export (see "6.4.2.5 Multi-value Properties",  jsr-170 specification).
this is probably not desirable from a user perspective.

the following pragmatic approach has been chosen to work around this issue:

document view export:
------------------------------

- multi-value properties are exported space-separated, with a leading line-feed (&#xa;) 
  as a 'multi-valued'-hint; spaces within a value are encoded as _x0020_;

  e.g. ["john doe","donald duck"] => "&#xa;john_x0020_doe donald_x0020_duck"

- single-value properties are exported without encoding contained spaces as _x0020_

  e.g. "Hello, World!" => "Hello, World!"


document view import:
------------------------------

- attribute values starting with &#xa; (line-feed) are assumed to represent multiple values,
  delimited by spaces; _x0020_ within values are decoded to spaces

  e.g. "&#xa;john_x0020_doe donald_x0020_duck" => ["john doe","donald duck"]

- all other attribute values are considered to be single values; no space decoding
  is performed.

  e.g. "Hello, World!" => "Hello, World!"


fixed in svn r384197
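
The export and import rules above can be sketched as a small codec. This is not the code committed in r384197: DocViewMultiValueCodec and its methods are hypothetical, and the strings are attribute values as seen after XML parsing, where the character reference &#xa; has already become a literal line feed. As a simplification, empty values and values that already contain the text _x0020_ are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec for the line-feed hint described above.
final class DocViewMultiValueCodec {

    /** Export: assumes a single-valued property has exactly one value. */
    static String encode(List<String> values, boolean multiValued) {
        if (!multiValued) {
            return values.get(0); // no space encoding for single values
        }
        StringBuilder sb = new StringBuilder("\n"); // 'multi-valued' hint
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(values.get(i).replace(" ", "_x0020_"));
        }
        return sb.toString();
    }

    /** Import: a leading line feed marks a space-delimited value list. */
    static List<String> decode(String attrValue) {
        List<String> values = new ArrayList<String>();
        if (!attrValue.startsWith("\n")) {
            values.add(attrValue); // single value, no space decoding
            return values;
        }
        for (String token : attrValue.substring(1).split(" ")) {
            if (!token.isEmpty()) {
                values.add(token.replace("_x0020_", " "));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String exported = encode(Arrays.asList("john doe", "donald duck"), true);
        System.out.println(decode(exported));
        System.out.println(decode("Hello, World!"));
    }
}
```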

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg



[jira] Updated: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Stefan Guggisberg updated JCR-325:
----------------------------------

    Component: xml
                   (was: core)
         type: Improvement  (was: Bug)
    Assign To:     (was: Stefan Guggisberg)

document view serialization is by definition 'lossy' and not guaranteed to roundtrip.

changing type to 'Improvement' as current implementation (skipping multi-valued properties on document view export) is compliant with "6.4.2.5 Multi-value Properties" of the spec.

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra



[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12369624 ] 

Jukka Zitting commented on JCR-325:
-----------------------------------

Oops, I had the sequence numbering wrong (added 2.2 after a bit of thought). The correct sequence is:

   1) If N matches just a single property definition, interpret V according to that definition.
   2) Otherwise N matches a single-valued definition S and a multi-valued definition M:
      2.1) If V contains whitespace and S is not a string property definition, interpret V according to M
      2.2) If V contains _xFFFF_ escapes and M is not a string property definition, interpret V according to S
      2.3) If V contains _xFFFF_ escapes, interpret V according to M
      2.4) Otherwise interpret V according to S 
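
The sequence above translates fairly directly into code. The sketch below follows the steps literally; PropDef is a hypothetical, heavily reduced stand-in for a property definition, and the regular expression used to detect _xFFFF_ escapes is an assumption about their exact form.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the definition-selection heuristic.
final class DocViewDefinitionHeuristic {

    /** Reduced stand-in for a property definition. */
    static final class PropDef {
        final boolean multiple;
        final boolean stringType;

        PropDef(boolean multiple, boolean stringType) {
            this.multiple = multiple;
            this.stringType = stringType;
        }
    }

    // Assumed escape form: "_x" + four hex digits + "_".
    private static final Pattern ESCAPE = Pattern.compile("_x[0-9a-fA-F]{4}_");
    private static final Pattern WHITESPACE = Pattern.compile("\\s");

    /**
     * Picks the definition used to interpret value V. Either argument may
     * be null if the name N matches only one definition.
     */
    static PropDef choose(PropDef single, PropDef multi, String value) {
        if (single == null || multi == null) {
            return single != null ? single : multi;      // step 1
        }
        boolean hasWhitespace = WHITESPACE.matcher(value).find();
        boolean hasEscapes = ESCAPE.matcher(value).find();
        if (hasWhitespace && !single.stringType) {
            return multi;                                // step 2.1
        }
        if (hasEscapes && !multi.stringType) {
            return single;                               // step 2.2
        }
        if (hasEscapes) {
            return multi;                                // step 2.3
        }
        return single;                                   // step 2.4
    }

    public static void main(String[] args) {
        PropDef s = new PropDef(false, true);
        PropDef m = new PropDef(true, true);
        System.out.println(choose(s, m, "john_x0020_doe donald").multiple);
        System.out.println(choose(s, m, "Hello, World!").multiple);
    }
}
```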

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg



[jira] Assigned: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Stefan Guggisberg reassigned JCR-325:
-------------------------------------

    Assign To: Stefan Guggisberg

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg



[jira] Assigned: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Jukka Zitting reassigned JCR-325:
---------------------------------

    Assign To: Jukka Zitting

Assigned to me. I'm working on implementing the heuristics described above.

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting




[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12371095 ] 

Stefan Guggisberg commented on JCR-325:
---------------------------------------

i'm not sure regarding preserving the current (i.e. local, scoped) namespace context 
from the xml document within the StringValue & BufferedStringValue instances.

in your patch you just pass a reference to the ns context in the constructor.
the ns context however is mutable and could, theoretically at least, change
after a *StringValue has been created and before getValue has been called.

there should at least be a warning that the values have to be processed
during the Importer.startNode call. otherwise the ns context should be cloned. 
that, however, would significantly increase memory usage compared to the current approach.

apart from this the patch looks good to me.
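
To illustrate the concern about the mutable ns context, here is a minimal sketch. LazyName and the plain Map used as the namespace context are made-up stand-ins, not the classes from the patch; the point is only that a value holding a reference sees later changes to the context, while a value holding a copy does not (at the price of one extra map per value).

```java
import java.util.HashMap;
import java.util.Map;

class NamespaceSnapshotDemo {

    /** Hypothetical lazy value: resolves its prefix only when resolve() is called. */
    static final class LazyName {
        private final String prefix;
        private final String localName;
        private final Map<String, String> namespaces;

        LazyName(String prefixedName, Map<String, String> namespaces) {
            // Assumes the name has the form "prefix:local".
            int colon = prefixedName.indexOf(':');
            this.prefix = prefixedName.substring(0, colon);
            this.localName = prefixedName.substring(colon + 1);
            // Keeps whatever map the caller hands in; no defensive copy here.
            this.namespaces = namespaces;
        }

        String resolve() {
            return "{" + namespaces.get(prefix) + "}" + localName;
        }
    }

    /** Returns the resolved names of a value sharing the context and one holding a copy. */
    static String[] demo() {
        Map<String, String> context = new HashMap<>();
        context.put("a", "http://example.com/one");

        LazyName shared = new LazyName("a:foo", context);                // reference only
        LazyName copied = new LazyName("a:foo", new HashMap<>(context)); // cloned context

        // The parser moves on and the prefix is rebound before the values are read.
        context.put("a", "http://example.com/two");

        return new String[] { shared.resolve(), copied.resolve() };
    }
}
```

In this sketch the shared value resolves against the rebound prefix, while the copied one still resolves against the mapping that was in scope when it was created.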

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting
>  Attachments: xml-refactoring.patch



[jira] Reopened: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]
     
Stefan Guggisberg reopened JCR-325:
-----------------------------------


> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra




[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12369888 ] 

Stefan Guggisberg commented on JCR-325:
---------------------------------------

Jukka:
> This solution also explicitly breaks the rules in section 6.4.2.5 of the JSR 170 spec. 

you're right, i wasn't aware of that. btw, the previous implementation (pre jcr-325 fix) wasn't spec-compliant either. 

until we have a proper, spec-compliant solution for multi-value handling in document view, 
i've changed the current implementation to skip multi-valued properties entirely on document view export, 
as this is explicitly allowed by the spec (svn r384850).




> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg




[jira] Commented: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-325?page=comments#action_12369622 ] 

Jukka Zitting commented on JCR-325:
-----------------------------------

Stefan:
> the multi-value attribute is one of the differentiators being used to identify applicable property definitions.
> nt:unstructured for example defines 2 residual properties that only differ in the multi-value attribute.
> your approach would not be able to unambiguously determine the 'correct' definition because the
> 'multi-valued' information is lost in document view.

You're right, but that's something I'd expect. If the property name matches both a single- and a multi-valued property and the property value doesn't hint at the correct interpretation (like when a reference property contains a space), then it should be reasonable to default to the single-valued definition. We could even have some heuristics about the property value containing _xFFFF_ escapes when the correct property definition cannot otherwise be determined. This approach would cover the original case reported by Tobias as well as a majority of normal use cases.

The solution with a marker character only works when doing import/export between Jackrabbit repositories, but fails when importing content from other sources like original XML documents, exports from other JCR implementations, or results of XSL transformations written based on the JCR spec. This solution also explicitly breaks the rules in section 6.4.2.5 of the JSR 170 spec.

I'd use the following heuristic for interpreting the import of a document view property named N with value V:

   1) If N matches just a single property definition, interpret V according to that definition.
   2) Otherwise N matches a single-valued definition S and a multi-valued definition M:
      2.1) If V contains whitespace and S is not a string property definition, interpret V according to M.
      2.2) If V contains _xFFFF_ escapes and M is not a string property definition, interpret V according to S.
      2.3) If V contains _xFFFF_ escapes, interpret V according to M.
      2.4) Otherwise interpret V according to S.

This heuristic doesn't match all cases, but should work pretty much as expected for a majority of use cases.
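
For the ambiguous case, where N matches both a single-valued definition S and a multi-valued definition M, the rules above could be sketched roughly as follows. This is only an illustration: the class DocViewHeuristic, the Choice enum and the two boolean flags are hypothetical names, not Jackrabbit API, and the escape pattern assumes four hexadecimal digits as in _x0020_.

```java
import java.util.regex.Pattern;

class DocViewHeuristic {

    /** Which of the two candidate property definitions to apply. */
    enum Choice { SINGLE, MULTIPLE }

    // Matches escapes of the form _xHHHH_, e.g. _x0020_ for an encoded space.
    private static final Pattern ESCAPE = Pattern.compile("_x[0-9a-fA-F]{4}_");

    /**
     * Picks a definition for attribute value V when the property name matches
     * both a single-valued definition S and a multi-valued definition M.
     *
     * @param value            the raw attribute value V
     * @param singleIsString   true if S is a string property definition
     * @param multipleIsString true if M is a string property definition
     */
    static Choice choose(String value, boolean singleIsString, boolean multipleIsString) {
        boolean hasWhitespace = value.chars().anyMatch(Character::isWhitespace);
        boolean hasEscapes = ESCAPE.matcher(value).find();

        // Whitespace cannot be part of a single non-string value, so treat V as a list.
        if (hasWhitespace && !singleIsString) {
            return Choice.MULTIPLE;
        }
        // Escapes with a non-string M: interpret V according to S.
        if (hasEscapes && !multipleIsString) {
            return Choice.SINGLE;
        }
        // Escapes with a string M: interpret V according to M.
        if (hasEscapes) {
            return Choice.MULTIPLE;
        }
        // No hint either way: default to the single-valued definition.
        return Choice.SINGLE;
    }
}
```

With the value from the original report (two UUIDs separated by a space, neither definition of type string), choose picks the multi-valued definition, so the reference list would be split before conversion.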


> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Bug
>   Components: core
>     Versions: 0.9
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Stefan Guggisberg




[jira] Updated: (JCR-325) docview roundtripping does not work with multivalue non-string properties

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-325?page=all ]

Jukka Zitting updated JCR-325:
------------------------------

    Version: 1.0

> docview roundtripping does not work with multivalue non-string properties
> -------------------------------------------------------------------------
>
>          Key: JCR-325
>          URL: http://issues.apache.org/jira/browse/JCR-325
>      Project: Jackrabbit
>         Type: Improvement
>   Components: xml
>     Versions: 0.9, 1.0
>  Environment: jackrabbit r379292
>     Reporter: Tobias Bocanegra
>     Assignee: Jukka Zitting
>  Attachments: xml-refactoring.patch
