You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "Nicolas Toper (JIRA)" <ji...@apache.org> on 2006/09/12 17:55:25 UTC

[jira] Created: (JCR-569) WorkspaceImporter Refactoring

WorkspaceImporter Refactoring
-----------------------------

                 Key: JCR-569
                 URL: http://issues.apache.org/jira/browse/JCR-569
             Project: Jackrabbit
          Issue Type: Improvement
            Reporter: Nicolas Toper


Hi,

As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).

Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 

Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 

I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.

Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 9/12/06, Nicolas <nt...@gmail.com> wrote:
> Sure. Here it is:

Cool, thanks.

> Issues I am trying to solve:

I pretty much agree with your analysis of the issues, as I've faced
similar problems working on JCR-325. I labeled the issues for brevity:

   A: Methods are too long
   B: Complexity
   C: Factorizing/Code duplicate
   D: Bugs
   E: Lack of flexibility

> Bugs: jcr:root is not handled correctly and other bugs have been detected.

For reference: the jcr:root issue and the preferred fix is discussed
in JCR-535. Please file and refer to separate bug reports for other
specific issues. It's better to keep the refactoring issue separate
from functional changes.

> Solution
> My proposal is in four parts:

In my opinion it would be best to split the solution to separate
issues to make it easier to review the changes and to understand the
rationale for each change.

> - Add some private methods to abstract some of the complex issue this class
> is managing

Do you mean the isAddable, checkNode, createProperties, createNode,
isSkipped, isCorrect, and postProcess methods? Are they supposed to be
private or protected, i.e. can a subclass use them directly? Are they
final, i.e. can a subclass override them?

I can see how this would help with goals A, B, and C. Could you make a
patch that changes the WorkspaceImporter class accordingly so we can
see these methods in effect. The patch should improve A, B, and C and
still pass all unit tests for me to apply it.

> - Add more control in the constructor (for instance for our backup tool):
> add a constructor with the BatchedItemOperation so we can choose if needed
> the needed ItemStateManager. (For instance to restore the version history).
>
> - Add a raw flag in the constructor to import the content as it is without
> initializing versioning and escaping some checks. This is a use case already
> encountered

Both of these parts are related to goal E. They are extra features and
pretty much independent of structural refactoring. I'd propose these
changes as a a separate issue and a separate patch.

> - With this new design, solve remaining bugs and handle the jcr:root issue.

This would be goal D. Again, I would propose a fix for the jcr:root
issue in JCR-535 and in separate reports for any other issues.

> Please have a look as the patch proposed earlier to have a look at the
> design and the new code (it is far from over and a lot of methods are
> incomplete).

I like your approach, but as you mention it still needs some work. I'd
be happy to review an updated version.

I'm a bit concerned that the flow in the proposed class, especially in
the startNode() method isn't directly based on the flow within the
existing WorkspaceImporter. Are you certain that all special cases are
handled as before? Could we add some unit tests to verify that all
those special cases remain operational? I don't think the current
WorkspaceImporter class has full test coverage.

BR,

Jukka Zitting

-- 
Yukatan - http://yukatan.fi/ - info@yukatan.fi
Software craftsmanship, JCR consulting, and Java development

Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by Nicolas <nt...@gmail.com>.
Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main
responsability: to import data provided by a ContentHandler to the
repository. It needs to check imported data to maintain consistancy and
coherence of the repository and in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode,
endNode. The current WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to
solve with this iteration of the backup tool (I am using this class
heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In
average one method per 88 lines. This goes with the 4 levels of if/else.
This code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This
is partly due to the Importer interface quite close to the XML. For
instance, startNode() checks the validity of the import ad endNode do. We
need to create more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same
thing and sometimes not at all. For instance we check the sanity of the
workspace at the end of each node and at the end of the import. We also
check several times if a node can be added without disrupting the
consistancy of the repository. We should have a private dedicated method for
that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected.
Since the code is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this
monolothic design (only 7 methods), the checks are embedded in each method
(cf. upper). I had to subclass the 4 methods of the Importer interface. Also
the WorkspaceImporter initiates versioning if needed and remap some UUID all
the time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class
is managing

- Add more control in the constructor (for instance for our backup tool):
add a constructor with the BatchedItemOperation so we can choose if needed
the needed ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without
initializing versioning and escaping some checks. This is a use case already
encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the
workspace (I am not using it for now) and for the sanityCheck(): therefore
it will be only a few lines and we will have good SoC.

Please have a look as the patch proposed earlier to have a look at the
design and the new code (it is far from over and a lot of methods are
incomplete).

BR
Nicolas

[jira] Resolved: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Jukka Zitting resolved JCR-569.
-------------------------------

    Fix Version/s: 1.1.1
       Resolution: Fixed

Applied the latest patch in revision 465218.

Thanks! And sorry for the long response time.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>             Fix For: 1.1.1
>
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Nicolas Toper updated JCR-569:
------------------------------

    Attachment: WorkspaceImporter.patch

Hi,

The code now passes now all relevant unit tests.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Resolved: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved JCR-569.
-------------------------------

    Resolution: Won't Fix

Resolving as Won't Fix, as the use case for the proposed changes has since been addressed in a different way with the RepositoryCopier class.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: https://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Nicolas Toper
>            Assignee: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434503 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

Nicolas:
> I would like to create two classes: one generic Importer (name to be defined) and one dedicated to import a workspace.

This would still be a part of the flexibility goal (E in my message to the mailing list). To keep things simple I'd suggest that you limit this issue to a patch that refactors the methods within WorkspaceImporter without introducing another class.

Note also that the responsibility of WorkspaceImporter is not to "import a workspace" but to import content *into* a workspace.

> [The methods of the generic Importer] are private since no other class need them (this class is used with the ContentHandler).

How would a subclass modify the behaviour of the class if those methods are private?

> If you are OK with this approach, I will work on this and propose you soon a patch implementing this refactoring. 

It looks OK, but I think the main issue is seeing how the existing code gets mapped to the proposed structure. A patch would be the perfect way to show this. :-)

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Nicolas Toper updated JCR-569:
------------------------------

    Attachment: WorkspaceImporter.patch

Hi,

As told, it was untested. Now it is and working (according to the unit test). 

BR
Nicolas

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434496 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------

Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main responsability: to import data provided by a ContentHandler to the repository. It needs to check imported data to maintain consistancy and coherence of the repository and in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode, endNode. The current WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to solve with this iteration of the backup tool (I am using this class heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In average one method per 88 lines. This goes with the 4 levels of if/else. This code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This is partly due to the Importer interface quite close to the XML. For instance, startNode() checks the validity of the import ad endNode do. We need to create more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same thing and sometimes not at all. For instance we check the sanity of the workspace at the end of each node and at the end of the import. We also check several times if a node can be added without disrupting the consistancy of the repository. We should have a private dedicated method for that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected. Since the code is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this monolothic design (only 7 methods), the checks are embedded in each method (cf. upper). I had to subclass the 4 methods of the Importer interface. Also the WorkspaceImporter initiates versioning if needed and remap some UUID all the time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class is managing

- Add more control in the constructor (for instance for our backup tool): add a constructor with the BatchedItemOperation so we can choose if needed the needed ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without initializing versioning and escaping some checks. This is a use case already encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the workspace (I am not using it for now) and for the sanityCheck(): therefore it will be only a few lines and we will have good SoC.

Please have a look as the patch proposed earlier to have a look at the design and the new code (it is far from over and a lot of methods are incomplete).

BR
Nicolas

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12454691 ] 
            
Stefan Guggisberg commented on JCR-569:
---------------------------------------

> Jukka Zitting commented on JCR-569:
> -----------------------------------
> 
> Committed some small-scale improvements in revision 477142.
> 
> Looking at the class more closely, it seems like the key to simplifying the code flows would be to get rid of the "aborted" and "succeeded" marker flags, relying more on code structure to indicate the current state of the computation. See the patch attached to JCR-546 for one possible approach to removing (or reducing the visibility of) the "succeeded" flag. The "aborted" flag could possibly be replaced by using the State pattern like in the JCR-RMI Value implementation.

+1

cheers
stefan

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Jukka Zitting updated JCR-569:
------------------------------

    Fix Version/s:     (was: 1.1.1)

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435161 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

> Here is the design of the new Importer class.

Thanks! Though I still need to ask for more clarification:

Is the proposed class supposed to replace the current WorkspaceImporter, or is will it be a base class for WorkspaceImpoter? The attached patch contains just the new GenericImporter class, without any changes to WorkspaceImporter or any other class, so applying it would have no effect on Jackrabbit or the WorkspaceImporter class. Please indicate, in the patch itself, how you'd change WorkspaceImporter as that's the stated goal of this issue.

As mentioned in my previous comment, I think it would be clearer if you didn't introduce a separate class, but instead refactored the internals of the existing WorkspaceImporter class. We can add the common base class as needed by the backup tool once this refactoring is done, doing it in one step just makes the proposed change more complex.

A minor note: The GenericImporter javadoc says: "imports an XML document". Note that the responsibility of the  Importer interface is actually *not* to import the XML document itself, just to import the content submitted to it by the XML import handlers. It's a subtle but quite important difference, that is seen for example in the fact that the Importer interface has no dependencies on any XML handling, it deals solely with NodeInfo and PropInfo objects instantiated by the XML import handlers. Thus the javadoc as written is incorrect.


> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435349 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

The patch still fails. The failing tests are in org.apache.jackrabbit.test.api. I'm getting:

    [junit] Tests run: 584, Failures: 0, Errors: 14, Time elapsed: 17,734 sec
    [junit] [ERROR] TEST org.apache.jackrabbit.test.api.TestAll FAILED

You can run the unit tests individually, but I'd really recommend you to use "maven jar" (or even "maven clean jar") at least as the final verification step before submitting patches. Detailed test reports get saved in target/test-reports.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Nicolas Toper updated JCR-569:
------------------------------

    Attachment: GenericImporter.patch

Hi,

Here is the design of the new Importer class. Since I didn't find any test case for it, if  everything seems OK, I'll write some unit test for this and finish the work.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434205 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

Please be more specific in identifying the problems. The code being "circumvoluted [sic] and really hard to understand" is probably correct, but doesn't help us to make it better.

Identify the design structures that could be better organized, explain your approach to solving the issue, and propose a patch for doing that. See for example JCR-565 where I identify the design issue, discuss how to solve it, and provide a patch for doing that. What would the WorkspaceImporter look like after the addition of the class you propose? How does it make WorkspaceImporter better?

Note also that the Importer interface is supposed to be independent of the XML serialization format being used as both SysView and DocView ImportHandlers use the interface to save the imported data. Thus the name SysViewImporter is incorrect.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434386 ] 
            
Stefan Guggisberg commented on JCR-569:
---------------------------------------

> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null).
> 

sorry, i can't follow you here. what does 'circumvoluted' mean? what does 'imbricated' mean? 
please note that if something is 'hard to understand' does not necessarily mean that it 
is too complex.

> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.

what's the use of the private 'isAddabble'(?),  checkNode etc methods? why are they private?



> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435342 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

Two other issues about the patch:

1) The modified class doesn't implement the Importer interface
2) There's a reference to GenericImporter that causes a compile error

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Nicolas Toper updated JCR-569:
------------------------------

    Attachment: WorkspaceImporter.patch

Hi,

I have bought a USB key :p. Therefore, I am submitting thid patch implementing all Jukka's remarks (refactoring only of the WorkspaceImporter for now and comment issue).

Since I didn't find any unit test for this class, please keep in mind, I didn't test yet this class. I would like your feedback first before fully testing it (and writing unit test so we can all benefit from this work) and integrate it to JR's core.

Nico


> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435261 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------

Hi Jukka,

I agree on your approach. I have made the update to the code, I will submit a new patch as soon as I find a USB key

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Assigned: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Jukka Zitting reassigned JCR-569:
---------------------------------

    Assignee: Jukka Zitting

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434497 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------

Hi Stefan,

About your first point, you are right: this is why I sent yesterday a more in depth analysis. This JIRA issue will be used as Jukka suggested it, for refactoring issue of this class only.

I would like to create two classes: one generic Importer (name to be defined) and one dedicated to import a workspace. I would use the generic Importer for the backup and others could use it when they need to batch load some data (of course, it is more complex to use and set up). The WorkspaceImporter would be used for the same use case as now. The WorkspaceImporter constructor would stay the same to avoid backward compatibility issue.

Here is a quick descriptive of each methods of the generic Importer. They are private since no other class need them (this class is used with the ContentHandler).

 private boolean checkNode(NodeInfo nodeInfo)
Check if the node can be imported to the repository. For instance, it can try to delete a protected node, the same node might already exist (it can depend on the ImportUUIDBehavior used) or the node can be incorrect according to the node type in the repository.

Some checks are escaped if raw mode is chosen.

If the node is not correct, it puts the skip mode on true to exclude this subtree. When the subtree is over, the skip mode is put to off. If the error is serious (constraint issue for instance), an exception will be thrown.


 private NodeState createNode(NodeState parent, NodeInfo nodeInfo)
Create the NodeState depending according to ImportUUIDBehavior flag

 private void createProperties(NodeState myNode, List propInfos)
Create the properties depending according to ImportUUIDBehavior flag

 private boolean isSkipped(NodeInfo nodeInfo)
if we are in skip mode return true; else false.


private void postProcess(NodeState node)
Initialize if raw == false,  this method is not called. It will be empty for the generic Importer (we don't need to initialize any thing) but WorkspaceImporter will overload it.

protected NodeState resolveUUIDConflict
Resolve UUID conflict :p

 private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws ConstraintViolationException, AccessDeniedException, VersionException, LockException, ItemNotFoundException, ItemExistsException, RepositoryException
This method is not needed.

 private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List propInfos)
Not needed.

If you are OK with this approach, I will work on this and propose you soon a patch implementing this refactoring.

BR,
Nico
my blog! http://www.deviant-abstraction.net !!

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Reopened: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Jukka Zitting reopened JCR-569:
-------------------------------

             
Reopening this issue since a regression was introduced by the WorkspaceImporter change. I've now reverted (revision 469441) the class back to what it was before and added a test case (revision 469440) that illustrates the problem.

Nicolas, there is a concern that since the patch essentially rewrites most of the previous code paths, other similar issues might have been introduced as well. it would be better if the refactoring modified as little as possible of the existing code. I.e. a restructuring of the class would be better than a rewrite.


> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>             Fix For: 1.1.1
>
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12451283 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

Committed some small-scale improvements in revision 477142.

Looking at the class more closely, it seems like the key to simplifying the code flows would be to get rid of the "aborted" and "succeeded" marker flags, relying more on code structure to indicate the current state of the computation. See the patch attached to JCR-546 for one possible approach to removing (or reducing the visibility of) the "succeeded" flag. The "aborted" flag could possibly be replaced by using the State pattern like in the JCR-RMI Value implementation.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-569?page=all ]

Nicolas Toper updated JCR-569:
------------------------------

    Attachment: SysViewImporter.patch

This is the skeleton. This class is not to be included in JR code.

WorkspaceImporter should inherit this class so no BatchedItemOperations is directly passed to it.

Please give me your feedback.

Nicolas

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12442223 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------


   [[ Old comment, sent by email on Wed, 13 Sep 2006 18:00:23 +0200 ]]

Hi Stefan,

About your first point, you are right: this is why I sent yesterday a more
in depth analysis. This JIRA issue will be used as Jukka suggested it, for
refactoring issue of this class only.

I would like to create two classes: one generic Importer (name to be
defined) and one dedicated to import a workspace. I would use the generic
Importer for the backup and others could use it when they need to batch load
some data (of course, it is more complex to use and set up). The
WorkspaceImporter would be used for the same use case as now. The
WorkspaceImporter constructor would stay the same to avoid backward
compatibility issue.

Here is a quick descriptive of each methods of the generic Importer. They
are private since no other class need them (this class is used with the
ContentHandler).

 private boolean checkNode(NodeInfo nodeInfo)
Check if the node can be imported to the repository. For instance, it can
try to delete a protected node, the same node might already exist (it can
depend on the ImportUUIDBehavior used) or the node can be incorrect
according to the node type in the repository.

Some checks are escaped if raw mode is chosen.

If the node is not correct, it puts the skip mode on true to exclude this
subtree. When the subtree is over, the skip mode is put to off. If the error
is serious (constraint issue for instance), an exception will be thrown.


 private NodeState createNode(NodeState parent, NodeInfo nodeInfo)
Create the NodeState depending according to ImportUUIDBehavior flag

 private void createProperties(NodeState myNode, List propInfos)
Create the properties depending according to ImportUUIDBehavior flag

 private boolean isSkipped(NodeInfo nodeInfo)
if we are in skip mode return true; else false.


private void postProcess(NodeState node)
Initialize if raw == false,  this method is not called. It will be empty for
the generic Importer (we don't need to initialize any thing) but
WorkspaceImporter will overload it.

protected NodeState resolveUUIDConflict
Resolve UUID conflict :p

 private boolean isAddabble(NodeState parent, NodeInfo nodeInfo) throws
ConstraintViolationException, AccessDeniedException, VersionException,
LockException, ItemNotFoundException, ItemExistsException,
RepositoryException
This method is not needed.

 private boolean isCorrect(NodeState parent, NodeInfo nodeInfo, List
propInfos)
Not needed.

If you are OK with this approach, I will work on this and propose you soon a
patch implementing this refactoring.

BR,
Nico
my blog! http://www.deviant-abstraction.net !!


> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Assigned To: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435341 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

XML import functionality is tested as a part of the JCR test suite that is automatically invoked when you run "maven jar" to build Jackrabbit. Currently the patch causes 14 unit test errors due to NullPointerExceptions.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (JCR-569) WorkspaceImporter Refactoring

Posted by "angela (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

angela updated JCR-569:
-----------------------

    Component/s: jackrabbit-core

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: https://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Nicolas Toper
>            Assignee: Jukka Zitting
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by "Nicolas Toper (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12435345 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------

Just to be sure everything is all right: I have checked with my machine. Only 2 tests are run when calling o.a.j.test.core.xml. Are they the only ones run?

Is the JCR test suite what is in /jackrabbit/src/test?

Nico

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: GenericImporter.patch, SysViewImporter.patch, WorkspaceImporter.patch, WorkspaceImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira