You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2006/09/12 18:23:23 UTC

[jira] Commented: (JCR-569) WorkspaceImporter Refactoring

    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434205 ] 
            
Jukka Zitting commented on JCR-569:
-----------------------------------

Please be more specific in identifying the problems. The code being "circumvoluted [sic] and really hard to understand" is probably correct, but doesn't help us to make it better.

Identify the design structures that could be better organized, explain your approach to solving the issue, and propose a patch for doing that. See for example JCR-565 where I identify the design issue, discuss how to solve it, and provide a patch for doing that. What would the WorkspaceImporter look like after the addition of the class you propose? How does it make WorkspaceImporter better?

Note also that the Importer interface is supposed to be independent of the XML serialization format being used as both SysView and DocView ImportHandlers use the interface to save the imported data. Thus the name SysViewImporter is incorrect.

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter. I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible enough for my use (in my class called SysViewImporter). This solution was perfectly valid for Google SoC (a lot of time constraints) but unacceptable in the long run for any project (we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance, the current code contains at most 5 imbricated if and uses a lot of different ways to pass information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted to gather the community feedback before stepping in. I tried moving all "work code" away from the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 9/12/06, Nicolas <nt...@gmail.com> wrote:
> Sure. Here it is:

Cool, thanks.

> Issues I am trying to solve:

I pretty much agree with your analysis of the issues, as I've faced
similar problems working on JCR-325. I labeled the issues for brevity:

   A: Methods are too long
   B: Complexity
   C: Factorizing/Code duplicate
   D: Bugs
   E: Lack of flexibility

> Bugs: jcr:root is not handled correctly and other bugs have been detected.

For reference: the jcr:root issue and the preferred fix is discussed
in JCR-535. Please file and refer to separate bug reports for other
specific issues. It's better to keep the refactoring issue separate
from functional changes.

> Solution
> My proposal is in four parts:

In my opinion it would be best to split the solution to separate
issues to make it easier to review the changes and to understand the
rationale for each change.

> - Add some private methods to abstract some of the complex issue this class
> is managing

Do you mean the isAddable, checkNode, createProperties, createNode,
isSkipped, isCorrect, and postProcess methods? Are they supposed to be
private or protected, i.e. can a subclass use them directly? Are they
final, i.e. can a subclass override them?

I can see how this would help with goals A, B, and C. Could you make a
patch that changes the WorkspaceImporter class accordingly so we can
see these methods in effect. The patch should improve A, B, and C and
still pass all unit tests for me to apply it.

> - Add more control in the constructor (for instance for our backup tool):
> add a constructor with the BatchedItemOperation so we can choose if needed
> the needed ItemStateManager. (For instance to restore the version history).
>
> - Add a raw flag in the constructor to import the content as it is without
> initializing versioning and escaping some checks. This is a use case already
> encountered

Both of these parts are related to goal E. They are extra features and
pretty much independent of structural refactoring. I'd propose these
changes as a a separate issue and a separate patch.

> - With this new design, solve remaining bugs and handle the jcr:root issue.

This would be goal D. Again, I would propose a fix for the jcr:root
issue in JCR-535 and in separate reports for any other issues.

> Please have a look as the patch proposed earlier to have a look at the
> design and the new code (it is far from over and a lot of methods are
> incomplete).

I like your approach, but as you mention it still needs some work. I'd
be happy to review an updated version.

I'm a bit concerned that the flow in the proposed class, especially in
the startNode() method isn't directly based on the flow within the
existing WorkspaceImporter. Are you certain that all special cases are
handled as before? Could we add some unit tests to verify that all
those special cases remain operational? I don't think the current
WorkspaceImporter class has full test coverage.

BR,

Jukka Zitting

-- 
Yukatan - http://yukatan.fi/ - info@yukatan.fi
Software craftsmanship, JCR consulting, and Java development

Re: [jira] Commented: (JCR-569) WorkspaceImporter Refactoring

Posted by Nicolas <nt...@gmail.com>.
Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main
responsability: to import data provided by a ContentHandler to the
repository. It needs to check imported data to maintain consistancy and
coherence of the repository and in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode,
endNode. The current WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to
solve with this iteration of the backup tool (I am using this class
heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In
average one method per 88 lines. This goes with the 4 levels of if/else.
This code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This
is partly due to the Importer interface quite close to the XML. For
instance, startNode() checks the validity of the import ad endNode do. We
need to create more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same
thing and sometimes not at all. For instance we check the sanity of the
workspace at the end of each node and at the end of the import. We also
check several times if a node can be added without disrupting the
consistancy of the repository. We should have a private dedicated method for
that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected.
Since the code is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this
monolothic design (only 7 methods), the checks are embedded in each method
(cf. upper). I had to subclass the 4 methods of the Importer interface. Also
the WorkspaceImporter initiates versioning if needed and remap some UUID all
the time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class
is managing

- Add more control in the constructor (for instance for our backup tool):
add a constructor with the BatchedItemOperation so we can choose if needed
the needed ItemStateManager. (For instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without
initializing versioning and escaping some checks. This is a use case already
encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the
workspace (I am not using it for now) and for the sanityCheck(): therefore
it will be only a few lines and we will have good SoC.

Please have a look as the patch proposed earlier to have a look at the
design and the new code (it is far from over and a lot of methods are
incomplete).

BR
Nicolas