Posted to dev@openoffice.apache.org by Harold Fennell <ha...@hotmail.com> on 2011/07/13 16:33:45 UTC

Open Office Help Wanted

I would like to know more about:

external diff and merge tools for Subversion that can process ODF documents 
 
Do you have other teams working on this problem? What is the expected timeframe, are there any helpful hints on completion, and is there a tool that integrates into Subversion or into OpenOffice?
 
Thank you for your time.
 
Harold


Harold Fennell
PO Box 3046
North Fort Myers, FL 33918


Re: Open Office Help Wanted

Posted by Eike Rathke <oo...@erack.de>.
Hi Mathias,

On Thursday, 2011-07-14 14:59:01 +0200, Mathias Bauer wrote:

> For many use cases a simple tool that just reads the text contents in
> ODF documents and compares them would be enough already, IMHO. Perhaps
> we can create one as a start.

As a starter there are xmlpp and xmldiff from
http://software.decisionsoft.com/index.html

I use xmlpp as a quick inspector for ODF content and it does its job
pretty well.

unzip -p "$1" content.xml styles.xml settings.xml | xmlpp.pl | vim -

To diff content one could use for example

unzip -p doc1.odt content.xml > doc1_content.xml
unzip -p doc2.odt content.xml > doc2_content.xml
xmldiff.pl doc1_content.xml doc2_content.xml

xmldiff internally uses xmlpp with appropriate options, so there is no
need for a separate xmlpp preprocessing step.

Hacking some special treatment of Writer paragraphs into xmlpp would
probably be possible.

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

Re: Open Office Help Wanted

Posted by Mathias Bauer <Ma...@gmx.net>.
On 13.07.2011 17:47, Rob Weir wrote:

> On Wed, Jul 13, 2011 at 4:33 PM, Harold Fennell
> <ha...@hotmail.com> wrote:
>>
>> I would like to know more about:
>>
>> external diff and merge tools for Subversion that can process ODF documents
>>
>> Do you have other teams working on this problem? What is the expected timeframe, are there any helpful hints on completion, and is there a tool that integrates into Subversion or into OpenOffice?
>>
> 
> No one is currently working on it.  The general problem is that tools
> like SVN, which work admirably with text files, have limitations with
> what they consider to be opaque binary files:
> 
> 1) Cannot do an effective diff, meaning the commit notifications are
> not as useful to reviewers of the commits.
> 
> 2) No effective way of doing branching and merging
> 
> One possible solution is to note that SVN has the capability to invoke
> external diff and merge (diff3) tools [1].  In theory something like
> this could be written for ODF documents.

For many use cases a simple tool that just reads the text contents in
ODF documents and compares them would be enough already, IMHO. Perhaps
we can create one as a start.

Of course such comparisons will be slow, not only because of the text
extraction preprocessor, but also because of the more complicated text
comparison algorithms. ODF documents usually contain continuous text, so
the usual "text lines comparison" doesn't help (as a "line" would be a
paragraph).
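
A rough sketch of such a text-only comparison, using nothing but the
Python standard library. It assumes paragraphs and headings are the
text:p and text:h elements of content.xml, treats each one as a "line",
and then refines changed paragraphs to word level with difflib. The
function names are made up for illustration. Nested paragraphs (e.g. in
frames) would be reported twice, and text:s, text:tab and
text:line-break are ignored.

```python
import difflib
import zipfile
import xml.etree.ElementTree as ET

# ODF text namespace; text:p is a paragraph, text:h a heading.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}


def paragraphs(odf_path):
    """Return the plain text of each paragraph and heading in content.xml."""
    with zipfile.ZipFile(odf_path) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(node.itertext()) for node in root.iter()
            if node.tag in PARAGRAPH_TAGS]


def word_changes(old_paragraph, new_paragraph):
    """Yield (operation, old_words, new_words) for each changed word run."""
    old_words, new_words = old_paragraph.split(), new_paragraph.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            yield op, old_words[i1:i2], new_words[j1:j2]


def text_diff(old_path, new_path):
    """Compare two documents paragraph by paragraph, then word by word.

    Returns a list of (old_paragraph_index, changes) tuples.
    """
    old, new = paragraphs(old_path), paragraphs(new_path)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    report = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue
        if op == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of paragraphs on both sides: refine to words.
            for k in range(i2 - i1):
                changes = list(word_changes(old[i1 + k], new[j1 + k]))
                report.append((i1 + k, changes))
        else:
            # Inserted, deleted or unevenly replaced paragraphs are
            # reported whole.
            report.append((i1, [(op, old[i1:i2], new[j1:j2])]))
    return report
```

The two-level matching is what makes it slower than a plain line diff:
every changed paragraph gets its own word-level comparison.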

Regards,
Mathias



Re: Open Office Help Wanted

Posted by Rob Weir <ap...@robweir.com>.
On Wed, Jul 13, 2011 at 4:33 PM, Harold Fennell
<ha...@hotmail.com> wrote:
>
> I would like to know more about:
>
> external diff and merge tools for Subversion that can process ODF documents
>
> Do you have other teams working on this problem? What is the expected timeframe, are there any helpful hints on completion, and is there a tool that integrates into Subversion or into OpenOffice?
>

No one is currently working on it.  The general problem is that tools
like SVN, which work admirably with text files, have limitations with
what they consider to be opaque binary files:

1) Cannot do an effective diff, meaning the commit notifications are
not as useful to reviewers of the commits.

2) No effective way of doing branching and merging

One possible solution is to note that SVN has the capability to invoke
external diff and merge (diff3) tools [1].  In theory something like
this could be written for ODF documents.
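
To illustrate the hook itself, here is a minimal wrapper sketch in
Python. It assumes, as the wrapper examples in [1] do, that SVN passes
GNU-diff style arguments with the two file paths last, and that an exit
code of 0 means "no differences" and 1 means "differences found". The
function names and the compare callable are hypothetical; the actual
ODF comparison would be plugged in there.

```python
def split_svn_diff_args(argv):
    """Return (left_path, right_path) from a GNU-diff style argument list.

    Assumption: the two file paths are the last two arguments; options
    and -L labels that precede them are ignored.
    """
    if len(argv) < 2:
        raise ValueError("expected at least two file arguments")
    return argv[-2], argv[-1]


def main(argv, compare):
    """Run compare(left, right) and map the result to a diff-style exit code.

    compare is a hypothetical callable returning a list of report lines;
    an empty list means the two files are considered equal.
    """
    left, right = split_svn_diff_args(argv)
    lines = compare(left, right)
    for line in lines:
        print(line)
    return 1 if lines else 0
```

A real script would pass sys.argv[1:] to main(), hand the result to
sys.exit(), and be named via svn's --diff-cmd option or the diff-cmd
setting in the runtime config.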

At one end of the scale, one could diff the embedded XML files
directly, e.g., diff content.xml, styles.xml, etc. and present
that as a normal text diff.  But some editors, for performance
reasons, write the XML files out all on one line.  A diff app would
probably want to do a canonical "pretty print" of the XML before
diff'ing in order to give something presentable to the user.
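
A small sketch of that pretty-print-then-diff approach, using only the
Python standard library (minidom for the pretty print, difflib for the
line diff). The function names are invented for this example, and
minidom's toprettyxml is just one possible normalisation, not a true
canonical form. The indented output is meant for reading only, since
pretty printing adds whitespace to mixed content.

```python
import difflib
import zipfile
import xml.dom.minidom


def pretty_member(odf_path, member="content.xml"):
    """Return one XML member of an ODF package as pretty-printed lines."""
    with zipfile.ZipFile(odf_path) as package:
        raw = package.read(member)
    pretty = xml.dom.minidom.parseString(raw).toprettyxml(indent="  ")
    # Drop whitespace-only lines so they do not clutter the diff.
    return [line for line in pretty.splitlines() if line.strip()]


def xml_diff(old_path, new_path, member="content.xml"):
    """Unified diff of the same XML member in two ODF packages."""
    return list(difflib.unified_diff(
        pretty_member(old_path, member),
        pretty_member(new_path, member),
        fromfile=f"{old_path}:{member}",
        tofile=f"{new_path}:{member}",
        lineterm="",
    ))
```

Calling xml_diff("doc1.odt", "doc2.odt") returns the diff lines for
content.xml; passing member="styles.xml" compares the styles instead.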

At the other end, you could imagine a WYSIWYG diff, akin to what OOo
shows when change tracking is invoked.  You could imagine, for
example, a diff3 mode for OpenOffice itself, where three files are
passed on the command line and a change-tracked version of the doc is
created.  But that does not work so well for our commit
notifications, where we would typically want plain text diffs.


-Rob

[1] http://svnbook.red-bean.com/en/1.5/svn.advanced.externaldifftools.html


