Posted to issues@openoffice.apache.org by bu...@apache.org on 2015/01/25 00:35:06 UTC

[Issue 125726] Semantical text treatment for comparison (when searching for changes between texts)

https://issues.apache.org/ooo/show_bug.cgi?id=125726

Ramona <ra...@altom.ro> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramona.tripa@altom.ro

--- Comment #1 from Ramona <ra...@altom.ro> ---
I have also encountered this issue with both the recent OpenOffice 4.1.1 and
the older 4.1.0 version, on Windows XP as well as on Mac OS X Yosemite (so the
issue is not configuration-dependent).

The issue is: the result of the “Compare Document” feature in its current
implementation is quite confusing when the edited version has similar content
to the original version but a different layout or flow of text, as exemplified
by Anton, the original reporter.

Steps to reproduce
1. Take a document and an edited version of that document. Make sure the two
documents have the same content but a different flow of text, with the edited
version including changes such as:
- line breaks
- paragraphs split into multiple subsequent paragraphs or sentences
- bullets inserted
- additional commas inserted
- extra space between two words in a sentence.

2. Open the edited document and then go to “Edit” -> “Compare Document...”.

3. Using the file selection dialog which appears, select the original document
and confirm the dialog.

Results
OpenOffice combines both documents into the reviewer's document.
The fragments of text affected by the new formatting are marked as
insertions / deletions.

Expected
The OpenOffice “Help” on comparing documents specifies that: “All text passages
that occur in the reviewer's document but not in the original are identified as
having been inserted, and all text passages that got deleted by the reviewer
are identified as deletions”.

As the description suggests, the user expects new content to be marked as an
insertion and content that no longer exists to be marked as a deletion.

The "Compare Document" feature, however, does not distinguish between semantic
changes and formatting changes of the type mentioned above. Both are treated
alike.

As a result, passages or sentence fragments marked as (new) insertions turn out
to be old passages with new formatting, or fragments now preceded by a
(previously missing) comma. Similarly, passages marked as deletions are still
present in the edited version, only in a slightly different form.

Even a single extra space inserted between two words in the edited version of
the document causes the subsequent content (up to the next punctuation mark) to
be marked as an insertion / deletion.
Please refer to the attached screen captures.

Treating formatting changes like semantic changes increases confusion and
diminishes usability.
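
To make the expected behaviour concrete, here is a minimal Python sketch of a
comparison that ignores the formatting changes listed in the steps above. It is
only an illustration and says nothing about how OpenOffice implements "Compare
Document"; the names `words` and `semantic_changes` are hypothetical, and the
choice to drop all punctuation (not only commas) is an assumption made to keep
the example short.

```python
import difflib
import re

# Assumption: a "word" is a run of letters, digits or underscores. Whitespace,
# line breaks, bullet characters and punctuation are treated as formatting
# and discarded before the comparison.
_WORD = re.compile(r"\w+")


def words(text):
    """Return the word tokens of `text`, with all formatting stripped."""
    return _WORD.findall(text)


def semantic_changes(original, edited):
    """List (kind, text) pairs for words deleted from or inserted into `edited`."""
    a, b = words(original), words(edited)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes.append(("deleted", " ".join(a[i1:i2])))
        if tag in ("insert", "replace"):
            changes.append(("inserted", " ".join(b[j1:j2])))
    return changes


original = "The quick brown fox jumps over the lazy dog. It runs away."
edited = "The quick  brown fox,\n- jumps over the lazy dog.\n\nIt runs far away."

# The extra space, the comma, the bullet and the paragraph split are ignored;
# only the added word is reported.
print(semantic_changes(original, edited))
```

With this kind of comparison, only the word "far" would be reported as an
insertion. A real implementation would still need to decide which punctuation
changes count as semantic, since a comma can change the meaning of a sentence.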

I have found two older reports in the database that point to essentially the
same issue. They include unhappy comments from people for whom this feature is
essential to their (proofreading) work, and suggestions arising from
comparisons with other similar products:
https://issues.apache.org/ooo/show_bug.cgi?id=49217
https://issues.apache.org/ooo/show_bug.cgi?id=54195
