Posted to issues@openoffice.apache.org by bu...@apache.org on 2016/05/29 22:40:41 UTC

[Issue 126990] New: File saved normally then opened and filled with #

https://bz.apache.org/ooo/show_bug.cgi?id=126990

          Issue ID: 126990
        Issue Type: DEFECT
           Summary: File saved normally then opened and filled with #
           Product: Writer
           Version: 4.1.2
          Hardware: PC
                OS: Windows 8, 8.1
            Status: UNCONFIRMED
          Severity: Critical
          Priority: P5 (lowest)
         Component: help
          Assignee: issues@openoffice.apache.org
          Reporter: tinaconroy0718@gmail.com

I saved a file not 2 hours ago and when I opened it again the format was wrong
and all my text was #. The whole document just #########. This has happened
before. I do not want to rewrite it all again. Is there a way to recover the
text I had before?

-- 
You are receiving this mail because:
You are the assignee for the issue.

[Issue 126990] File saved normally then opened and filled with #

Posted by bu...@apache.org.
https://bz.apache.org/ooo/show_bug.cgi?id=126990

John <jo...@yahoo.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |john.ha24@yahoo.co.uk

--- Comment #2 from John <jo...@yahoo.co.uk> ---
Created attachment 85563
  --> https://bz.apache.org/ooo/attachment.cgi?id=85563&action=edit
An example .odt file which opens as "full of ######"

This file is taken from
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=420#p372812.

Notes:

1  It is a .odt file, but it is not a zip file, and it has no internal
structure (no content.xml, manifest.rdf etc).

2  When the file is opened with a hex editor, it is 27,605 bytes long, and
every byte is zero.

3  When the file is opened by Writer, Writer assumes it must be a flat, ASCII
TEXT file.  Writer brings up the ASCII Filter Options pop-up.  The document
then appears with 9,999 x "#" as word 1, a paragraph return; 9,999 x "#" as
word 2, a paragraph return, and the remaining "#" as word 3. Presumably Writer
has a 9,999 character limit on a word and adds the paragraph return.

4  The fault seems to have the characteristics of Writer reserving some space,
naming that space postcol literature II.odt, setting the space to all zeros ...
and then failing to write the correct data to the file.  The file content is
therefore all zeros.  Does this occur because Writer was somehow prevented from
completing the write?  Could shutting a laptop lid too quickly cause this?

5  There are numerous issues relating to saving files across networks, where
the slow speed of the network highlights problems.  See Issue 107558 - "A hidden
step while writing OOo files?", which reports that AOO continues saving
AFTER the bar stops moving across the bottom of the screen.  Could it be that
users think that the save is completed when the bar stops moving, and slam the
laptop lid shut, whereas the save has not completed?

See also Issue 104661 - Saving to file should take place in a process
independent of the GUI 

Some form of atomic save is needed where the save can be guaranteed.
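
As an illustration of what "atomic save" usually means, here is a minimal
sketch: write the new content to a temporary sibling file and only then rename
it over the target, so that the target is either the old version or the
complete new one.  The function name atomic_save and the ".tmp" suffix are
assumptions made for this example; this is not AOO's save code.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Write `data` to a temporary sibling file first, then rename it over the
// target. Returns false (leaving any existing target untouched) on failure.
// NOTE: hypothetical helper, not AOO code. A production version would also
// force the data to disk (fsync on POSIX, FlushFileBuffers on Windows)
// before the rename; std::ofstream::flush alone does not guarantee that.
bool atomic_save(const fs::path& target, const std::string& data)
{
    fs::path tmp = target;
    tmp += ".tmp";  // assumed naming scheme for the temporary file

    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        out.flush();
        if (!out) {
            out.close();
            std::error_code ignore;
            fs::remove(tmp, ignore);
            return false;
        }
    }  // stream is closed here, before the rename

    std::error_code ec;
    fs::rename(tmp, target, ec);  // replaces an existing regular file
    if (ec) {
        std::error_code ignore;
        fs::remove(tmp, ignore);
        return false;
    }
    return true;
}
```

With this pattern an interrupted save would be expected to leave a stray
temporary file behind rather than a damaged fred.odt, provided the data really
has reached the disk before the rename.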

[Issue 126990] File saved normally then opened and filled with #

--- Comment #4 from John <jo...@yahoo.co.uk> ---
Created attachment 85564
  --> https://bz.apache.org/ooo/attachment.cgi?id=85564&action=edit
Some examples of damaged files - all zeros, garbage and a mixture

See the forum post "22 page term paper replaced with pound signs" at
https://forum.openoffice.org/en/forum/viewtopic.php?f=6&t=17677#p81363.  You
will see well over 200 reports of "My document is all #####", many with
uploaded damaged files.  I have identified a few below:

These reports have each uploaded files which are full of zeros:

1.  Retrieving document at
https://forum.openoffice.org/en/forum/viewtopic.php?f=6&t=17690
2.  File now contains nothing but # characters at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=18573
3.  Character Set issue opening an .ODT document at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=23463
4.  Problem with file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=25534
5.  .odt corrupted at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=25838
There are many more ...

These reports have each uploaded a damaged file:

1.  Re: ASCII Filter? Help me save my doc!!! at
https://forum.openoffice.org/en/forum/viewtopic.php?f=6&t=25503#p265546  The
file appears to be a valid .odt when you unzip it, but the file is damaged.

2. Re: 22 pages term paper replaced with pound signs at
https://forum.openoffice.org/en/forum/viewtopic.php?f=6&t=17677&start=30#p275921
points to a file at http://www.mediafire.com/download/x75pb ... s_copy.ods
which is partially full of zeros and partially full of garbage.
There are many more ...

See the forum post [Hint] How did I fix my ODT file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532 (viewed 140,000
times) which has many, many corrupted files which have been uploaded.  Many
have been analysed by forum posters.  

A small collection includes:

1 Re: [Hint] How did I fix my ODT file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=390#p372337
uploaded a file clinical opthalmolog1.odt which appears to be a valid odt file,
but is full of zeros after FF0 (4,080) bytes.  Is the 2^n significant?

2 Re: [Hint] How did I fix my ODT file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=420#p372836
has uploaded $R67BQ9D.odt which starts with readable text which looks like a
Firefox crash report, then is full of zeros, then has binary data and then ends
with zeros.

3 Re: [Hint] How did I fix my ODT file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=330#p354771
Both dz.odt and mc.t.odt start off looking like valid PK zip files, but then
just end - the files are corrupted.

4 Re: [Hint] How did I fix my ODT file at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=330#p357120
uploaded the water door.odt.  The file is full of garbage - it looks like a
dump of memory.

I have uploaded a ZIP file with examples of these damaged files.
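
The damage patterns listed above (all zeros, a ZIP start followed by zeros)
can be told apart with a few byte-level checks.  The sketch below works on the
raw bytes of a file already read into memory; the names Report and inspect are
made up for this example and it is not a recovery tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical summary of the checks described in this comment.
struct Report {
    bool zip_signature;      // starts with the ZIP local-header bytes "PK\x03\x04"
    bool all_zeros;          // non-empty and every byte is 0x00
    std::size_t zeros_from;  // offset where the trailing run of 0x00 begins
};

Report inspect(const std::string& bytes)
{
    // Walk back from the end to find where the trailing zero run starts.
    std::size_t pos = bytes.size();
    while (pos > 0 && bytes[pos - 1] == '\0') --pos;

    Report r;
    r.zeros_from = pos;
    r.all_zeros = !bytes.empty() && pos == 0;
    // An .odt is a ZIP container, so a healthy one begins with "PK\x03\x04".
    r.zip_signature = bytes.compare(0, 4, std::string("PK\x03\x04", 4)) == 0;
    return r;
}
```

Note that a healthy ZIP normally ends with a couple of zero bytes (the empty
comment-length field), so only a long trailing run of zeros, such as one
starting at offset 0xFF0, is suspicious.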

There are also many reports, with uploaded files, where the file is perfect ...
but the XML tags in content.xml are incorrect.  acknak is often able to correct
the XML errors manually and thus recover the file.  For example, see
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=360#p357769
for corrupted tags, which refers to Issue 126219: invalid xml on saving
document with comment/annotation

[Issue 126990] File saved normally then opened and filled with #

--- Comment #5 from Tania Valladares <tl...@gmail.com> ---
I am trying to recover a document in which the text has been replaced with #.
It is extremely important. Can you try to look at the document, as I understand
there's nothing I can do at this point? Can anyone try to retrieve it if I
bring it in somewhere? Thank you. You should really put a warning on the
Apache OpenOffice website. I've lost a lot of material that is very valuable
to me.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #14 from John <jo...@yahoo.co.uk> ---
Also, for someone with programming skills, a simple test would probably be

1.  Add an infinite loop a few lines after line 52 so that AOO loops during the
file write

2.  Wait until AOO is looping during the file write

3.  Issue a shutdown by closing the laptop lid.

Expected behaviour.  AOO will prevent the shutdown because the file has not
been written

Probable behaviour.  AOO will not attempt to prevent the shutdown

[Issue 126990] File saved normally then opened and filled with #

--- Comment #6 from John <jo...@yahoo.co.uk> ---
I am sorry but there is nothing which can be done because the file is full of
zeros - there is (literally) nothing in it.  For some reason the file was not
saved. Search the forum with #### and you will find a number of posts.

See "[Tutorial] How to find and un-delete Writer temporary files" at
https://forum.openoffice.org/en/forum/viewtopic.php?f=71&t=85038 for
instructions on how to identify and un-delete the temporary files Writer wrote
while you were editing the file, and then deleted.  You should be able to
recover all or most of the file.

[Issue 126990] File saved normally then opened and filled with #

Tania Valladares <tl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tl.valladares53@gmail.com

[Issue 126990] File saved normally then opened and filled with #

--- Comment #13 from John <jo...@yahoo.co.uk> ---
Also, for someone with programming skills, a simple test would probably be

1.  Add an infinite loop a few lines after line 52 so that AOO loops during the
file write

2.  Issue a shutdown by closing the laptop lid.

Expected behaviour.  AOO will prevent the shutdown because the file has not
been written

Probable behaviour.  AOO will not attempt to prevent the shutdown

[Issue 126990] File saved normally then opened and filled with #

--- Comment #12 from John <jo...@yahoo.co.uk> ---
Unfortunately I don't have sufficient programming skills :-(

I recently assisted a user with a corrupted .ods file which I think resulted
from the same "shut down before writing is completed" cause.  

In his case (see
https://forum.openoffice.org/en/forum/viewtopic.php?f=9&t=103810#p502547) he
had lots of graphs and they were all missing.  Unzipping the .ods showed that
the Object 1 through Object 189 folders were present, as were content.xml,
manifest.rdf, meta.xml, mimetype and styles.xml.  However, folders
Configurations-2, META-INF, ObjectReplacements and Thumbnails were missing.

It would support my theory if the missing folders are normally written after
the data that was present.

The file is 2MB so I cannot upload it here but it is still available in the
forum thread.

[Issue 126990] File saved normally then opened and filled with #

orcmid <or...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|help                        |save-export
             Status|UNCONFIRMED                 |CONFIRMED
                 CC|                            |orcmid@apache.org
     Ever confirmed|0                           |1

--- Comment #1 from orcmid <or...@apache.org> ---
(In reply to tinaconroy0718 from comment #0)
> I saved a file not 2 hours ago and when I opened it again the format was
> wrong and all my text was #. The whole document just #########. This has
> happened before. I do not want to rewrite it all again. Is there a way to
> recover the text I had before?

Generally, no.

When this happens, however it happens, that is really the content of the file
and that is all there is.

The best precaution is to not save over the previous copy but save with a new
name (put a date in the name or use a sequence number).  Then you at least can
fall back to the one you made the failed one from.  This precaution works for a
number of other problems as well.
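
For anyone who wants to script the "save with a new name" habit, here is a
small sketch that picks the next free sequence-numbered name next to an
existing document.  The "-001" style suffix and the function name
next_versioned_name are assumptions for this example only.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Build "stem-NNN.ext" next to `original`, choosing the first sequence
// number that is not already on disk. Hypothetical naming scheme.
fs::path next_versioned_name(const fs::path& original)
{
    const std::string stem = original.stem().string();
    const std::string ext = original.extension().string();
    for (int n = 1; n < 1000; ++n) {
        std::string num = std::to_string(n);
        num.insert(0, 3 - num.size(), '0');  // zero-pad to three digits
        fs::path candidate = original.parent_path() / (stem + "-" + num + ext);
        if (!fs::exists(candidate)) return candidate;
    }
    return {};  // empty path: no free name below 1000
}
```

Saving a copy under the returned name leaves every earlier copy untouched, so
a failed save costs at most the work done since the previous copy.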

If you want, you can upload the file as an attachment here, and we can inspect
it to confirm whether there is recoverable content.

This is the first of the cases identified in Issue 126846.

I am extracting the essential information here so we have an identified issue
for this individual case.  I failed to find an existing separate issue about
it.

"Hagar Delest has carefully listed the posts where users have lost data at 22
pages term paper replaced with pound signs, where he has collected over two
hundred (224 to date) cases."  That is at
https://forum.openoffice.org/en/forum/viewtopic.php?f=6&t=17677

The forum topic includes some cases besides the "#" case.  This issue is for
tracking the "#" issue only.

[Issue 126990] File saved normally then opened and filled with #

orcmid <or...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|orcmid@apache.org           |

[Issue 126990] File saved normally then opened and filled with #

oooforum (fr) <oo...@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |oooforum@free.fr

--- Comment #7 from oooforum (fr) <oo...@free.fr> ---
Must this issue be kept open?
We know that #### content means a lost document.
The problem is reproducing the process that corrupts a file.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #8 from John <jo...@yahoo.co.uk> ---
Yes.  There are hundreds and hundreds of cases on the forum.

I strongly suspect that the problem arises when AOO mishandles a hibernate or
sleep interrupt when it is in the process of saving a file.

I further suspect that the time at which the interrupt arrives is critical - at
some times it is handled, at others it is not.

I did some testing for Patricia about two years ago which seemed to support
this suggestion.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #3 from orcmid <or...@apache.org> ---
(In reply to John from comment #2)
> Created attachment 85563 [details]
> An example .odt file which opens as "full of ######"
> 
> This file is taken from
> https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=420#p372812.
> 
> Notes:
> 
> 1  It is a .odt file, but it is not a zip file, and it has no internal
> structure (no content.xml, manifest.rdf etc).
> 
> 2  When the file is opened with a Hex editor, it is 27,605 Bytes, and each
> byte is zero.
> 
> 3  When the file is opened by Writer, Writer assumes it must be a flat,
> ASCII TEXT file.  Writer brings up the ASCII Filter Options pop-up.  The
> document then appears with 9,999 x "#" as word 1, a paragraph return; 9,999
> x "#" as word 2, a paragraph return, and the remaining "#" as word 3.
> Presumably Writer has a 9,999 character limit on a word and adds the
> paragraph return.
[ ... ]

I confirm the behavior with the example file.  The particular file triggers the
plaintext filter.  If the file is opened, it will be presented as paragraphs
having runs of "#" characters.  (I assume, in this case, the hex 00 bytes are
interpreted as unknown or inadmissible characters and "#" is used to indicate
them.)

I confirm that the file consists of 27,605 null (hex 00) bytes.
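
The arithmetic matches the observation in comment #2: 27,605 bytes split into
runs of at most 9,999 gives 9,999 + 9,999 + 7,607.  The sketch below models
that display, assuming each 0x00 byte is shown as "#" and a break is inserted
after every 9,999 characters.  The 9,999 limit is taken from the observed
output, not from Writer's source, and render_nul_bytes is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Replace each 0x00 byte with '#' and break the result into runs of at
// most `max_run` characters. The 9,999 default mirrors the run length
// observed in the report; it is not taken from Writer's source.
std::vector<std::string> render_nul_bytes(const std::string& bytes,
                                          std::size_t max_run = 9999)
{
    std::vector<std::string> paragraphs;
    std::string current;
    for (char c : bytes) {
        current.push_back(c == '\0' ? '#' : c);
        if (current.size() == max_run) {
            paragraphs.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) paragraphs.push_back(current);
    return paragraphs;
}
```

Feeding it 27,605 zero bytes yields three runs of "#", the last one 7,607
characters long.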

What we need to know from Tina, who has had this experience more than once, is 

 1. When a previously-saved file was opened for further work, and it showed as
all "####", did the plaintext filter show up first?  Did she click OK and then
see the all "#" document?

 2. What can Tina report about the conditions under which the document was
saved and later failed to open correctly?

 3. Can Tina upload an attachment of the file that opened that way for her,
exactly as it was when she tried to open it (not after accepting the plaintext
filter)?

And, either way, Tina's document will not be recoverable.  But the evidence she
can provide may help us to eliminate or mitigate what the cause might be.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #11 from oooforum (fr) <oo...@free.fr> ---
(In reply to John from comment #9)
> where I suggest the NULLs are written in
> line 52 of Deflater.  
If you have programming skill, you can submit a PR on Github with this fix:
https://github.com/apache/openoffice

[Issue 126990] File saved normally then opened and filled with #

Peter <pe...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Normal                      |Critical
                 CC|                            |petko@apache.org

--- Comment #10 from Peter <pe...@apache.org> ---
I set the Importance to critical, since we have to look into this.
This issue has been the "Most Valued Bug" for some time now. And we should not
forget to solve it.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #9 from John <jo...@yahoo.co.uk> ---
As

a) this bug is confirmed, 
b) there are hundreds of reports of it in the forum,
c) it causes complete data loss and nothing can be recovered from the file (the
# are displayed because the file is full of NULL characters)

I think it should be classified as CRITICAL.

See
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=1532&start=660#p479545
where I suggest the NULLs are written in line 52 of Deflater.  If the PC issues
a SHUTDOWN after line 52 and AOO does not prevent the shutdown, then the file
would presumably be full of NULLs.

44    Deflater::~Deflater(void)
45    {
46            end();
47    }
48    void Deflater::init (sal_Int32 nLevelArg, sal_Int32 nStrategyArg, sal_Bool bNowrap)
49    {
50            pStream = new z_stream;
51            /* Memset it to 0...sets zalloc/zfree/opaque to NULL */
52            memset (pStream, 0, sizeof(*pStream));
53
54            switch (deflateInit2(pStream, nLevelArg, Z_DEFLATED, bNowrap? -MAX_WBITS : MAX_WBITS,
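
As a reading aid, the sketch below reproduces the pattern on lines 50-52 in
isolation, using a made-up stand-in (FakeStream) for zlib's z_stream.  Taken
on its own, the memset clears the newly allocated structure in memory, as the
comment on line 51 says.  Whether that has any bearing on the bytes that end
up on disk is not something this sketch can establish.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for zlib's z_stream; only the three fields named
// in the source comment are modelled here.
struct FakeStream {
    void* zalloc;
    void* zfree;
    void* opaque;
};

// Mirrors the pattern on lines 50-52: allocate, then zero the structure.
FakeStream* make_zeroed_stream()
{
    FakeStream* s = new FakeStream;
    // Clears sizeof(FakeStream) bytes of heap memory; no file is involved.
    std::memset(s, 0, sizeof(*s));
    return s;
}
```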

See Why is my file full of #####? at
https://forum.openoffice.org/en/forum/viewtopic.php?f=71&t=85038#p493247 for a
discussion.

[Issue 126990] File saved normally then opened and filled with #

--- Comment #16 from John <jo...@yahoo.co.uk> ---
See "Text in document transformed to #####" at
https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=104676#p507553 where
a user describes exactly what happened to cause a .odt file to become full of
#####.

> The piece of writing that I have lost is an OpenDocument Text [ie a .odt file].
>
> I opened it and had been working on it for few hours, saving it 
> every 10 minutes or so, when my computer froze and showed a grey screen.
> As this hadn't shifted despite my best efforts I had to do a forced shut
> down after about half an hour.
> 
> When I restarted the computer it was all fine apart from the document
> I had open on the screen where the text had been replaced by ######
> 
> [ie - when AOO opened ...\fred.odt, the file displayed as #####
> which means ...\fred.odt was a flat file (not a ZIP container) full
> of null characters.  Inspection of ...\fred.odt uploaded to the forum
> shows fred.odt is full of null characters.]

As I understand it, when AOO edits fred.odt:

1.  AOO copies ...\fred.odt to a temporary file in ...\Temp.  

2.  AOO marks ...\fred.odt as "in use".  If I send ...\fred.odt to 7-ZIP I get
a 7-ZIP error message "The process cannot access the file because it is being
used by another process".  However, I can copy the file and I can send the
file to Notepad++ where it opens.

3. All user changes are held in memory until the file is saved.  ...\fred.odt
is thus never touched until a Save is done.

4.  When a Save is done, ...\fred.odt is saved as a proper .odt file.

As the user saved the document I would expect ...\fred.odt to be a proper .odt
file containing the document exactly as it was when the document being edited
was last saved. 

So why is ...\fred.odt a flat file full of nulls when the PC is restarted? 

Could it be that AOO was writing a Save when the PC froze?  Indeed, AOO
probably caused the freeze.  In this case, I would expect ...\fred.odt to be as
it was when the PC froze and this is why it is full of nulls.

So, is there a stage during the file write process when ...\fred.odt is set to
be full of nulls?  Or some Windows process that kicks in as a freeze happens
which fills the file full of nulls?
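
One general way a file can end up with the right length but only zero bytes is
for its size to be set before its data is written.  The sketch below shows
just that effect with std::filesystem::resize_file, which is specified to make
the newly added area read back as zeros.  It is an assumption-laden
illustration (reserve_and_read_back is a made-up name) and does not show that
AOO or Windows actually does this during a save.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Create an empty file, grow it to `size` bytes without writing any data,
// and read it back. Hypothetical demonstration, unrelated to AOO's code.
std::string reserve_and_read_back(const fs::path& p, std::uintmax_t size)
{
    { std::ofstream create(p, std::ios::binary | std::ios::trunc); }
    fs::resize_file(p, size);  // the new area reads back as zero bytes

    std::ifstream in(p, std::ios::binary);
    std::string bytes(static_cast<std::size_t>(size), 'x');
    in.read(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    bytes.resize(static_cast<std::size_t>(in.gcount()));
    return bytes;
}
```

A file left in this state would look exactly like the uploaded examples: the
expected size, no ZIP structure, every byte zero.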

[Issue 126990] File saved normally then opened and filled with #

--- Comment #15 from John <jo...@yahoo.co.uk> ---
See Comment 46 in Issue 126869 - Analysis Task: Lost/Corrupted Documents after
Save/Shutdown where I shut down the PC (Start > Power > Shutdown) a few seconds
after the green bar had finished crossing the screen.  The file was still being
written (I used a slow diskette drive) and AOO did not prevent the shutdown as
expected.

See Comment 48 in Issue 126869 where I issued the shutdown as soon as possible
after the green bar had finished crossing the screen (ie a few seconds
earlier).  AOO now prevented the shutdown taking place and the shutdown screen
offered a pop-up with "fred.odt is open in AOO - do you want to cancel?" and I
was able to prevent shutdown. When I cancelled shutdown AOO was displaying a
"Do you want to save your changes?" pop-up.    

Conclusion:  AOO mishandles (or ignores?) a Windows interrupt saying the PC is
being shut down.
