Posted to j-dev@xerces.apache.org by bu...@apache.org on 2003/02/09 20:05:12 UTC

DO NOT REPLY [Bug 16918] New: - random erroneous bad callbacks from DocumentTracer sample

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16918>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=16918

           Summary: random erroneous bad callbacks from DocumentTracer
                    sample
           Product: Xerces2-J
           Version: 2.3.0
          Platform: PC
               URL: http://xml.apache.org/xerces2-j/samples-sax.html#DocumentTracer
        OS/Version: Windows XP
            Status: NEW
          Severity: Major
          Priority: Other
         Component: SAX
        AssignedTo: xerces-j-dev@xml.apache.org
        ReportedBy: vorticity@attbi.com


I have a large (18 MB) XML file that I am trying to process with SAX.  I have 
adapted the DocumentTracer sample, only slightly, to (1) create a string named 
"currentText" from the text passed to each characters() callback; (2) send 
currentText to fOut when endElement() is called and localName is "uwi"; and 
(3) do nothing on all other callbacks.  The entire file contains about 21,000 
uwi elements.  The content of each uwi element is a string of exactly 14 
characters.
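
The handler code is not included in this report, so the following is only a 
sketch of the adaptation described above, written in present-day Java.  The 
class name UwiCollector, the uwis list, and the extract() helper are 
hypothetical.  The SAX ContentHandler documentation allows a parser to deliver 
contiguous character data in more than one characters() call, so the sketch 
appends each chunk to a buffer instead of replacing currentText.  Whether that 
relates to the truncation reported here is an assumption; the report does not 
confirm it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text content of every <uwi> element.
class UwiCollector extends DefaultHandler {
    // Buffer for the text of the element currently being read.
    private final StringBuilder currentText = new StringBuilder();
    // Collected uwi values, in document order.
    final List<String> uwis = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start from an empty buffer at every opening tag.
        currentText.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append, because one text node may arrive as several chunks.
        currentText.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("uwi".equals(localName)) {
            uwis.add(currentText.toString());
        }
    }

    // Convenience helper (an assumption for illustration): parse an XML string.
    static List<String> extract(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so localName is populated
        UwiCollector handler = new UwiCollector();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.uwis;
    }
}
```

In the adapted sample, the values would go to fOut rather than into a list; 
the list is used here only to keep the sketch self-contained.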

Of the roughly 21,000 uwi elements, currentText is wrong for 117; all the 
rest are correct.  The wrong values are truncated, with characters lost at 
the beginning, and the truncated values vary in length.  The result appears 
to be repeatable: if I run the tracer a second time, the errors occur in the 
same places.

Here is a sequence of 9 lines from the output showing an error (the fifth 
line).  The uwi values are unique, and I went back to the input file to verify 
that the truncated ones are, in fact, 14 characters long there.
00021604810W50
00023304810W50
00101604810W50
00022804810W50
W50
00060404810W50
00140404810W50
00061604810W50
00141604810W50
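
A length check over the output, like the one implied above, could be sketched 
as follows.  This is a minimal illustration and not part of the DocumentTracer 
sample; the class name UwiLengthCheck and the method badLineNumbers() are 
hypothetical.  It reports the 1-based numbers of the output lines that are not 
exactly 14 characters long.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds output lines that are not 14 characters long.
class UwiLengthCheck {
    static final int EXPECTED_LENGTH = 14;

    // Returns the 1-based line numbers whose length differs from EXPECTED_LENGTH.
    static List<Integer> badLineNumbers(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).length() != EXPECTED_LENGTH) {
                bad.add(i + 1);
            }
        }
        return bad;
    }
}
```

The lines could be read with java.nio.file.Files.readAllLines() from wherever 
the fOut output was saved.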

Here is a complete list of the truncated uwi values (all correct on input, 
wrong on output):
50
50
50
50
50
50
50
50
53
703W50
706W50
706W50
708W52
802W50
808W50
810W50
812W50
908W50
3804W50
4207W50
4715W50
4814W50
204703W50
304109W50
304909W50
404306W50
904910W50
1704205W52
2104001W50
2204708W50
2304607W50
2404808W50
2504706W52
3004806W50
3105110W50
3304812W50
22004909W50
43604008W52
71305110W50
82503809W52
110304812W50
111304810W52
140205013W50
141404803W50
142804804W50
160404202W50
161904709W50
W50
W50
W50
W50
W50
W50
W50
W50
W50
W50
W50
W52
W52

Thanks for the help.  Could you please send the response by email?

Chris Evans

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org