Posted to derby-user@db.apache.org by Regunath Balasubramanian <re...@mindtree.com> on 2006/07/11 11:50:54 UTC
Error reading CLOB
Hi,
I chose to use Derby as an embedded DB to store text parsed/stripped from web
pages, MS Office files and PDF documents while implementing an indexing and
search solution. I need the parsed text of the document to enable search term
highlighting to produce an effective summary of search hits.
The natural choice was to use the CLOB data type. I store the contents using
PreparedStatement.setCharacterStream(column, reader) where reader is a
java.io.StringReader constructed from the java.lang.String instance
representing the entire parsed contents. I then read the contents out using
ResultSet.getClob(column).getCharacterStream().
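For reference, here is a minimal sketch of the read side: draining a java.io.Reader, such as the one Clob.getCharacterStream() returns, into a String. The class name ClobTextSketch and the helper readAll are hypothetical, and a StringReader stands in for the Derby stream so the sketch runs without a database.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class; not part of Derby or JDBC.
final class ClobTextSketch {

    // Reads every character from the reader into a String and closes it.
    // IOException is rethrown unchecked to keep the call site short.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the parsed document text (assumption: any String works here).
        String parsed = "Parsed text: caf\u00e9, \u65e5\u672c\u8a9e";
        String roundTripped = readAll(new StringReader(parsed));
        System.out.println(roundTripped.equals(parsed)); // prints true
    }
}
```

With a real ResultSet the call would look something like readAll(rs.getClob(column).getCharacterStream()).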
Writes always succeed, but reads fail for a few documents. This surprises me
because I both write and read through the Derby classes and therefore
naturally expect them to work together. The error occurs in the fillBuffer()
method of the UTF8Reader class, which throws a UTFDataFormatException.
I made a few frustrating attempts to get it to work. I tried constructing the
parsed string using different encodings (UTF-8, ISO-8859-1) at write time. I
tried reading it as a binary stream, which failed with a nice exception
stating that I was trying to read a CLOB as binary. I tried reading it as an
ASCII stream, which failed with the same data format exception.
Finally I decided to write the contents as a BLOB instead. The bytes for
writing were constructed using String.getBytes(). I read the contents back
with Blob.getBytes() and then construct the String using new String(byte[]).
This works!
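One caveat with this workaround: String.getBytes() and new String(byte[]) without a charset argument use the platform default charset, so the round trip may not be portable across machines. Below is a minimal sketch of the same idea with an explicit charset. UTF-8 is an assumption here, not something taken from the original setup, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class illustrating the BLOB workaround's encode/decode steps.
final class BlobTextSketch {

    // Bytes to hand to the BLOB column (assumption: UTF-8 is the chosen encoding).
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Rebuilds the String from the bytes read back from the BLOB column.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String parsed = "caf\u00e9";
        byte[] stored = encode(parsed);
        // The same charset on both sides gives back the original text.
        System.out.println(decode(stored).equals(parsed)); // prints true
    }
}
```

Using the same explicit charset on both the write and the read side avoids depending on whatever default the JVM happens to pick.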
I wonder why Derby's UTF8Reader failed. I have the above-mentioned
workaround, but would like to know if there is an alternative.
Cheers!
Regu
Re: Error reading CLOB
Posted by Kristian Waagan <Kr...@Sun.COM>.
Regunath Balasubramanian wrote:
> Hi,
>
> I chose to use Derby as an embedded DB to store text parsed/stripped
> from web pages, MS Office files and PDF documents while implementing an
> indexing and search solution. I need the parsed text of the document to
> enable search term highlighting to produce an effective summary of
> search hits.
> The natural choice was to use the CLOB data type. I store the contents
> using PreparedStatement.setCharacterStream(column, reader) where reader
> is a java.io.StringReader constructed from the java.lang.String instance
> representing the entire parsed contents. I then read the contents out
> using ResultSet.getClob(column).getCharacterStream().
>
> This works fine during write always but fails for a few during the read.
> What surprises me is the fact that I read and write using the Derby
> classes and therefore naturally expect that they work. The error is in
> the fillBuffer() method of the UTF8Reader class. It throws a
> UTFDataFormatException.
Hello Regu,
Could you please tell us in which version(s) of Derby you are seeing
this problem?
Also, if you have a repro application that can be used to demonstrate
the problem, it would be great :)
It would be very handy to have the data that causes the
UTFDataFormatException to be thrown.
Thanks,
--
Kristian