Posted to dev@nutch.apache.org by Markus Jelsma <ma...@openindex.io> on 2011/03/22 17:19:57 UTC
java.sql.BatchUpdateException after fetch and wrong WebPage.protocolStatus in trunk
Hi,
I did a few successful fetches to test trunk's solrclean. After removing some
pages so that the WebDB (with HSQLDB as storage backend) would contain a few
NOTFOUND entries, the following exception occurred:
2011-03-22 16:53:49,727 INFO fetcher.FetcherJob - -activeThreads=0
2011-03-22 16:53:51,036 WARN mapred.LocalJobRunner - job_local_0001
java.io.IOException: java.sql.BatchUpdateException: data exception: string data, right truncation
    at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
    at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.sql.BatchUpdateException: data exception: string data, right truncation
    at org.hsqldb.jdbc.JDBCPreparedStatement.executeBatch(Unknown Source)
    at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:328)
    ... 5 more
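For reference, "string data, right truncation" is the standard SQL data exception (SQLSTATE 22001) that a database raises when a character value is longer than the column it is written to. The trace does not show which column or value is involved. A minimal sketch of how one might look for the offending field before the batch is flushed follows; the class, the method, the column names and the widths are all hypothetical and are not taken from Gora's SQL mapping or from Nutch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TruncationCheck {

    /**
     * Returns the names of the columns whose value is longer than the
     * declared limit. Columns without a declared limit and null values
     * are skipped. Length is counted with String.length(), which is only
     * an approximation of how a database counts characters.
     */
    public static List<String> oversizedColumns(Map<String, String> row,
                                                Map<String, Integer> maxLengths) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : row.entrySet()) {
            Integer limit = maxLengths.get(entry.getKey());
            String value = entry.getValue();
            if (limit != null && value != null && value.length() > limit) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical column widths, chosen only for illustration.
        Map<String, Integer> maxLengths = new LinkedHashMap<>();
        maxLengths.put("title", 16);
        maxLengths.put("baseUrl", 32);

        // Hypothetical row about to be written.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("title", "404 Not Found");
        row.put("baseUrl", "http://example.org/a/very/long/path/that/was/removed");

        // Only baseUrl exceeds its limit here.
        System.out.println(oversizedColumns(row, maxLengths)); // prints [baseUrl]
    }
}
```

Logging the result of such a check for each record would point at the field to inspect. Whether the fix then belongs in the column definition or in the value being written cannot be decided from the trace alone.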
Also, when I run solrclean and log WebPage.ProtocolStatus(), I see wrong
values for the pages that were removed: instead of ProtocolStatusCodes.NOTFOUND
(13) they just have 0.
It smells like a bug, but I could be doing things the wrong way, of course ;)
Cheers,
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: java.sql.BatchUpdateException after fetch and wrong WebPage.protocolStatus in trunk
Posted by Markus Jelsma <ma...@openindex.io>.
Any thoughts?
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350