Posted to dev@nutch.apache.org by Markus Jelsma <ma...@openindex.io> on 2011/03/22 17:19:57 UTC

java.sql.BatchUpdateException after fetch and wrong WebPage.protocolStatus in trunk

Hi,

I ran a few successful fetches to test trunk's solrclean. After removing 
some pages so that the WebDB (with HSQLDB as storage backend) would contain 
a few NOTFOUND entries, the following exception occurred:

2011-03-22 16:53:49,727 INFO  fetcher.FetcherJob - -activeThreads=0
2011-03-22 16:53:51,036 WARN  mapred.LocalJobRunner - job_local_0001
java.io.IOException: java.sql.BatchUpdateException: data exception: string data, right truncation
        at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
        at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
        at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.sql.BatchUpdateException: data exception: string data, right truncation
        at org.hsqldb.jdbc.JDBCPreparedStatement.executeBatch(Unknown Source)
        at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:328)
        ... 5 more
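
For context, "string data, right truncation" is the SQL error a database 
raises when a character value is longer than the column it is written to. 
A minimal sketch of a check for finding such values before a flush is 
shown here. It assumes a hypothetical column limit of 512 characters; the 
class TruncationCheck, the field names and the limit are illustrative only 
and are not taken from Gora or Nutch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Gora or Nutch.
class TruncationCheck {

    // Assumed column width. The real limit is whatever the table
    // definition created by the SQL store declares.
    static final int ASSUMED_MAX_LENGTH = 512;

    // Returns the names of fields whose values are longer than maxLength
    // characters and would therefore not fit in such a column.
    static List<String> overlongFields(Map<String, String> row, int maxLength) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> entry : row.entrySet()) {
            String value = entry.getValue();
            if (value != null && value.length() > maxLength) {
                offenders.add(entry.getKey());
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Build a 600-character value, longer than the assumed limit.
        StringBuilder longValue = new StringBuilder();
        for (int i = 0; i < 600; i++) {
            longValue.append('x');
        }

        Map<String, String> row = new LinkedHashMap<>();
        row.put("baseUrl", "http://example.org/");
        row.put("title", longValue.toString());

        // Prints the fields that would trigger a right-truncation error.
        System.out.println(overlongFields(row, ASSUMED_MAX_LENGTH));
    }
}
```

Which column actually overflows here is not known from the trace alone, so 
the limit and field names would have to be replaced with the real ones.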

Also, when I execute solrclean and log WebPage.ProtocolStatus(), I see wrong 
values for pages that were removed: instead of ProtocolStatusCodes.NOTFOUND 
(13) they just get 0.
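
The comparison described above can be sketched roughly as follows. This is 
not Nutch code: the class StatusCheck, its inputs and the example URLs are 
hypothetical, and the NOTFOUND value of 13 is simply the one quoted in this 
message:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical check for illustration; not part of Nutch.
class StatusCheck {

    // Value quoted in this message for ProtocolStatusCodes.NOTFOUND.
    static final int NOTFOUND = 13;

    // Returns "url -> code" for every removed URL whose stored protocol
    // status code is missing or differs from NOTFOUND.
    static List<String> unexpected(Map<String, Integer> statusByUrl,
                                   Set<String> removedUrls) {
        List<String> result = new ArrayList<>();
        for (String url : removedUrls) {
            Integer code = statusByUrl.get(url);
            if (code == null || code != NOTFOUND) {
                result.add(url + " -> " + code);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up status codes as they might be logged per URL.
        Map<String, Integer> statusByUrl = new TreeMap<>();
        statusByUrl.put("http://example.org/gone.html", 0);
        statusByUrl.put("http://example.org/missing.html", NOTFOUND);

        Set<String> removedUrls = new TreeSet<>(statusByUrl.keySet());

        // Lists the removed pages that did not end up as NOTFOUND.
        System.out.println(unexpected(statusByUrl, removedUrls));
    }
}
```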

It smells like a bug, but I could be doing things the wrong way, of course ;)


Cheers,

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: java.sql.BatchUpdateException after fetch and wrong WebPage.protocolStatus in trunk

Posted by Markus Jelsma <ma...@openindex.io>.
Any thoughts? 


-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350