Posted to dev@nutch.apache.org by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/07 16:30:33 UTC

[jira] Commented: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1

    [ https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906816#action_12906816 ] 

Julien Nioche commented on NUTCH-899:
-------------------------------------

You can either set a lower value for the parameter http.content.limit or modify the mapping and set

<field name="content" column="content" jdbc-type="MEDIUMBLOB"/>

which should work for MySQL: a BLOB column holds at most 65,535 bytes, whereas a MEDIUMBLOB holds up to 16,777,215.

See the discussion on http://github.com/enis/gora/issues/closed#issue/48
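
For the first option, a minimal sketch of the override, assuming it goes in conf/nutch-site.xml as is usual for Nutch properties. The value 60000 is only an illustrative assumption, picked to leave some headroom under the size of a MySQL BLOB column; it is not a recommended setting, and pages longer than the limit will be truncated at fetch time.

```xml
<!-- conf/nutch-site.xml: overrides the default from nutch-default.xml -->
<property>
  <name>http.content.limit</name>
  <!-- Assumed example value in bytes, kept below the 65,535-byte BLOB maximum -->
  <value>60000</value>
</property>
```

Lowering the limit trades completeness of the stored content for compatibility with the existing column type, whereas the MEDIUMBLOB mapping keeps the full page.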

> java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1
> -------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-899
>                 URL: https://issues.apache.org/jira/browse/NUTCH-899
>             Project: Nutch
>          Issue Type: Bug
>          Components: storage
>    Affects Versions: 2.0
>         Environment: ubuntu 10.04
> JVM : 1.6.0_20
> nutch 2.0 (trunk)
> Mysql/HBase (0.20.6) / Hadoop(0.20.2) pseudo-distributed 
>            Reporter: faruk berksöz
>            Priority: Minor
>
> When I try to fetch a web page (e.g. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) with the MySQL storage definition,
> I see the following error in my Hadoop logs (there is no error with HBase):
> java.io.IOException: java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1
>     at org.gora.sql.store.SqlStore.flush(SqlStore.java:316)
>     at org.gora.sql.store.SqlStore.close(SqlStore.java:163)
>     at org.gora.mapreduce.GoraOutputFormat$1.close(GoraOutputFormat.java:72)
>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> The type of the column 'content' is BLOB.
> This may be relevant to the future development of Gora.
