Posted to dev@gora.apache.org by "Renato Javier Marroquín Mogrovejo (JIRA)" <ji...@apache.org> on 2013/01/02 06:18:12 UTC
[jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column
[ https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542002#comment-13542002 ]
Renato Javier Marroquín Mogrovejo commented on GORA-24:
-------------------------------------------------------
Hi Nathan,
I wanted to fix this, but after reading up on it and reviewing the code, I don't think this is a bug in Gora. Maybe we could provide better error messages, but the underlying problem is that Nutch's MySQL mapping file is not right for most cases.
{code}
<class name="org.apache.gora.examples.generated.WebPage" keyClass="java.lang.String" table="WebPage">
  <primarykey column="id" length="128"/>
  <field name="url" column="url" length="128" primarykey="true"/>
  <field name="content" column="content"/>
  <field name="parsedContent" column="parsedContent"/>
  <field name="outlinks" column="outlinks"/>
  <field name="metadata" column="metadata"/>
</class>
{/code}
This is what the mapping looks like within Gora. Should we change it?
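As background for the column sizes involved: MySQL's binary types have fixed maximum lengths (TINYBLOB 255 bytes, BLOB 65,535, MEDIUMBLOB 16,777,215, LONGBLOB 4,294,967,295), which is why a plain BLOB can overflow for a page with many inlinks. Below is a minimal sketch of picking the smallest type for a given serialized size. The names `MySqlBlobType`, `smallestFor` and `BlobSizingDemo` are hypothetical and not part of Gora or Nutch, and this only illustrates the size limits; it does not explain or fix the EOFException itself.

```java
import java.util.Optional;

/** MySQL binary column types with their maximum lengths in bytes. */
enum MySqlBlobType {
    TINYBLOB(255L),
    BLOB(65_535L),
    MEDIUMBLOB(16_777_215L),
    LONGBLOB(4_294_967_295L);

    final long maxBytes;

    MySqlBlobType(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /**
     * Hypothetical helper (not a Gora API): returns the smallest type whose
     * limit covers payloadBytes, or empty if even LONGBLOB is too small.
     */
    static Optional<MySqlBlobType> smallestFor(long payloadBytes) {
        if (payloadBytes < 0) {
            throw new IllegalArgumentException("payloadBytes must be >= 0");
        }
        // values() returns constants in declaration order, i.e. ascending by limit.
        for (MySqlBlobType type : values()) {
            if (payloadBytes <= type.maxBytes) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }
}

class BlobSizingDemo {
    public static void main(String[] args) {
        // A 100,000-byte serialized inlinks map no longer fits in BLOB.
        System.out.println(MySqlBlobType.smallestFor(100_000L).get()); // prints MEDIUMBLOB
    }
}
```

A check like this could back a clearer error message when a value exceeds the mapped column type, instead of letting the failure surface from the JDBC driver.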
> Throwing EOFException with MEDIUMBLOB type for inlinks column
> -------------------------------------------------------------
>
> Key: GORA-24
> URL: https://issues.apache.org/jira/browse/GORA-24
> Project: Apache Gora
> Issue Type: Bug
> Components: storage-sql
> Environment: MySQL
> Reporter: Alexis
> Fix For: 0.3
>
>
> I had an exception with DbUpdaterJob complaining that the inlinks column of type BLOB in the webpage table was not big enough to store all the incoming links. So I changed the column definition in gora-sql-mapping.xml from BLOB to MEDIUMBLOB:
> <field name="inlinks" column="inlinks" jdbc-type="MEDIUMBLOB"/>
> Now I systematically get an exception in the update step:
> java.io.IOException: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
> at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:341)
> at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
> at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> Caused by: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
> at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2020)
> at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1451)
> at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:329)
> ... 5 more
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira