Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/03/22 10:53:05 UTC

[jira] [Created] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Injector job crashes with MySQL with table collation set to utf8_general_ci
---------------------------------------------------------------------------

                 Key: NUTCH-970
                 URL: https://issues.apache.org/jira/browse/NUTCH-970
             Project: Nutch
          Issue Type: Bug
          Components: injector
    Affects Versions: 2.0
            Reporter: Markus Jelsma
             Fix For: 2.0


When running the trunk injector against an existing database whose default collation is utf8_* or ucs2_*, the following GoraException is thrown:

InjectorJob: starting
InjectorJob: urlDir: urls
InjectorJob: org.apache.gora.util.GoraException: java.io.IOException: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Column length too big for column 'text' (max = 21845); use BLOB or TEXT instead
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:110)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:93)
        at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:43)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:227)
        at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:266)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:276)
Caused by: java.io.IOException: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Column length too big for column 'text' (max = 21845); use BLOB or TEXT instead
        at org.apache.gora.sql.store.SqlStore.createSchema(SqlStore.java:226)
        at org.apache.gora.sql.store.SqlStore.initialize(SqlStore.java:172)
        at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:81)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:104)
        ... 7 more
Caused by: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Column length too big for column 'text' (max = 21845); use BLOB or TEXT instead
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:936)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2985)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1631)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1723)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3283)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1332)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1604)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1519)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1504)
        at org.apache.gora.sql.store.SqlStore.createSchema(SqlStore.java:224)
        ... 10 more
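
For context, the "max = 21845" in the message appears to follow from MySQL's 65,535-byte cap on a VARCHAR column divided by the maximum bytes per character of the column's charset. Below is a minimal sketch of that arithmetic, assuming 3 bytes per character for utf8 and 2 for ucs2. The function and constant names are hypothetical, and the trace does not show which length Gora actually requested for the 'text' column.

```python
# Illustrative only: reproduces the "max = 21845" figure from the error message.
# Assumption: MySQL caps a VARCHAR column at 65,535 bytes, and the legacy
# utf8 charset reserves up to 3 bytes per character (ucs2: 2, latin1: 1).
MAX_VARCHAR_BYTES = 65535

BYTES_PER_CHAR = {"latin1": 1, "ucs2": 2, "utf8": 3}


def max_varchar_chars(charset: str) -> int:
    """Largest VARCHAR(n) character length reported as allowed for a charset."""
    return MAX_VARCHAR_BYTES // BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    # Print the per-charset character limit implied by the byte cap.
    for name in ("latin1", "ucs2", "utf8"):
        print(name, max_varchar_chars(name))
```

Under these assumptions, a VARCHAR length that is accepted with a single-byte collation can be rejected once the database default is utf8_* or ucs2_*, which would be consistent with the report above.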

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092546#comment-13092546 ] 

Lewis John McGibbney commented on NUTCH-970:
--------------------------------------------

Markus, can you reproduce this? As I mentioned a while ago on the lists, I have been getting errors that look similar but that I fear have completely unrelated causes. They all appear when injecting, so they share at least that much.



[jira] [Updated] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-970:
---------------------------------------

    Fix Version/s:     (was: nutchgora)
                   2.1

Set and Classify
                
> Injector job crashes with MySQL with table collation set to utf8_general_ci
> ---------------------------------------------------------------------------
>
>                 Key: NUTCH-970
>                 URL: https://issues.apache.org/jira/browse/NUTCH-970
>             Project: Nutch
>          Issue Type: Bug
>          Components: injector
>    Affects Versions: nutchgora
>            Reporter: Markus Jelsma
>             Fix For: 2.1
>
>


[jira] [Commented] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103491#comment-13103491 ] 

Lewis John McGibbney commented on NUTCH-970:
--------------------------------------------

I'm in the same position, Markus. I have been working on debugging and fixing the trunk test failures before I attempt to sort out, or offer an opinion on, any trunk patches/issues.

This is quite concerning, as I now believe that trunk is completely broken. I get no further than 'similar' exceptions/errors/failures when I attempt to use Nutch trunk for injecting URLs, which means that I cannot use trunk at all.



[jira] [Commented] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Posted by "Markus Jelsma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103023#comment-13103023 ] 

Markus Jelsma commented on NUTCH-970:
-------------------------------------

I no longer have a working trunk with a backend. At the time, using Gora 0.1-incubating, it happened consistently. I have little doubt it's still an issue.



[jira] [Updated] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-970:
---------------------------------------

    Fix Version/s:     (was: 2.1)
                   2.2
    
> Injector job crashes with MySQL with table collation set to utf8_general_ci
> ---------------------------------------------------------------------------
>
>                 Key: NUTCH-970
>                 URL: https://issues.apache.org/jira/browse/NUTCH-970
>             Project: Nutch
>          Issue Type: Bug
>          Components: injector
>    Affects Versions: nutchgora
>            Reporter: Markus Jelsma
>             Fix For: 2.2
>
>
