Posted to common-dev@hadoop.apache.org by "Yuchen (JIRA)" <ji...@apache.org> on 2009/06/02 22:37:07 UTC

[jira] Created: (HADOOP-5960) Incorrect DBInputFormat transaction context

Incorrect DBInputFormat transaction context
-------------------------------------------

                 Key: HADOOP-5960
                 URL: https://issues.apache.org/jira/browse/HADOOP-5960
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.20.0, 0.19.1, 0.19.0
         Environment: Mac OSX 10.5.6, IntelliJ 7.0.5
            Reporter: Yuchen


In my Map/Reduce job, I use DBInputFormat to read the original tasks because it is convenient. I also need to update my MySQL database occasionally in the reducer. Because I need to issue "update" rather than "insert" statements, I cannot use DBOutputFormat, so I make my own JDBC calls. I open my own connection like this:

    Class.forName("com.mysql.jdbc.Driver").newInstance();
    conn = DriverManager.getConnection(jdbcUrl);

However, every time I try to do the update, I get an SQL exception, "transaction lock time out; try restarting transaction", even though I did not use a transaction at all in my update (setAutoCommit to false).

Digging into the Hadoop code, I found these lines in DBInputFormat:

      this.connection.setAutoCommit(false);
      connection.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);

When I comment them out (along with the connection.commit() call), everything works fine. I also found that the connection in DBInputFormat is never closed. Why do we need to turn off autocommit and set the transaction isolation level in DBInputFormat at all? And why can't I override these settings in my own JDBC call, even when I explicitly set autocommit to false and the transaction isolation level to the default (repeatable read)?
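
For illustration, the reducer-side update described above might be sketched as follows. The table name "tasks", its columns, and the class and method names are hypothetical and do not come from the report. Autocommit and the isolation level are per-connection settings in JDBC, so configuring them here affects only this connection; it would not release locks held by a separate connection such as the one DBInputFormat opens.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper for a reducer that issues its own JDBC updates.
class ReducerJdbcUpdate {

    // Set autocommit and isolation explicitly instead of relying on defaults.
    // These settings apply to this connection only.
    static void configure(Connection conn) throws SQLException {
        conn.setAutoCommit(true); // each statement commits on its own
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
    }

    // Update one row and return the number of rows changed.
    // "tasks", "status", and "id" are assumed names, not from the report.
    static int updateStatus(String jdbcUrl, long taskId, String status)
            throws SQLException {
        // try-with-resources closes the statement and connection even on failure
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            configure(conn);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE tasks SET status = ? WHERE id = ?")) {
                ps.setString(1, status);
                ps.setLong(2, taskId);
                return ps.executeUpdate();
            }
        }
    }
}
```

Closing the connection after each update keeps this sketch simple; a real reducer would more likely open one connection during setup and close it during cleanup.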
