Posted to common-dev@hadoop.apache.org by "Yuchen (JIRA)" <ji...@apache.org> on 2009/06/02 22:37:07 UTC
[jira] Created: (HADOOP-5960) Incorrect DBInputFormat transaction context
Incorrect DBInputFormat transaction context
-------------------------------------------
Key: HADOOP-5960
URL: https://issues.apache.org/jira/browse/HADOOP-5960
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Affects Versions: 0.20.0, 0.19.1, 0.19.0
Environment: Mac OSX 10.5.6, IntelliJ 7.0.5
Reporter: Yuchen
In my Map/Reduce job, I use DBInputFormat to read the original tasks, for convenience. I also need to update my MySQL database occasionally in the reducer. Because I need to issue "update" statements rather than "insert", I cannot use DBOutputFormat, so I make my own JDBC calls. I create my own connection like this:
Class.forName("com.mysql.jdbc.Driver").newInstance();
conn = DriverManager.getConnection(jdbcUrl);
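For illustration, a minimal sketch of how such a reducer-side update could look once the connection exists. The class name, the table "tasks" and its columns are hypothetical, and autocommit is switched on here on the assumption that "no transaction" means each statement should commit by itself:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of Hadoop: runs one UPDATE on a caller-supplied connection.
final class ReducerDbUpdate {
    private ReducerDbUpdate() {}

    // Marks one row of the (assumed) "tasks" table as done and returns the affected row count.
    static int markTaskDone(Connection conn, long taskId) throws SQLException {
        // With autocommit on, the statement is committed as soon as it completes.
        conn.setAutoCommit(true);
        // try-with-resources closes the statement even if the update throws.
        try (PreparedStatement ps =
                 conn.prepareStatement("UPDATE tasks SET status = ? WHERE id = ?")) {
            ps.setString(1, "done");
            ps.setLong(2, taskId);
            return ps.executeUpdate();
        }
    }
}
```

The caller would pass in the connection obtained from DriverManager.getConnection(jdbcUrl) as above and close it when the reducer finishes. Note that these settings only affect this connection; they do not release locks held by any other connection.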
However, every time I try to do the update, I get an SQL exception, "transaction lock time out; try restarting transaction", even though I didn't use a transaction at all in my update (setAutoCommit to false).
Digging into the Hadoop code, I found these lines in DBInputFormat:
this.connection.setAutoCommit(false);
connection.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
When I comment them out (along with the connection.commit() call), everything works fine. I also found that the connection in DBInputFormat is never closed. I am wondering why we need to set autocommit and the transaction isolation level in DBInputFormat at all, and why I can't override them in my own JDBC call, even if I explicitly set autocommit to false and the transaction isolation level to the default (repeatable read).
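One possible reading, assuming the tables use InnoDB (not stated above): with autocommit off and SERIALIZABLE isolation, InnoDB turns plain SELECTs into locking reads, so the input connection would keep shared locks on the rows it read until its transaction ends. Those locks belong to that connection, so changing settings on a second connection would not release them. A minimal sketch of a cleanup step that ends the transaction and closes the connection follows; the class and method names are hypothetical and this is not the DBInputFormat code:

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper, not part of Hadoop: ends any open transaction, then closes the connection.
final class ReadConnectionCleanup {
    private ReadConnectionCleanup() {}

    static void release(Connection conn) throws SQLException {
        if (conn == null || conn.isClosed()) {
            return; // nothing to release
        }
        try {
            if (!conn.getAutoCommit()) {
                // Committing ends the transaction, which releases the locks it holds.
                conn.commit();
            }
        } finally {
            // Close even if the commit fails, so the connection is not leaked.
            conn.close();
        }
    }
}
```

Something like release(connection) would be called once the reader has consumed its last row.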