You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Alexander Schwid (JIRA)" <ji...@apache.org> on 2008/10/02 11:11:44 UTC
[jira] Created: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
--------------------------------------------------------------------------------------------------
Key: HADOOP-4331
URL: https://issues.apache.org/jira/browse/HADOOP-4331
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Alexander Schwid
Priority: Minor
Fix For: 0.19.0
Attachments: patch.txt
add batch size support for JDBC in DBOutputFormat
recieve DBWritable object in value not in key in DBOutputFormat
---------------patch--------------
Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java
===================================================================
--- src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (revision 701034)
+++ src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (working copy)
@@ -80,6 +80,11 @@
/** Field names in the Output table */
public static final String OUTPUT_FIELD_NAMES_PROPERTY = "mapred.jdbc.output.field.names";
+ /** Batch size for output statement */
+ public static final String OUTPUT_BATCH_SIZE = "mapred.jdbc.output.batch.size";
+
+ public static final int DEFAULT_BATCH_SIZE = 1000;
+
/**
* Sets the DB access related fields in the JobConf.
* @param job the job
@@ -212,5 +217,12 @@
job.setStrings(DBConfiguration.OUTPUT_FIELD_NAMES_PROPERTY, fieldNames);
}
+ int getBatchSize() {
+ return job.getInt(DBConfiguration.OUTPUT_BATCH_SIZE, DEFAULT_BATCH_SIZE);
+ }
+
+ void setBatchSize(int sz) {
+ job.setInt(DBConfiguration.OUTPUT_BATCH_SIZE, sz);
+ }
}
Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java
===================================================================
--- src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (revision 701034)
+++ src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (working copy)
@@ -37,11 +37,11 @@
* A OutputFormat that sends the reduce output to a SQL table.
* <p>
* {@link DBOutputFormat} accepts <key,value> pairs, where
- * key has a type extending DBWritable. Returned {@link RecordWriter}
- * writes <b>only the key</b> to the database with a batch SQL query.
+ * value has a type extending DBWritable. Returned {@link RecordWriter}
+ * writes <b>only the value</b> to the database with a batch SQL query.
*
*/
-public class DBOutputFormat<K extends DBWritable, V>
+public class DBOutputFormat<K, V extends DBWritable>
implements OutputFormat<K,V> {
private static final Log LOG = LogFactory.getLog(DBOutputFormat.class);
@@ -54,27 +54,21 @@
private Connection connection;
private PreparedStatement statement;
+ private int batch = 0;
+ private int batchSize;
protected DBRecordWriter(Connection connection
- , PreparedStatement statement) throws SQLException {
+ , PreparedStatement statement, int batchSize) throws SQLException {
this.connection = connection;
this.statement = statement;
this.connection.setAutoCommit(false);
+ this.batchSize = batchSize;
}
/** {@inheritDoc} */
public void close(Reporter reporter) throws IOException {
try {
- statement.executeBatch();
- connection.commit();
- } catch (SQLException e) {
- try {
- connection.rollback();
- }
- catch (SQLException ex) {
- LOG.warn(StringUtils.stringifyException(ex));
- }
- throw new IOException(e.getMessage());
+ executeBatch();
} finally {
try {
statement.close();
@@ -89,12 +83,37 @@
/** {@inheritDoc} */
public void write(K key, V value) throws IOException {
try {
- key.write(statement);
+ value.write(statement);
statement.addBatch();
+ batch++;
+ if (batch == batchSize) {
+ executeBatch();
+ batch = 0;
+ }
+
} catch (SQLException e) {
e.printStackTrace();
}
}
+
+ private void executeBatch() throws IOException {
+ if (batch > 0) {
+ try {
+ statement.executeBatch();
+ connection.commit();
+ statement.clearBatch();
+ }
+ catch(SQLException e) {
+ try {
+ connection.rollback();
+ }
+ catch (SQLException ex) {
+ LOG.warn(StringUtils.stringifyException(ex));
+ }
+ throw new IOException(e.getMessage());
+ }
+ }
+ }
}
/**
@@ -129,13 +148,14 @@
DBConfiguration dbConf = new DBConfiguration(job);
String tableName = dbConf.getOutputTableName();
String[] fieldNames = dbConf.getOutputFieldNames();
+ int batchSize = dbConf.getBatchSize();
try {
Connection connection = dbConf.getConnection();
PreparedStatement statement = null;
statement = connection.prepareStatement(constructQuery(tableName, fieldNames));
- return new DBRecordWriter(connection, statement);
+ return new DBRecordWriter(connection, statement, batchSize);
}
catch (Exception ex) {
throw new IOException(ex.getMessage());
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643177#action_12643177 ]
Alexander Schwid commented on HADOOP-4331:
------------------------------------------
You right, all work perfect with increased number of reduces and has the same effect as using batch in one reduce.
So, i think this task is not nessesary.
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid resolved HADOOP-4331.
--------------------------------------
Resolution: Won't Fix
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Enis Soztutar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637120#action_12637120 ]
Enis Soztutar commented on HADOOP-4331:
---------------------------------------
I am not convinced that further splitting the batch in reduces is the right way. It is better to add all the values in the reduce once to keep atomicity. If some error occurs in the transaction, none of the records in the reduce should be inserted, otherwise when the reduce is restarted, some of the records might be duplicated.
Is there a specific performance/driver-related reason to add batch sizes?
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Enis Soztutar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637394#action_12637394 ]
Enis Soztutar commented on HADOOP-4331:
---------------------------------------
Have you increased the number of reducers, so that you get the same effect? Could you please report some performance metrics of your job with/without the patch, and with increased number of reducers.
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636389#action_12636389 ]
Hadoop QA commented on HADOOP-4331:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12391348/patch.txt
against trunk revision 700997.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 Eclipse classpath. The patch retains Eclipse classpath integrity.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3423/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3423/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3423/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3423/console
This message is automatically generated.
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid updated HADOOP-4331:
-------------------------------------
Status: Open (was: Patch Available)
this task is unnessesary
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid updated HADOOP-4331:
-------------------------------------
Description:
package mapred.lib.db
added batch size support for JDBC in DBOutputFormat
recieve DBWritable object in value not in key in DBOutputFormat
was:
add batch size support for JDBC in DBOutputFormat
recieve DBWritable object in value not in key in DBOutputFormat
---------------patch--------------
Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java
===================================================================
--- src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (revision 701034)
+++ src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (working copy)
@@ -80,6 +80,11 @@
/** Field names in the Output table */
public static final String OUTPUT_FIELD_NAMES_PROPERTY = "mapred.jdbc.output.field.names";
+ /** Batch size for output statement */
+ public static final String OUTPUT_BATCH_SIZE = "mapred.jdbc.output.batch.size";
+
+ public static final int DEFAULT_BATCH_SIZE = 1000;
+
/**
* Sets the DB access related fields in the JobConf.
* @param job the job
@@ -212,5 +217,12 @@
job.setStrings(DBConfiguration.OUTPUT_FIELD_NAMES_PROPERTY, fieldNames);
}
+ int getBatchSize() {
+ return job.getInt(DBConfiguration.OUTPUT_BATCH_SIZE, DEFAULT_BATCH_SIZE);
+ }
+
+ void setBatchSize(int sz) {
+ job.setInt(DBConfiguration.OUTPUT_BATCH_SIZE, sz);
+ }
}
Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java
===================================================================
--- src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (revision 701034)
+++ src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (working copy)
@@ -37,11 +37,11 @@
* A OutputFormat that sends the reduce output to a SQL table.
* <p>
* {@link DBOutputFormat} accepts <key,value> pairs, where
- * key has a type extending DBWritable. Returned {@link RecordWriter}
- * writes <b>only the key</b> to the database with a batch SQL query.
+ * value has a type extending DBWritable. Returned {@link RecordWriter}
+ * writes <b>only the value</b> to the database with a batch SQL query.
*
*/
-public class DBOutputFormat<K extends DBWritable, V>
+public class DBOutputFormat<K, V extends DBWritable>
implements OutputFormat<K,V> {
private static final Log LOG = LogFactory.getLog(DBOutputFormat.class);
@@ -54,27 +54,21 @@
private Connection connection;
private PreparedStatement statement;
+ private int batch = 0;
+ private int batchSize;
protected DBRecordWriter(Connection connection
- , PreparedStatement statement) throws SQLException {
+ , PreparedStatement statement, int batchSize) throws SQLException {
this.connection = connection;
this.statement = statement;
this.connection.setAutoCommit(false);
+ this.batchSize = batchSize;
}
/** {@inheritDoc} */
public void close(Reporter reporter) throws IOException {
try {
- statement.executeBatch();
- connection.commit();
- } catch (SQLException e) {
- try {
- connection.rollback();
- }
- catch (SQLException ex) {
- LOG.warn(StringUtils.stringifyException(ex));
- }
- throw new IOException(e.getMessage());
+ executeBatch();
} finally {
try {
statement.close();
@@ -89,12 +83,37 @@
/** {@inheritDoc} */
public void write(K key, V value) throws IOException {
try {
- key.write(statement);
+ value.write(statement);
statement.addBatch();
+ batch++;
+ if (batch == batchSize) {
+ executeBatch();
+ batch = 0;
+ }
+
} catch (SQLException e) {
e.printStackTrace();
}
}
+
+ private void executeBatch() throws IOException {
+ if (batch > 0) {
+ try {
+ statement.executeBatch();
+ connection.commit();
+ statement.clearBatch();
+ }
+ catch(SQLException e) {
+ try {
+ connection.rollback();
+ }
+ catch (SQLException ex) {
+ LOG.warn(StringUtils.stringifyException(ex));
+ }
+ throw new IOException(e.getMessage());
+ }
+ }
+ }
}
/**
@@ -129,13 +148,14 @@
DBConfiguration dbConf = new DBConfiguration(job);
String tableName = dbConf.getOutputTableName();
String[] fieldNames = dbConf.getOutputFieldNames();
+ int batchSize = dbConf.getBatchSize();
try {
Connection connection = dbConf.getConnection();
PreparedStatement statement = null;
statement = connection.prepareStatement(constructQuery(tableName, fieldNames));
- return new DBRecordWriter(connection, statement);
+ return new DBRecordWriter(connection, statement, batchSize);
}
catch (Exception ex) {
throw new IOException(ex.getMessage());
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.19.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid updated HADOOP-4331:
-------------------------------------
Attachment: patch.txt
Patch for this task
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.19.0
>
> Attachments: patch.txt
>
>
> add batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
> ---------------patch--------------
> Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java
> ===================================================================
> --- src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (revision 701034)
> +++ src/mapred/org/apache/hadoop/mapred/lib/db/DBConfiguration.java (working copy)
> @@ -80,6 +80,11 @@
> /** Field names in the Output table */
> public static final String OUTPUT_FIELD_NAMES_PROPERTY = "mapred.jdbc.output.field.names";
> + /** Batch size for output statement */
> + public static final String OUTPUT_BATCH_SIZE = "mapred.jdbc.output.batch.size";
> +
> + public static final int DEFAULT_BATCH_SIZE = 1000;
> +
> /**
> * Sets the DB access related fields in the JobConf.
> * @param job the job
> @@ -212,5 +217,12 @@
> job.setStrings(DBConfiguration.OUTPUT_FIELD_NAMES_PROPERTY, fieldNames);
> }
> + int getBatchSize() {
> + return job.getInt(DBConfiguration.OUTPUT_BATCH_SIZE, DEFAULT_BATCH_SIZE);
> + }
> +
> + void setBatchSize(int sz) {
> + job.setInt(DBConfiguration.OUTPUT_BATCH_SIZE, sz);
> + }
> }
> Index: src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java
> ===================================================================
> --- src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (revision 701034)
> +++ src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java (working copy)
> @@ -37,11 +37,11 @@
> * A OutputFormat that sends the reduce output to a SQL table.
> * <p>
> * {@link DBOutputFormat} accepts <key,value> pairs, where
> - * key has a type extending DBWritable. Returned {@link RecordWriter}
> - * writes <b>only the key</b> to the database with a batch SQL query.
> + * value has a type extending DBWritable. Returned {@link RecordWriter}
> + * writes <b>only the value</b> to the database with a batch SQL query.
> *
> */
> -public class DBOutputFormat<K extends DBWritable, V>
> +public class DBOutputFormat<K, V extends DBWritable>
> implements OutputFormat<K,V> {
> private static final Log LOG = LogFactory.getLog(DBOutputFormat.class);
> @@ -54,27 +54,21 @@
> private Connection connection;
> private PreparedStatement statement;
> + private int batch = 0;
> + private int batchSize;
> protected DBRecordWriter(Connection connection
> - , PreparedStatement statement) throws SQLException {
> + , PreparedStatement statement, int batchSize) throws SQLException {
> this.connection = connection;
> this.statement = statement;
> this.connection.setAutoCommit(false);
> + this.batchSize = batchSize;
> }
> /** {@inheritDoc} */
> public void close(Reporter reporter) throws IOException {
> try {
> - statement.executeBatch();
> - connection.commit();
> - } catch (SQLException e) {
> - try {
> - connection.rollback();
> - }
> - catch (SQLException ex) {
> - LOG.warn(StringUtils.stringifyException(ex));
> - }
> - throw new IOException(e.getMessage());
> + executeBatch();
> } finally {
> try {
> statement.close();
> @@ -89,12 +83,37 @@
> /** {@inheritDoc} */
> public void write(K key, V value) throws IOException {
> try {
> - key.write(statement);
> + value.write(statement);
> statement.addBatch();
> + batch++;
> + if (batch == batchSize) {
> + executeBatch();
> + batch = 0;
> + }
> +
> } catch (SQLException e) {
> e.printStackTrace();
> }
> }
> +
> + private void executeBatch() throws IOException {
> + if (batch > 0) {
> + try {
> + statement.executeBatch();
> + connection.commit();
> + statement.clearBatch();
> + }
> + catch(SQLException e) {
> + try {
> + connection.rollback();
> + }
> + catch (SQLException ex) {
> + LOG.warn(StringUtils.stringifyException(ex));
> + }
> + throw new IOException(e.getMessage());
> + }
> + }
> + }
> }
> /**
> @@ -129,13 +148,14 @@
> DBConfiguration dbConf = new DBConfiguration(job);
> String tableName = dbConf.getOutputTableName();
> String[] fieldNames = dbConf.getOutputFieldNames();
> + int batchSize = dbConf.getBatchSize();
> try {
> Connection connection = dbConf.getConnection();
> PreparedStatement statement = null;
> statement = connection.prepareStatement(constructQuery(tableName, fieldNames));
> - return new DBRecordWriter(connection, statement);
> + return new DBRecordWriter(connection, statement, batchSize);
> }
> catch (Exception ex) {
> throw new IOException(ex.getMessage());
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid updated HADOOP-4331:
-------------------------------------
Status: Patch Available (was: Open)
patch is ready
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Schwid updated HADOOP-4331:
-------------------------------------
Fix Version/s: (was: 0.19.0)
0.20.0
Affects Version/s: 0.20.0
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-4331) DBOutputFormat: add batch size
support for JDBC and recieve DBWritable object in value not in key
Posted by "Alexander Schwid (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637145#action_12637145 ]
Alexander Schwid commented on HADOOP-4331:
------------------------------------------
i had a task with more then 1 million records in result
> DBOutputFormat: add batch size support for JDBC and recieve DBWritable object in value not in key
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4331
> URL: https://issues.apache.org/jira/browse/HADOOP-4331
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: Alexander Schwid
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: patch.txt
>
>
> package mapred.lib.db
> added batch size support for JDBC in DBOutputFormat
> recieve DBWritable object in value not in key in DBOutputFormat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.