Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/10/12 14:16:38 UTC

[GitHub] [pulsar] eolivelli commented on a diff in pull request #18017: [improve][io] JDBC sinks: implement JDBC Batch API

eolivelli commented on code in PR #18017:
URL: https://github.com/apache/pulsar/pull/18017#discussion_r993522741


##########
pulsar-io/jdbc/core/src/main/java/org/apache/pulsar/io/jdbc/JdbcAbstractSink.java:
##########
@@ -213,63 +223,90 @@ protected enum MutationType {
 
 
     private void flush() {
-        // if not in flushing state, do flush, else return;
         if (incomingList.size() > 0 && isFlushing.compareAndSet(false, true)) {
-            if (log.isDebugEnabled()) {
-                log.debug("Starting flush, queue size: {}", incomingList.size());
-            }
-            if (!swapList.isEmpty()) {
-                throw new IllegalStateException("swapList should be empty since last flush. swapList.size: "
-                        + swapList.size());
-            }
-            synchronized (this) {
-                List<Record<T>> tmpList;
-                swapList.clear();
+            boolean needAnotherRound;
+            final Deque<Record<T>> swapList = new LinkedList<>();
+
+            synchronized (incomingList) {
+                if (log.isDebugEnabled()) {
+                    log.debug("Starting flush, queue size: {}", incomingList.size());
+                }
+                final int actualBatchSize = batchSize > 0 ? Math.min(incomingList.size(), batchSize) :
+                        incomingList.size();
 
-                tmpList = swapList;
-                swapList = incomingList;
-                incomingList = tmpList;
+                for (int i = 0; i < actualBatchSize; i++) {
+                    swapList.add(incomingList.removeFirst());
+                }
+                needAnotherRound = batchSize > 0 && !incomingList.isEmpty() && incomingList.size() >= batchSize;
             }
+            long start = System.nanoTime();
 
             int count = 0;
             try {
+                PreparedStatement currentBatch = null;
+                final List<Mutation> mutations = swapList

Review Comment:
   Why do you need this intermediate list?



##########
pulsar-io/jdbc/core/src/main/java/org/apache/pulsar/io/jdbc/JdbcAbstractSink.java:
##########
@@ -213,63 +223,90 @@ protected enum MutationType {
 
 
     private void flush() {
-        // if not in flushing state, do flush, else return;
         if (incomingList.size() > 0 && isFlushing.compareAndSet(false, true)) {
-            if (log.isDebugEnabled()) {
-                log.debug("Starting flush, queue size: {}", incomingList.size());
-            }
-            if (!swapList.isEmpty()) {
-                throw new IllegalStateException("swapList should be empty since last flush. swapList.size: "
-                        + swapList.size());
-            }
-            synchronized (this) {
-                List<Record<T>> tmpList;
-                swapList.clear();
+            boolean needAnotherRound;
+            final Deque<Record<T>> swapList = new LinkedList<>();
+
+            synchronized (incomingList) {

Review Comment:
   If you use `synchronized` on `incomingList` here, you have to use it everywhere `incomingList` is accessed; otherwise you are not handling concurrency correctly.
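
   To illustrate the point, here is a minimal sketch of a queue where every access goes through the same lock. `GuardedQueue`, `lock`, `add`, `size`, and `drain` are hypothetical names chosen for this example; they are not taken from `JdbcAbstractSink`, and the actual fix in the PR may look different.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of the Pulsar code base.
final class GuardedQueue<T> {
    // Every read and write of 'incoming' is guarded by this one lock.
    private final Object lock = new Object();
    private final Deque<T> incoming = new ArrayDeque<>();

    // Producer side: must take the same lock as the consumer side.
    void add(T item) {
        synchronized (lock) {
            incoming.addLast(item);
        }
    }

    // Even a simple size check reads shared state, so it is guarded too.
    int size() {
        synchronized (lock) {
            return incoming.size();
        }
    }

    // Removes and returns up to maxItems elements in FIFO order.
    // A maxItems value <= 0 drains the whole queue.
    List<T> drain(int maxItems) {
        synchronized (lock) {
            int n = maxItems > 0 ? Math.min(maxItems, incoming.size()) : incoming.size();
            List<T> batch = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                batch.add(incoming.removeFirst());
            }
            return batch;
        }
    }
}
```

   With this shape, the unguarded `incomingList.size() > 0` check at the top of `flush()` and the code that enqueues records would both have to take the same lock as the drain loop.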



##########
pulsar-io/jdbc/core/src/main/java/org/apache/pulsar/io/jdbc/JdbcAbstractSink.java:
##########
@@ -280,21 +317,50 @@ private void flush() {
                 }
             }
 
-            if (swapList.size() != count) {
-                log.error("Update count {} not match total number of records {}", count, swapList.size());
-            }
-
-            // finish flush
-            if (log.isDebugEnabled()) {
-                log.debug("Finish flush, queue size: {}", swapList.size());
-            }
-            swapList.clear();
             isFlushing.set(false);
+            if (needAnotherRound) {
+                flush();
+            }
         } else {
             if (log.isDebugEnabled()) {
                 log.debug("Already in flushing state, will not flush, queue size: {}", incomingList.size());
             }
         }
     }
 
+    private void executeBatch(Deque<Record<T>> swapList, PreparedStatement statement) throws SQLException {
+        final int[] results = statement.executeBatch();
+        Map<Integer, Integer> failuresMapping = null;
+        final boolean useTransactions = jdbcSinkConfig.isUseTransactions();
+
+        for (int r: results) {
+            if (r < 0) {

Review Comment:
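   This check is not correct: not every negative value in the array returned by `executeBatch()` is a failure. `Statement.SUCCESS_NO_INFO` (-2) means the command succeeded but the number of affected rows is unknown, while `Statement.EXECUTE_FAILED` (-3) is the value that indicates a failed command.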
   
   https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#SUCCESS_NO_INFO
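
   A minimal sketch of how the check could distinguish the two constants. `BatchResults`, `isFailure`, and `countFailures` are hypothetical names for this example only. Note that many drivers report `EXECUTE_FAILED` through `BatchUpdateException.getUpdateCounts()` rather than through the array returned by `executeBatch()`, so the same helper could be applied to either array.

```java
import java.sql.Statement;

// Hypothetical helper, not part of the Pulsar code base.
final class BatchResults {
    private BatchResults() {
    }

    // Returns true only for the value that JDBC defines as a failed command.
    static boolean isFailure(int updateCount) {
        // SUCCESS_NO_INFO (-2): the command succeeded, row count unknown.
        // EXECUTE_FAILED (-3): the command failed.
        return updateCount == Statement.EXECUTE_FAILED;
    }

    // Counts the failed entries in an update-count array.
    static int countFailures(int[] results) {
        int failures = 0;
        for (int r : results) {
            if (isFailure(r)) {
                failures++;
            }
        }
        return failures;
    }
}
```

   With a check like this, a driver that returns `SUCCESS_NO_INFO` for every statement in the batch would no longer have its records treated as failed.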



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org