Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@wandisco.com> on 2014/01/15 18:01:35 UTC

[Patch] Workaround for SQLITE open() race condition

Hi,

We see random failures on our Windows build bot
when it tries to open different working copies at
the same time. This is a simple retry patch for that.

On IRC, Bert spoke out in favour of debugging and
fixing sqlite itself - hence I won't commit the patch
right now. However, I've also been unable to reproduce
the problem under Linux so far and can't really help
with the effort.

-- Stefan^2.

[[[
Attempt to address the "unable to open database file" error we
see at least on our test suite with multi-threaded SQLITE code.

http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html
hints at a problem with temporary files used by SQLITE internally,
i.e. the root cause lying in the SQLITE code itself.

Hence, we add a workaround for now that simply retries the open()
operation for a reasonable amount of time.

* subversion/libsvn_subr/sqlite.c
  (BUSY_TIMEOUT): Document that we use this timeout for failing
                  open() calls as well.
  (internal_open): Retry if we couldn't open the DB for some
                   potentially transient reason.
]]]
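
For readers who want the retry idea in isolation, here is a minimal
standalone sketch. It is not the patch itself: OPEN_OK, OPEN_CANTOPEN,
open_fn_t, open_with_retry() and flaky_open() are hypothetical names
made up for illustration, the opener callback stands in for the
sqlite3_open_v2() call, and any cleanup of a half-opened handle is
assumed to happen inside that callback. Unlike the loop in the patch,
this sketch pauses briefly between attempts, and it assumes a POSIX
system for clock_gettime() and nanosleep().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for SQLITE_OK and SQLITE_CANTOPEN. */
enum { OPEN_OK = 0, OPEN_CANTOPEN = 14 };

/* Hypothetical opener callback: returns OPEN_OK or an error code. */
typedef int (*open_fn_t)(void *baton);

/* Current monotonic time in milliseconds. */
static int64_t now_ms(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Sleep for MS milliseconds. */
static void sleep_ms(long ms)
{
  struct timespec ts;
  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (ms % 1000) * 1000000L;
  nanosleep(&ts, NULL);
}

/* Call OPEN_FN until it returns something other than OPEN_CANTOPEN or
   until TIMEOUT_MS has elapsed, pausing PAUSE_MS between attempts.
   Returns the result of the last attempt, so any other error code is
   passed through to the caller's normal error handling unchanged. */
static int open_with_retry(open_fn_t open_fn, void *baton,
                           int64_t timeout_ms, long pause_ms)
{
  int64_t start;
  int err;

  assert(open_fn != NULL);
  start = now_ms();
  err = open_fn(baton);
  while (err == OPEN_CANTOPEN && now_ms() - start < timeout_ms)
    {
      sleep_ms(pause_ms);
      err = open_fn(baton);
    }
  return err;
}

/* Demo opener: fails *BATON times with OPEN_CANTOPEN, then succeeds. */
static int flaky_open(void *baton)
{
  int *remaining_failures = baton;
  if (*remaining_failures > 0)
    {
      --*remaining_failures;
      return OPEN_CANTOPEN;
    }
  return OPEN_OK;
}
```

With flaky_open() and a counter of 3, open_with_retry() makes four
attempts and returns OPEN_OK. With an opener that never succeeds, it
gives up once the timeout has elapsed and returns OPEN_CANTOPEN.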

--- subversion/libsvn_subr/sqlite.c    (revision 1555441)
+++ subversion/libsvn_subr/sqlite.c    (working copy)
@@ -174,10 +174,11 @@ struct svn_sqlite__value_t
 } while (0)


-/* Time (in milliseconds) to wait for sqlite locks before giving up. */
+/* Time (in milliseconds) to wait for sqlite locks before giving up.
+   We use the same timeout for handling other concurrency issues in
+   sqlite's open function. */
 #define BUSY_TIMEOUT 10000

-
 /* Convenience wrapper around exec_sql2(). */
 #define exec_sql(db, sql) exec_sql2((db), (sql), SQLITE_OK)

@@ -838,6 +839,28 @@ internal_open(sqlite3 **db3, const char
          do this manually. */
       /* ### SQLITE_CANTOPEN */
       int err_code = sqlite3_open_v2(path, db3, flags, NULL);
+
+      /* SQLITE seems to have a race condition that prevents separate threads
+         from opening separate DBs at the same time. */
+      if (err_code == SQLITE_CANTOPEN)
+        {
+          /* Retry for approx. 10 seconds while we get the standard "can't
+             open" error return. */
+          apr_time_t start = apr_time_now();
+          while (   (err_code == SQLITE_CANTOPEN)
+                 && (apr_time_now() - start < BUSY_TIMEOUT * 1000))
+            {
+              /* The db object is in an undefined state - clean it up.
+                 We don't catch the error here, since we only care about the
+                 open error at this point. */
+              sqlite3_close(*db3);
+
+              /* Retry. */
+              err_code = sqlite3_open_v2(path, db3, flags, NULL);
+            }
+        }
+
+      /* Now back to normal error handling. */
       if (err_code != SQLITE_OK)
         {
           /* Save the error message before closing the SQLite handle. */