Posted to reviews@spark.apache.org by tdas <gi...@git.apache.org> on 2016/07/21 00:39:13 UTC

[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/14292

    [SPARK-14131][SQL[STREAMING] Improved fix for avoiding potential deadlocks in HDFSMetadataLog

    ## What changes were proposed in this pull request?
    The current fix for the deadlock disables interrupts in StreamExecution while getting offsets from all the sources and while writing to any metadata log, to avoid potential deadlocks in HDFSMetadataLog (see the JIRA for more details). However, disabling interrupts can have unintended consequences in other sources. So I am narrowing the fix by disabling interrupts only inside HDFSMetadataLog. This gives a much smaller scope to something as risky as disabling interrupts.
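
    For illustration, the narrowed approach can be sketched roughly as below. This is a simplified sketch, not the exact diff: `writeBatch` and the `UninterruptibleThread` check mirror the HDFSMetadataLog changes quoted later in this thread, and the surrounding bookkeeping is omitted.

    ```
    // Sketch: interrupts are disabled only around the metadata write inside
    // HDFSMetadataLog, instead of around the whole offset-fetching path in
    // StreamExecution. Hadoop's Shell.runCommand (pre-2.5.0, HADOOP-10622) can
    // deadlock if the thread running it is interrupted mid-call.
    import org.apache.spark.util.UninterruptibleThread

    def add(batchId: Long, metadata: T): Boolean = Thread.currentThread match {
      case ut: UninterruptibleThread =>
        // writeBatch creates the batch file via the HDFS API (which may invoke
        // Shell.runCommand to set file permissions), so it must not be interrupted.
        ut.runUninterruptibly { writeBatch(batchId, metadata) }
        true
      case _ =>
        throw new IllegalStateException(
          "HDFSMetadataLog.add() must be called on an UninterruptibleThread")
    }
    ```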
    
    ## How was this patch tested?
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-14131

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14292.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14292
    
----
commit d64e0c15bbda3c32bff4947b04f386dea9e73515
Author: Tathagata Das <ta...@gmail.com>
Date:   2016-07-21T00:33:57Z

    Improved deadlock fix

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72126474
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    +          }
    +        }
    +      }
    +      thread.setDaemon(true)
    +      thread.start()
    +      thread.join(timeoutMillis)
    +      if (thread.isAlive) {
    +        thread.interrupt()
    +        // If this interrupt does not work, then this thread is most likely running something that
    +        // is not interruptible. There is not much point to wait for the thread to termniate, and
    +        // we rather let the JVM terminate the thread on exit.
    +        fail(
    +          s"Test '$name' running on o.a.s.util.UninterruptibleThread timed out after" +
    +            s" $timeoutMillis ms")
    +      } else if (ex != null) {
    +        throw ex
    +      }
    +    }
    +
    +    if (quietly) {
    --- End diff --
    
    It does not compile that easily because `test` and `testQuietly` have different parameter signatures, and the code gets complicated trying to make it work.
    ```
    val f = if (quietly) testQuietly(name) _ else test(name) _
    f {
      runOnThread()
    }
    ```
    
    This is hard to understand.
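
    For reference, the plain branching version this argues for looks roughly like the sketch below (the quoted diff cuts off at the `if (quietly) {` line, so the exact body is assumed here):

    ```
    // Straightforward, if slightly repetitive: pick the test wrapper explicitly.
    if (quietly) {
      testQuietly(name) { runOnThread() }
    } else {
      test(name) { runOnThread() }
    }
    ```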




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #3189 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3189/consoleFull)** for PR 14292 at commit [`0e67e26`](https://github.com/apache/spark/commit/0e67e26c0690dfecb011c872f58baf87e1bde3c0).




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    test this




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62835/consoleFull)** for PR 14292 at commit [`26138b2`](https://github.com/apache/spark/commit/26138b2c57f417835dbc3462883670d258e72a0b).




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62645/consoleFull)** for PR 14292 at commit [`7a3e3fa`](https://github.com/apache/spark/commit/7a3e3fa4332370823e77b68c6281257574cffc8e).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72148009
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -269,19 +273,11 @@ class StreamExecution(
        * batchId counter is incremented and a new log entry is written with the newest offsets.
        */
       private def constructNextBatch(): Unit = {
    -    // There is a potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622).
    -    // If we interrupt some thread running Shell.runCommand, we may hit this issue.
    -    // As "FileStreamSource.getOffset" will create a file using HDFS API and call "Shell.runCommand"
    -    // to set the file permission, we should not interrupt "microBatchThread" when running this
    -    // method. See SPARK-14131.
    -    //
         // Check to see what new data is available.
         val hasNewData = {
           awaitBatchLock.lock()
           try {
    -        val newData = microBatchThread.runUninterruptibly {
    -          uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    -        }
    +        val newData = uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    --- End diff --
    
    Gave it some more thought. I don't use for comprehensions very often, but when I do... What do you think about this?
    
    ```
            val newData = for {
              source <- uniqueSources
              offset <- source.getOffset
            } yield (source, offset)
    ```
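
    (For comparison, the for comprehension above desugars to the flatMap/map form below, which may help when weighing it against the original one-liner:)

    ```
    // Desugared equivalent of the for comprehension.
    val newData = uniqueSources.flatMap(source => source.getOffset.map(offset => (source, offset)))
    ```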




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #3190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3190/consoleFull)** for PR 14292 at commit [`26138b2`](https://github.com/apache/spark/commit/26138b2c57f417835dbc3462883670d258e72a0b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71871337
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    --- End diff --
    
    Will it work?! You're on another thread here and closing over `ex`.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62687/consoleFull)** for PR 14292 at commit [`0e67e26`](https://github.com/apache/spark/commit/0e67e26c0690dfecb011c872f58baf87e1bde3c0).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62687/consoleFull)** for PR 14292 at commit [`0e67e26`](https://github.com/apache/spark/commit/0e67e26c0690dfecb011c872f58baf87e1bde3c0).




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72125535
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    --- End diff --
    
    done




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #3189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3189/consoleFull)** for PR 14292 at commit [`0e67e26`](https://github.com/apache/spark/commit/0e67e26c0690dfecb011c872f58baf87e1bde3c0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72127293
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -269,19 +273,11 @@ class StreamExecution(
        * batchId counter is incremented and a new log entry is written with the newest offsets.
        */
       private def constructNextBatch(): Unit = {
    -    // There is a potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622).
    -    // If we interrupt some thread running Shell.runCommand, we may hit this issue.
    -    // As "FileStreamSource.getOffset" will create a file using HDFS API and call "Shell.runCommand"
    -    // to set the file permission, we should not interrupt "microBatchThread" when running this
    -    // method. See SPARK-14131.
    -    //
         // Check to see what new data is available.
         val hasNewData = {
           awaitBatchLock.lock()
           try {
    -        val newData = microBatchThread.runUninterruptibly {
    -          uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    -        }
    +        val newData = uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    --- End diff --
    
    Your snippet converts it to `Seq[(Source, Option[Offset])]`. I find it more tedious to extract `Seq[(Source, Offset)]` from that.
    
    
    ```
    uniqueSources.map(s => (s, s.getOffset)).filter(_._2.nonEmpty).map { case (k, v) => (k, v.get)}
    ```




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71870509
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    +   * interrupt. Hence, we make sure that this method is called on UninterruptibleThread which
    +   * allows use disable interrupts. Also see SPARK-14131.
    --- End diff --
    
    "allow use disable interrupts"? Is this ok?




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71870635
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    +   * interrupt. Hence, we make sure that this method is called on UninterruptibleThread which
    --- End diff --
    
    `[[org.apache.spark.util.UninterruptibleThread]]` (as you do below)




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71924307
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    +          }
    +        }
    +      }
    +      thread.setDaemon(true)
    +      thread.start()
    +      thread.join(timeoutMillis)
    +      if (thread.isAlive) {
    +        thread.interrupt()
    +        // If this interrupt does not work, then this thread is most likely running something that
    +        // is not interruptible. There is not much point to wait for the thread to termniate, and
    +        // we rather let the JVM terminate the thread on exit.
    +        fail(
    +          s"Test '$name' running on o.a.s.util.UninterruptibleThread timed out after" +
    +            s" $timeoutMillis ms")
    +      } else if (ex != null) {
    +        throw ex
    +      }
    +    }
    +
    +    if (quietly) {
    --- End diff --
    
    It's more Scala-ish, but slightly non-intuitive to read. Maybe rename `f` to `testingFunc`.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72125645
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    +   * interrupt. Hence, we make sure that this method is called on UninterruptibleThread which
    +   * allows use disable interrupts. Also see SPARK-14131.
    --- End diff --
    
    "allows us to disable interrupts here"




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62644/
    Test FAILed.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72137376
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -269,19 +273,11 @@ class StreamExecution(
        * batchId counter is incremented and a new log entry is written with the newest offsets.
        */
       private def constructNextBatch(): Unit = {
    -    // There is a potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622).
    -    // If we interrupt some thread running Shell.runCommand, we may hit this issue.
    -    // As "FileStreamSource.getOffset" will create a file using HDFS API and call "Shell.runCommand"
    -    // to set the file permission, we should not interrupt "microBatchThread" when running this
    -    // method. See SPARK-14131.
    -    //
         // Check to see what new data is available.
         val hasNewData = {
           awaitBatchLock.lock()
           try {
    -        val newData = microBatchThread.runUninterruptibly {
    -          uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    -        }
    +        val newData = uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    --- End diff --
    
    I don't like it either, but... couldn't it be that the line is trying to do more than it really should? Perhaps the code should be two simpler functions composed? Just a wild thought... I don't want to hold it back.
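
    Purely as an illustration of the "two simpler functions" idea (the helper names and the split are made up for this sketch, not taken from the PR):

    ```
    // Hypothetical decomposition: fetch all offsets first, then keep only the
    // sources that actually returned one.
    def offsetsOf(sources: Seq[Source]): Seq[(Source, Option[Offset])] =
      sources.map(s => s -> s.getOffset)

    def definedOffsets(pairs: Seq[(Source, Option[Offset])]): Seq[(Source, Offset)] =
      pairs.collect { case (s, Some(o)) => (s, o) }

    val newData = definedOffsets(offsetsOf(uniqueSources))
    ```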




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62835/consoleFull)** for PR 14292 at commit [`26138b2`](https://github.com/apache/spark/commit/26138b2c57f417835dbc3462883670d258e72a0b).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71871490
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    +          }
    +        }
    +      }
    +      thread.setDaemon(true)
    +      thread.start()
    +      thread.join(timeoutMillis)
    +      if (thread.isAlive) {
    +        thread.interrupt()
    +        // If this interrupt does not work, then this thread is most likely running something that
    +        // is not interruptible. There is not much point to wait for the thread to termniate, and
    +        // we rather let the JVM terminate the thread on exit.
    +        fail(
    +          s"Test '$name' running on o.a.s.util.UninterruptibleThread timed out after" +
    +            s" $timeoutMillis ms")
    +      } else if (ex != null) {
    +        throw ex
    +      }
    +    }
    +
    +    if (quietly) {
    --- End diff --
    
    I'd appreciate your comment on the following alternative:
    
    ```
    val f = if (quietly) testQuietly else test
    f(name) { runOnThread() }
    ```
    
    ?




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62645/
    Test FAILed.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    @marmbrus Can you take a look.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62644/consoleFull)** for PR 14292 at commit [`d64e0c1`](https://github.com/apache/spark/commit/d64e0c15bbda3c32bff4947b04f386dea9e73515).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72127350
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    +   * interrupt. Hence, we make sure that this method is called on UninterruptibleThread which
    --- End diff --
    
    fixed.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71923960
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    --- End diff --
    
    My bad. `ex` needs to be `@volatile`.
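
    A minimal sketch of the pattern in question, assuming the field is made `@volatile` (the names here are invented; this is not the exact SQLTestUtils code): the exception thrown on the spawned thread is stored in a field and rethrown by the joining thread, and the volatile write keeps it visible across the thread boundary.

    ```
    import scala.util.control.NonFatal

    class BackgroundRunner(body: => Unit) {
      @volatile private var ex: Throwable = null  // published safely to the joining thread

      def runAndRethrow(timeoutMillis: Long): Unit = {
        val thread = new Thread("background-runner") {
          override def run(): Unit =
            try body catch { case NonFatal(e) => ex = e }
        }
        thread.setDaemon(true)
        thread.start()
        thread.join(timeoutMillis)
        if (ex != null) throw ex  // surface the failure on the calling thread
      }
    }
    ```

    In the PR's helper the same idea is inlined directly in `testWithUninterruptibleThread`; the wrapper class above only keeps the sketch self-contained.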




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #3190 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3190/consoleFull)** for PR 14292 at commit [`26138b2`](https://github.com/apache/spark/commit/26138b2c57f417835dbc3462883670d258e72a0b).




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62644/consoleFull)** for PR 14292 at commit [`d64e0c1`](https://github.com/apache/spark/commit/d64e0c15bbda3c32bff4947b04f386dea9e73515).




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62835/
    Test FAILed.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14292




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Tests have passed. Merging this to master and 2.0. Thanks for reviewing @zsxwing @jaceklaskowski 




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    @tdas this change breaks the tests, as they don't run on an UninterruptibleThread




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    LGTM. Pending tests.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71871037
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
    @@ -269,19 +273,11 @@ class StreamExecution(
        * batchId counter is incremented and a new log entry is written with the newest offsets.
        */
       private def constructNextBatch(): Unit = {
    -    // There is a potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622).
    -    // If we interrupt some thread running Shell.runCommand, we may hit this issue.
    -    // As "FileStreamSource.getOffset" will create a file using HDFS API and call "Shell.runCommand"
    -    // to set the file permission, we should not interrupt "microBatchThread" when running this
    -    // method. See SPARK-14131.
    -    //
         // Check to see what new data is available.
         val hasNewData = {
           awaitBatchLock.lock()
           try {
    -        val newData = microBatchThread.runUninterruptibly {
    -          uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    -        }
    +        val newData = uniqueSources.flatMap(s => s.getOffset.map(o => s -> o))
    --- End diff --
    
    Just a single line, but it takes a while to figure out what it does. I'd rewrite it to:
    
    ```
    uniqueSources.map(s => (s, s.getOffset))...
    ```
    
    and would do more transformations depending on the types (didn't check in an IDE). Just an idea to untangle the knots :)




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    **[Test build #62645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62645/consoleFull)** for PR 14292 at commit [`7a3e3fa`](https://github.com/apache/spark/commit/7a3e3fa4332370823e77b68c6281257574cffc8e).




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r71870459
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
    @@ -91,18 +92,30 @@ class HDFSMetadataLog[T: ClassTag](sparkSession: SparkSession, path: String)
         serializer.deserialize[T](ByteBuffer.wrap(bytes))
       }
     
    +  /**
    +   * Store the metadata for the specified batchId and return `true` if successful. If the batchId's
    +   * metadata has already been stored, this method will return `false`.
    +   *
    +   * Note that this method must be called on a [[org.apache.spark.util.UninterruptibleThread]]
    +   * so that interrupts can be disabled while writing the batch file. This is because there is a
    +   * potential dead-lock in Hadoop "Shell.runCommand" before 2.5.0 (HADOOP-10622). If the thread
    +   * running "Shell.runCommand" is interrupted, then the thread can get deadlocked. In our
    +   * case, `writeBatch` creates a file using HDFS API and calls "Shell.runCommand" to set the
    +   * file permissions, and can get deadlocked is the stream execution thread is stopped by
    --- End diff --
    
    s/is/if




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avo...

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14292#discussion_r72148282
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
    @@ -247,6 +248,46 @@ private[sql] trait SQLTestUtils
           }
         }
       }
    +
    +  /** Run a test on a separate [[UninterruptibleThread]]. */
    +  protected def testWithUninterruptibleThread(name: String, quietly: Boolean = false)
    +    (body: => Unit): Unit = {
    +    val timeoutMillis = 10000
    +    var ex: Throwable = null
    +
    +    def runOnThread(): Unit = {
    +      val thread = new UninterruptibleThread(s"Testing thread for test $name") {
    +        override def run(): Unit = {
    +          try {
    +            body
    +          } catch {
    +            case NonFatal(e) =>
    +              ex = e
    +          }
    +        }
    +      }
    +      thread.setDaemon(true)
    +      thread.start()
    +      thread.join(timeoutMillis)
    +      if (thread.isAlive) {
    +        thread.interrupt()
    +        // If this interrupt does not work, then this thread is most likely running something that
    +        // is not interruptible. There is not much point to wait for the thread to termniate, and
    +        // we rather let the JVM terminate the thread on exit.
    +        fail(
    +          s"Test '$name' running on o.a.s.util.UninterruptibleThread timed out after" +
    +            s" $timeoutMillis ms")
    +      } else if (ex != null) {
    +        throw ex
    +      }
    +    }
    +
    +    if (quietly) {
    --- End diff --
    
    What about this?
    
    ```
    val f = if (quietly) testQuietly(name) else test(name)
    f(runOnThread())
    ```




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62687/
    Test FAILed.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Fixing it.




[GitHub] spark issue #14292: [SPARK-14131][SQL[STREAMING] Improved fix for avoiding p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14292
  
    Merged build finished. Test FAILed.

