Posted to reviews@spark.apache.org by rajeshbalamohan <gi...@git.apache.org> on 2016/03/26 10:01:15 UTC
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
GitHub user rajeshbalamohan opened a pull request:
https://github.com/apache/spark/pull/11978
SPARK-14113. Consider marking JobConf closure-cleaning in HadoopRDD a…
## What changes were proposed in this pull request?
In HadoopRDD, the following code was introduced as a part of SPARK-6943.
```
if (initLocalJobConfFuncOpt.isDefined) {
  sparkContext.clean(initLocalJobConfFuncOpt.get)
}
```
Passing initLocalJobConfFuncOpt to HadoopRDD incurs a significant performance penalty (due to closure cleaning) when a large number of RDDs are created. The cleaning is invoked on every HadoopRDD initialization, which causes the bottleneck.
An example thread stack is given below:
```
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.readUTF8(Unknown Source)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:402)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:390)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:102)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:102)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:102)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:390)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$15.apply(ClosureCleaner.scala:224)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$15.apply(ClosureCleaner.scala:223)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:223)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2079)
at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:112)
```
This PR does the following:
1. Remove the closure cleaning in HadoopRDD init, which was mainly added to check whether HadoopRDD can be made serializable.
2. Directly instantiate HadoopRDD in OrcRelation, instead of going via SparkContext.hadoopRDD (which internally invokes a thread dump in "withScope"). This minor change is bundled here rather than filed as a separate ticket.
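For context on item 1: besides inspecting bytecode (as the thread stack above shows), closure cleaning is also where the serializability of a closure gets verified. The following is only a rough sketch of such a check using plain Java serialization. `SerializableCheckSketch` and `isSerializable` are hypothetical names introduced for illustration, and this is not Spark's ClosureCleaner.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper, not part of Spark: reports whether a value
// survives plain Java serialization.
object SerializableCheckSketch {
  def isSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(value) // throws if the object graph is not serializable
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }
}
```

Running a check like this on every RDD construction is cheap compared with the bytecode walk in the stack trace, but it still adds per-instance work that a trusted internal caller might not need.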
## How was this patch tested?
No new tests have been added. Used the following code to measure the overhead of the HadoopRDD init codepath. With the patch it took 340ms, as opposed to 4815ms without the patch.
Also tested with a number of TPC-DS queries in a multi-node environment. In addition, ran the following unit tests: org.apache.spark.sql.hive.execution.HiveCompatibilitySuite, org.apache.spark.sql.hive.execution.HiveQuerySuite, org.apache.spark.sql.hive.execution.PruningSuite, org.apache.spark.sql.hive.CachedTableSuite, org.apache.spark.rdd.RDDOperationScopeSuite, org.apache.spark.ui.jobs.JobProgressListenerSuite
```
test("Check timing for HadoopRDD init") {
  val start: Long = System.currentTimeMillis()
  val initializeJobConfFunc = HadoopTableReader.initializeLocalJobConfFunc("", null) _
  Utils.withDummyCallSite(sqlContext.sparkContext) {
    // Large tables end up creating 5500 RDDs
    for (i <- 1 to 5500) {
      // Ignore nulls in the RDD, as it is mainly for testing the timing of RDD creation
      val testRDD = new HadoopRDD(sqlContext.sparkContext, null, Some(initializeJobConfFunc),
        null, classOf[NullWritable], classOf[Writable], 10)
    }
  }
  val end: Long = System.currentTimeMillis()
  println("Time taken : " + (end - start))
}
```
Without Patch: (time taken to init 5000 HadoopRDD)
Time taken : 4815
With Patch: (time taken to init 5000 HadoopRDD)
Time taken : 340
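The start/end bookkeeping in the test above could be factored into a small helper so that both measurements run through identical code. This is only a sketch in plain Scala with no Spark dependency. `TimingSketch` and `timed` are hypothetical names, and `System.nanoTime` is swapped in for `System.currentTimeMillis` because it is monotonic.

```scala
// Hypothetical timing helper, not part of Spark or of this patch.
object TimingSketch {
  // Runs `body` once and returns its result with the elapsed wall time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }
}
```

With such a helper, the loop that creates the RDDs would be passed as `body`, and the returned milliseconds printed for the patched and unpatched builds.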
…s optional
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rajeshbalamohan/spark SPARK-14113
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11978.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11978
----
commit dfb6b03c5061dd8514fe09804c30c9281af50ab9
Author: Rajesh Balamohan <rb...@apache.org>
Date: 2016-03-26T08:58:17Z
SPARK-14113. Consider marking JobConf closure-cleaning in HadoopRDD as optional
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204300804
**[Test build #54687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54687/consoleFull)** for PR 11978 at commit [`0c53ed2`](https://github.com/apache/spark/commit/0c53ed23e3fb24cc5d882272ddca629843005629).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57609134
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -979,6 +979,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
// A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
val setInputPathsFunc = (jobConf: JobConf) => FileInputFormat.setInputPaths(jobConf, path)
+ clean(setInputPathsFunc)
--- End diff --
yeah, I think we need to add the cleaning for `hadoopRDD` too.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202619019
**[Test build #54375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54375/consoleFull)** for PR 11978 at commit [`d4e75d2`](https://github.com/apache/spark/commit/d4e75d2b306918be131a2d6ef70160f5a3353fe2).
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204301039
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54687/
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-211175800
@srowen - As per Andrew's comment, I thought it was fine to make the change given that HadoopRDD is marked as DeveloperAPI. Please let me know if any additional changes are needed.
Additional info: SPARK-13664 introduced a large number of changes for FileSourceStrategy, which is now marked as the default codepath. So, ideally, OrcRelation would no longer go via this codepath by default. Given that, this PR would have an impact if someone directly invokes HadoopRDD and has already done the closure cleaning up front.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201783935
Jenkins test this please
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204517168
@rajeshbalamohan the point of cleaning a closure is that it might be passed in by the user. If we provide the closure internally then we don't have to clean it. Before this patch we used to clean the user's closure in `sc.hadoopRDD`, but after this patch we don't do that anymore. That regresses behavior.
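The distinction drawn here, cleaning closures that come from the user while skipping it for closures supplied internally, can be sketched in plain Scala. All names below (`OptionalCleaningSketch`, `FakeHadoopRDD`, `internalScan`) are hypothetical stand-ins, and `clean` only counts invocations instead of doing any real cleaning.

```scala
// Hypothetical model of the two code paths; not Spark code.
object OptionalCleaningSketch {
  // Counts how often the stand-in cleaner runs.
  var cleanCalls: Int = 0
  def clean[F <: AnyRef](f: F): F = { cleanCalls += 1; f }

  // Stand-in for HadoopRDD: its constructor does not clean the closure.
  final class FakeHadoopRDD(val initFunc: Option[String => Unit])

  // Public entry point: the closure may come from the user, so clean it once here.
  def hadoopRDD(userFunc: String => Unit): FakeHadoopRDD =
    new FakeHadoopRDD(Some(clean(userFunc)))

  // Internal caller: the closure is supplied by the framework itself,
  // so each instance is constructed directly without cleaning.
  def internalScan(trustedFunc: String => Unit, count: Int): Seq[FakeHadoopRDD] =
    (1 to count).map(_ => new FakeHadoopRDD(Some(trustedFunc)))
}
```

In this model, creating thousands of instances through `internalScan` never triggers the cleaner, while a call through the public `hadoopRDD` entry point still cleans the user's closure exactly once.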
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57608918
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala ---
@@ -317,12 +317,11 @@ private[orc] case class OrcTableScan(
classOf[OrcInputFormat]
.asInstanceOf[Class[_ <: MapRedInputFormat[NullWritable, Writable]]]
- val rdd = sqlContext.sparkContext.hadoopRDD(
+ val rdd = new HadoopRDD(sqlContext.sparkContext,
--- End diff --
can you add a comment to say we're creating a HadoopRDD here directly to bypass closure cleaning
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202617742
retest this please
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57537799
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -979,6 +979,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
// A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
val setInputPathsFunc = (jobConf: JobConf) => FileInputFormat.setInputPaths(jobConf, path)
+ clean(setInputPathsFunc)
--- End diff --
Thanks @srowen. Yes, for invocations via sc.textFile. Adding an additional method like the following and passing initLocalJobConfFuncOpt to it can help avoid closure cleaning in this scenario. However, this would call for changes in all other places where sc.textFile is invoked. The intention was to allow users to make use of HadoopRDD directly (if needed) without having to incur the cost of closure cleaning (e.g. in SQL modules). Hence I did not make those additional changes.
```
def newTextFile(
    path: String,
    initLocalJobConfFuncOpt: Option[JobConf => Unit],
    minPartitions: Int = defaultMinPartitions): RDD[String] = withScope {
  assertNotStopped()
  hadoopFile(path, classOf[TextInputFormat], initLocalJobConfFuncOpt,
    classOf[LongWritable], classOf[Text],
    minPartitions).map(pair => pair._2.toString).setName(path)
}

def hadoopFile[K, V](
    path: String,
    inputFormatClass: Class[_ <: InputFormat[K, V]],
    initLocalJobConfFuncOpt: Option[JobConf => Unit],
    keyClass: Class[K],
    valueClass: Class[V],
    minPartitions: Int = defaultMinPartitions): RDD[(K, V)] = withScope {
  assertNotStopped()
  // A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
  val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
  new HadoopRDD(
    this,
    confBroadcast,
    initLocalJobConfFuncOpt,
    inputFormatClass,
    keyClass,
    valueClass,
    minPartitions).setName(path)
}

// e.g.
sc.newTextFile(tmpFilePath, Some(setInputPathsFunc), 4).count()
```
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r58714243
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -979,6 +979,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
// A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
val setInputPathsFunc = (jobConf: JobConf) => FileInputFormat.setInputPaths(jobConf, path)
+ clean(setInputPathsFunc)
--- End diff --
@rajeshbalamohan catching up here: I think the remaining TODO is that cleaning still needs to be restored for `hadoopRDD` right? then this is ready.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-212182468
(This is trivial but might be better if the title follows `[SPARK-XXXXX][SQL]` format as described in https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark.)
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan closed the pull request at:
https://github.com/apache/spark/pull/11978
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-210440963
ping @rajeshbalamohan ?
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204200260
Thanks @andrewor14 . Addressed your review comments in latest commit.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-214677094
@rajeshbalamohan looks like this needs a rebase now, and I think Andrew's comment still needs to be addressed. Are you suggesting this change is no longer needed or did I misunderstand your last comment? Let's resolve this one way or the other.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202048734
Tested with the following suites, along with the earlier SQL suites:
org.apache.spark.FileSuite
org.apache.spark.SparkContextSuite
org.apache.spark.graphx.GraphLoaderSuite
org.apache.spark.graphx.lib.SVDPlusPlusSuite
org.apache.spark.metrics.InputOutputMetricsSuite
org.apache.spark.ml.PipelineSuite
org.apache.spark.ml.classification.DecisionTreeClassifierSuite
org.apache.spark.ml.classification.LogisticRegressionSuite
org.apache.spark.ml.classification.MultilayerPerceptronClassifierSuite
org.apache.spark.ml.classification.NaiveBayesSuite
org.apache.spark.ml.clustering.KMeansSuite
org.apache.spark.ml.clustering.LDASuite
org.apache.spark.ml.evaluation.BinaryClassificationEvaluatorSuite
org.apache.spark.ml.evaluation.MulticlassClassificationEvaluatorSuite
org.apache.spark.ml.evaluation.RegressionEvaluatorSuite
org.apache.spark.ml.feature.BinarizerSuite
org.apache.spark.ml.feature.BucketizerSuite
org.apache.spark.ml.feature.ChiSqSelectorSuite
org.apache.spark.ml.feature.CountVectorizerSuite
org.apache.spark.ml.feature.DCTSuite
org.apache.spark.ml.feature.ElementwiseProductSuite
org.apache.spark.ml.feature.HashingTFSuite
org.apache.spark.ml.feature.IDFSuite
org.apache.spark.ml.feature.InteractionSuite
org.apache.spark.ml.feature.MaxAbsScalerSuite
org.apache.spark.ml.feature.MinMaxScalerSuite
org.apache.spark.ml.feature.NGramSuite
org.apache.spark.ml.feature.NormalizerSuite
org.apache.spark.ml.feature.OneHotEncoderSuite
org.apache.spark.ml.feature.PCASuite
org.apache.spark.ml.feature.PolynomialExpansionSuite
org.apache.spark.ml.feature.QuantileDiscretizerSuite
org.apache.spark.ml.feature.RFormulaSuite
org.apache.spark.ml.feature.RegexTokenizerSuite
org.apache.spark.ml.feature.SQLTransformerSuite
org.apache.spark.ml.feature.StandardScalerSuite
org.apache.spark.ml.feature.StopWordsRemoverSuite
org.apache.spark.ml.feature.StringIndexerSuite
org.apache.spark.ml.feature.TokenizerSuite
org.apache.spark.ml.feature.VectorAssemblerSuite
org.apache.spark.ml.feature.VectorIndexerSuite
org.apache.spark.ml.feature.VectorSlicerSuite
org.apache.spark.ml.feature.Word2VecSuite
org.apache.spark.ml.recommendation.ALSSuite
org.apache.spark.ml.regression.AFTSurvivalRegressionSuite
org.apache.spark.ml.regression.DecisionTreeRegressorSuite
org.apache.spark.ml.regression.GeneralizedLinearRegressionSuite
org.apache.spark.ml.regression.IsotonicRegressionSuite
org.apache.spark.ml.regression.LinearRegressionSuite
org.apache.spark.ml.source.libsvm.LibSVMRelationSuite
org.apache.spark.ml.tuning.CrossValidatorSuite
org.apache.spark.ml.util.DefaultReadWriteSuite
org.apache.spark.mllib.classification.LogisticRegressionSuite
org.apache.spark.mllib.classification.NaiveBayesSuite
org.apache.spark.mllib.classification.SVMSuite
org.apache.spark.mllib.clustering.GaussianMixtureSuite
org.apache.spark.mllib.clustering.KMeansSuite
org.apache.spark.mllib.clustering.LDASuite
org.apache.spark.mllib.clustering.PowerIterationClusteringSuite
org.apache.spark.mllib.feature.ChiSqSelectorSuite
org.apache.spark.mllib.feature.Word2VecSuite
org.apache.spark.mllib.fpm.FPGrowthSuite
org.apache.spark.mllib.recommendation.MatrixFactorizationModelSuite
org.apache.spark.mllib.regression.IsotonicRegressionSuite
org.apache.spark.mllib.regression.LassoSuite
org.apache.spark.mllib.regression.LinearRegressionSuite
org.apache.spark.mllib.regression.RidgeRegressionSuite
org.apache.spark.mllib.tree.DecisionTreeSuite
org.apache.spark.mllib.tree.GradientBoostedTreesSuite
org.apache.spark.mllib.tree.RandomForestSuite
org.apache.spark.mllib.util.MLUtilsSuite
org.apache.spark.rdd.HadoopRDD
org.apache.spark.rdd.MapPartitionsRDD
org.apache.spark.rdd.PairRDDFunctionsSuite
org.apache.spark.repl.ReplSuite
org.apache.spark.sql.execution.datasources.csv.CSVSuite
org.apache.spark.sql.execution.datasources.json.JsonSuite
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204255888
**[Test build #54687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54687/consoleFull)** for PR 11978 at commit [`0c53ed2`](https://github.com/apache/spark/commit/0c53ed23e3fb24cc5d882272ddca629843005629).
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204255875
@rajeshbalamohan you need to clean `sc.hadoopRDD` too.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204255624
retest this please
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-214705705
@srowen - With the master codebase and the changes that went in (FileSourceStrategy, to be specific), this PR would no longer be very relevant in master. It would be more relevant for the 1.6.x line, but I am not sure if we need to backport it. Will mark it as closed now. Please let me know and I can close this PR.
---
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rajeshbalamohan <gi...@git.apache.org>.
Github user rajeshbalamohan commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204307805
@andrewor14 - I am not sure I understood your last comment. Currently, Spark makes no direct invocation of HadoopRDD (with initLocalJobConfFuncOpt). If a later change needs to invoke HadoopRDD (with initLocalJobConfFuncOpt) via SparkContext, the following method could be added, which cleans the function.
```
def hadoopRDD[K, V](
    broadcastedConf: Broadcast[SerializableConfiguration],
    initLocalJobConfFuncOpt: Option[JobConf => Unit],
    inputFormatClass: Class[_ <: InputFormat[K, V]],
    keyClass: Class[K],
    valueClass: Class[V],
    minPartitions: Int = defaultMinPartitions): RDD[(K, V)] = withScope {
  assertNotStopped()
  // Clean the wrapped function, if any, rather than the Option itself.
  initLocalJobConfFuncOpt.foreach(f => clean(f))
  new HadoopRDD(this, broadcastedConf, initLocalJobConfFuncOpt,
    inputFormatClass, keyClass, valueClass, minPartitions)
}
```
But, I am not sure whether we need to clean sc.hadoopRDD in this patch. Please let me know.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57529953
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -979,6 +979,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
// A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
val setInputPathsFunc = (jobConf: JobConf) => FileInputFormat.setInputPaths(jobConf, path)
+ clean(setInputPathsFunc)
--- End diff --
Doesn't this still cause the closure cleaning to happen once per `HadoopRDD`?
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-204301037
Merged build finished. Test PASSed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201828591
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54261/
Test FAILed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202658203
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54375/
Test PASSed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-215924087
@rajeshbalamohan go ahead and close this. Thanks!
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202657649
**[Test build #54375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54375/consoleFull)** for PR 11978 at commit [`d4e75d2`](https://github.com/apache/spark/commit/d4e75d2b306918be131a2d6ef70160f5a3353fe2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57559696
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -979,6 +979,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
// A Hadoop configuration can be about 10 KB, which is pretty big, so broadcast it.
val confBroadcast = broadcast(new SerializableConfiguration(hadoopConfiguration))
val setInputPathsFunc = (jobConf: JobConf) => FileInputFormat.setInputPaths(jobConf, path)
+ clean(setInputPathsFunc)
--- End diff --
OK, yeah, it's added back for `hadoopFile` calls but not for others, like `hadoopRDD` or direct use. I don't know enough to evaluate this change with authority. If it's there for correctness it has to be there, and this does look like a function that needs cleaning, right? It may happen not to need cleaning in the case you're exercising, but that doesn't mean it is safe to skip cleaning in general.
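As an illustration of why what a function captures matters before it is shipped to executors, here is a minimal sketch using plain Java serialization. It does not reproduce Spark's `ClosureCleaner`; `PathHolder`, `capturesOuter`, `capturesLocal`, and `isSerializable` are hypothetical names introduced only for this example. A function that reads a field of a non-serializable enclosing object holds a reference to that object, while one that reads only a local value (as `setInputPathsFunc` does with `path` in the diff above) does not.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable enclosing class.
class PathHolder(val path: String) {
  // Reads the field `path`, so the function object holds a reference to `this`.
  def capturesOuter: String => String = (conf: String) => s"$conf:$path"

  // Copies the field into a local val first, so only the String is captured.
  def capturesLocal: String => String = {
    val localPath = path
    (conf: String) => s"$conf:$localPath"
  }
}

object ClosureCaptureSketch {
  // Returns true if plain Java serialization accepts the object.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val holder = new PathHolder("/data/input")
    println(s"captures outer: ${isSerializable(holder.capturesOuter)}")
    println(s"captures local: ${isSerializable(holder.capturesLocal)}")
  }
}
```

Whether a given call site actually needs cleaning depends on what its function captures, which is the question raised above; the sketch only shows the two cases side by side.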
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201828050
**[Test build #54261 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54261/consoleFull)** for PR 11978 at commit [`dfb6b03`](https://github.com/apache/spark/commit/dfb6b03c5061dd8514fe09804c30c9281af50ab9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-211263001
I'm referring to https://github.com/apache/spark/pull/11978#issuecomment-204517168 which suggests `hadoopRDD` calls need to be cleaned.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201828588
Merged build finished. Test FAILed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202515698
Looks good.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201738816
Can one of the admins verify this patch?
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-202658198
Merged build finished. Test PASSed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r57608860
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala ---
@@ -317,12 +317,11 @@ private[orc] case class OrcTableScan(
classOf[OrcInputFormat]
.asInstanceOf[Class[_ <: MapRedInputFormat[NullWritable, Writable]]]
- val rdd = sqlContext.sparkContext.hadoopRDD(
+ val rdd = new HadoopRDD(sqlContext.sparkContext,
conf.asInstanceOf[JobConf],
inputFormatClass,
classOf[NullWritable],
- classOf[Writable]
- ).asInstanceOf[HadoopRDD[NullWritable, Writable]]
+ classOf[Writable], sqlContext.sparkContext.defaultMinPartitions)
--- End diff --
split this line
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201783910
CC @andrewor14 possibly to comment on whether this closure cleaning can be removed.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/11978#discussion_r58250294
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -109,10 +109,6 @@ class HadoopRDD[K, V](
minPartitions: Int)
extends RDD[(K, V)](sc, Nil) with Logging {
- if (initLocalJobConfFuncOpt.isDefined) {
- sparkContext.clean(initLocalJobConfFuncOpt.get)
--- End diff --
Technically this might also regress behavior since `HadoopRDD` is not private, but it's probably OK if this change improves performance, since this class is a `DeveloperApi`.
[GitHub] spark pull request: SPARK-14113. Consider marking JobConf closure-...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11978#issuecomment-201786327
**[Test build #54261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54261/consoleFull)** for PR 11978 at commit [`dfb6b03`](https://github.com/apache/spark/commit/dfb6b03c5061dd8514fe09804c30c9281af50ab9).