Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/08 16:21:02 UTC

[GitHub] [spark] masseyke opened a new pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

masseyke opened a new pull request #34838:
URL: https://github.com/apache/spark/pull/34838


   This commit adds support for RDDs containing ShortWritables to pyspark. Currently, if a user calls sc.newAPIHadoopRDD() with an InputFormat that produces ShortWritables, the call fails with an error like the one below, because PythonHadoopUtil does not explicitly handle ShortWritable.
   ```
   >>> rdd = sc.newAPIHadoopRDD(inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
   ...                          keyClass="org.apache.hadoop.io.NullWritable",
   ...                          valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
   ...                          conf=conf)
   2021-12-08 14:38:40,439 ERROR scheduler.TaskSetManager: task 0.0 in stage 15.0 (TID 31) had a not serializable result: org.apache.hadoop.io.ShortWritable
   Serialization stack:
   	- object not serializable (class: org.apache.hadoop.io.ShortWritable, value: 1)
   	- writeObject data (class: java.util.HashMap)
   	- object (class java.util.HashMap, {price=1})
   	- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
   	- object (class scala.Tuple2, (1,{price=1}))
   	- element of array (index: 0)
   	- array (class [Lscala.Tuple2;, size 1); not retrying
   Traceback (most recent call last):
     File "<stdin>", line 4, in <module>
     File "/home/hduser/spark-3.1.2-bin-hadoop3.2/python/pyspark/context.py", line 853, in newAPIHadoopRDD
       jconf, batchSize)
     File "/home/hduser/spark-3.1.2-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
     File "/home/hduser/spark-3.1.2-bin-hadoop3.2/python/pyspark/sql/utils.py", line 111, in deco
       return f(*a, **kw)
     File "/home/hduser/spark-3.1.2-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
   : org.apache.spark.SparkException: Job aborted due to stage failure: task 0.0 in stage 15.0 (TID 31) had a not serializable result: org.apache.hadoop.io.ShortWritable
   Serialization stack:
   	- object not serializable (class: org.apache.hadoop.io.ShortWritable, value: 1)
   	- writeObject data (class: java.util.HashMap)
   	- object (class java.util.HashMap, {price=1})
   	- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
   	- object (class scala.Tuple2, (1,{price=1}))
   	- element of array (index: 0)
   	- array (class [Lscala.Tuple2;, size 1)
   	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2258)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2207)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2206)
   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
   	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2206)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1079)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1079)
   	at scala.Option.foreach(Option.scala:407)
   	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1079)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2445)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2387)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2376)
   	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2196)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2217)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2236)
   	at org.apache.spark.rdd.RDD.$anonfun$take$1(RDD.scala:1449)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
   	at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
   	at org.apache.spark.rdd.RDD.take(RDD.scala:1422)
   	at org.apache.spark.api.python.SerDeUtil$.pairRDDToPython(SerDeUtil.scala:173)
   	at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:385)
   	at org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:748)
   ```
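
The failure mode described above can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the classes below are stand-ins rather than the Hadoop types, and `convert`, `HANDLED_BEFORE`, `HANDLED_AFTER` and `is_plain` are hypothetical names, not Spark's API. It assumes, as the serialization stack suggests, that a wrapper the converter does not recognize is passed through unchanged and is left sitting inside the converted map.

```python
# Stand-in models of Hadoop Writable wrappers (NOT the real classes).
class Writable:
    def __init__(self, value):
        self.value = value


class IntWritable(Writable):
    pass


class ShortWritable(Writable):
    pass


class Text(Writable):
    pass


# Hypothetical sets of wrapper types the converter knows how to unwrap.
HANDLED_BEFORE = (IntWritable, Text)
HANDLED_AFTER = HANDLED_BEFORE + (ShortWritable,)


def convert(obj, handled):
    """Unwrap known wrappers, recurse into maps, pass everything else through."""
    if isinstance(obj, handled):
        return obj.value
    if isinstance(obj, dict):
        return {convert(k, handled): convert(v, handled) for k, v in obj.items()}
    return obj  # an unhandled wrapper survives conversion unchanged


def is_plain(obj):
    """True when no wrapper object is left anywhere in the converted value."""
    if isinstance(obj, dict):
        return all(is_plain(k) and is_plain(v) for k, v in obj.items())
    return not isinstance(obj, Writable)


# Shaped like the record in the error message: {price=1}.
record = {Text("price"): ShortWritable(1)}
before = convert(record, HANDLED_BEFORE)  # value is still a ShortWritable
after = convert(record, HANDLED_AFTER)    # value is a plain integer
```

In this model `is_plain(before)` is false, which corresponds to the "object not serializable" entry nested under `java.util.HashMap` in the stack above, while `after` contains only plain values.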


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989428630


   We would also have to file a JIRA. See also https://spark.apache.org/contributing.html


[GitHub] [spark] AmplabJenkins commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990133363


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146043/
   


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991410595


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146082/
   


[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991103253


   I'm unclear on why tests are failing. They actually don't seem to be executing. I believe I've enabled everything I need to in the actions tab of my fork. Any tips on what I might be missing? Thanks.


[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991348936


   I added ByteWritable, and added roundtrip tests for all of the basic types. That did mean that I needed to update convertToWritable() as well.
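
A round-trip test of the kind mentioned here could be sketched as follows. The PR's actual tests are not shown in this thread, so everything below is a hypothetical Python model: `Wrapped`, `RANGES`, `convert_to_writable`, `convert` and `roundtrip_ok` are illustrative names, and only the Java integer ranges are taken as fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wrapped:
    """Stand-in for a Writable: the wrapper's name plus the wrapped value."""
    kind: str
    value: int


# Java integer type -> (wrapper name, minimum value, maximum value)
RANGES = {
    "Byte": ("ByteWritable", -2**7, 2**7 - 1),
    "Short": ("ShortWritable", -2**15, 2**15 - 1),
    "Integer": ("IntWritable", -2**31, 2**31 - 1),
    "Long": ("LongWritable", -2**63, 2**63 - 1),
}


def convert_to_writable(java_type, value):
    """Wrap a value, rejecting anything outside the Java type's range."""
    kind, lo, hi = RANGES[java_type]
    if not lo <= value <= hi:
        raise OverflowError(f"{value} does not fit in a Java {java_type}")
    return Wrapped(kind, value)


def convert(wrapped):
    """Unwrap a value again (the reverse direction)."""
    return wrapped.value


def roundtrip_ok(java_type, value):
    """A value should come back unchanged after wrapping and unwrapping."""
    return convert(convert_to_writable(java_type, value)) == value


# Exercise the minimum, zero and maximum of every modeled type.
results = [
    roundtrip_ok(java_type, v)
    for java_type, (_, lo, hi) in RANGES.items()
    for v in (lo, 0, hi)
]
```

Testing the boundary values of each type is a design choice of this sketch; it is the cheapest way to notice a wrapper that silently widens or truncates.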


[GitHub] [spark] AmplabJenkins commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991398173


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50557/
   


[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990063081


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50518/
   


[GitHub] [spark] SparkQA removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991361257


   **[Test build #146082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146082/testReport)** for PR 34838 at commit [`63f3bfe`](https://github.com/apache/spark/commit/63f3bfec9b7a48b4e2296a17c2b0b36ecb7537a7).


[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991293683


   > Looks like we are missing ByteWritable too, can you add that here? Bonus points for adding tests for all the primitive types, as you started a test suite, but not required.
   
   Oh you're right -- I missed that there's a separate ByteWritable from BytesWritable. I'll add that (and some more tests). 
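
The difference between the two similarly named classes is easy to see at the byte level. The sketch below is a Python model, not Hadoop code, and the helper names are hypothetical. It assumes the layouts as I understand Hadoop's `DataOutput`-based serialization: ByteWritable writes a single signed byte, ShortWritable a 16-bit big-endian value, and BytesWritable a 32-bit length followed by the raw bytes.

```python
import struct


def byte_writable_bytes(value):
    """Model of ByteWritable: one signed 8-bit value."""
    return struct.pack(">b", value)


def short_writable_bytes(value):
    """Model of ShortWritable: one signed 16-bit value, big-endian."""
    return struct.pack(">h", value)


def bytes_writable_bytes(payload):
    """Model of BytesWritable: 32-bit length prefix, then the byte array."""
    return struct.pack(">i", len(payload)) + payload


single = byte_writable_bytes(1)         # a single byte
short = short_writable_bytes(1)         # two bytes
array = bytes_writable_bytes(b"\x01")   # four length bytes plus the payload
```

A converter that handles only the array form therefore says nothing about the single-value form, which is why the two need separate cases.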


[GitHub] [spark] SparkQA removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989366938


   **[Test build #146018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146018/testReport)** for PR 34838 at commit [`c32f6f3`](https://github.com/apache/spark/commit/c32f6f3278b66e6577e1660b08659765f04000d3).


[GitHub] [spark] HyukjinKwon commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990763844


   LGTM


[GitHub] [spark] dongjoon-hyun commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-993149588


   Sorry for the late reply, @srowen. I'm +1 for @HyukjinKwon's analysis.
   > Maybe a question for dongjoon-hyun or others more familiar with the GA builds 


[GitHub] [spark] HyukjinKwon commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991998470


   Nope, that error is likely from https://github.com/apache/spark/commit/16d1c68e8b185457ae86a248d0874e61c3bc6f3a. The latest master branch requires that package, but the old branch in the forked repository does not have it. It shouldn't be an issue.


[GitHub] [spark] HyukjinKwon commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991818060


   @masseyke, can you retrigger https://github.com/masseyke/spark/actions/runs/1565441431 and see how it goes? I think there was an issue in the GitHub Actions (GA) environment. A run generally takes around 1.5 hours.


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990713499


   Can one of the admins verify this patch?


[GitHub] [spark] SparkQA commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991405611


   **[Test build #146082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146082/testReport)** for PR 34838 at commit [`63f3bfe`](https://github.com/apache/spark/commit/63f3bfec9b7a48b4e2296a17c2b0b36ecb7537a7).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] AmplabJenkins commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989486030


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146018/
   


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990133363


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146043/
   


[GitHub] [spark] srowen commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991290049


   I even wonder: should we support String -> Text Writable here too?


[GitHub] [spark] masseyke commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989885218


   
   > Looks good but how did you test this? it would be great to keep GitHub PR template: https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE. Also Apache Spark uses GitHub Actions from forked repository, see also https://github.com/apache/spark/pull/34838/checks?check_run_id=4459879248
   
   I didn't see an existing unit test for that class, so I manually tested it by creating a mapping with a short field in Elasticsearch, and then reading that data with es-spark (it fails before the patch). I can put the template back in and add a JIRA ticket. I believe I've enabled GitHub Actions now (not completely sure), but I don't know how to re-trigger the build.
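
The manual test described above might look roughly like this. It is a sketch, not the author's actual script: the index name, node address and helper names are hypothetical, the field name `price` is borrowed from the error output earlier in the thread, and `es.resource` and `es.nodes` are elasticsearch-hadoop settings. Running `read_index` needs a live SparkContext with the elasticsearch-hadoop jar on the classpath and a reachable Elasticsearch node.

```python
import json

# Hypothetical index name and node address, chosen for illustration.
INDEX = "short-test"
NODES = "localhost:9200"


def short_field_mapping(field="price"):
    """Index-creation body declaring one field of Elasticsearch type `short`."""
    return {"mappings": {"properties": {field: {"type": "short"}}}}


def es_read_conf(index=INDEX, nodes=NODES):
    """Settings passed as `conf` to sc.newAPIHadoopRDD()."""
    return {"es.resource": index, "es.nodes": nodes}


def read_index(sc, index=INDEX):
    """Read the index through EsInputFormat, as in the call quoted in the PR description."""
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=es_read_conf(index),
    )


# JSON body that could be sent when creating the index.
mapping_json = json.dumps(short_field_mapping())
```

According to the thread, a call like `read_index(sc)` against such an index fails before the patch; afterwards the short field should arrive in Python as a plain integer.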


[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991294874


   > I even wonder: should we support String -> Text Writable here too?
   
   Not sure what you mean here -- we do support Text -> String. Maybe a typo, and you meant we ought to support Short -> ShortWritable in convertToWritable()? That would make sense to me.


[GitHub] [spark] AmplabJenkins commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989459508


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50494/
   


[GitHub] [spark] srowen commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991337476


   Sorry, no, I was just not reading carefully. String is already handled in the code; disregard that.


[GitHub] [spark] HyukjinKwon closed pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34838:
URL: https://github.com/apache/spark/pull/34838


   


[GitHub] [spark] AmplabJenkins commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991410595


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146082/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991398173








[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989366938


   **[Test build #146018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146018/testReport)** for PR 34838 at commit [`c32f6f3`](https://github.com/apache/spark/commit/c32f6f3278b66e6577e1660b08659765f04000d3).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989644431


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989448931


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50494/
   




[GitHub] [spark] SparkQA commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991361257


   **[Test build #146082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146082/testReport)** for PR 34838 at commit [`63f3bfe`](https://github.com/apache/spark/commit/63f3bfec9b7a48b4e2296a17c2b0b36ecb7537a7).




[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989428887


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50494/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989644431


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991378325


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50557/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990088378


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50518/
   




[GitHub] [spark] srowen commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991945732


   The error looks unrelated, I think:
   ```
   Error in loadNamespace(x) : there is no package called ‘pkgdown’
   Calls: loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
   Execution halted
   ```




[GitHub] [spark] HyukjinKwon commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989364617


   ok to test




[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990025177


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50518/
   




[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991818905


   > @masseyke, can you retrigger https://github.com/masseyke/spark/actions/runs/1565441431 and see how it goes? I think there was an issue in the GitHub Actions environment. It should generally take around 1.5 hours.
   
   I have retriggered it.




[GitHub] [spark] SparkQA removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991361257


   **[Test build #146082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146082/testReport)** for PR 34838 at commit [`63f3bfe`](https://github.com/apache/spark/commit/63f3bfec9b7a48b4e2296a17c2b0b36ecb7537a7).




[GitHub] [spark] SparkQA commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990115313


   **[Test build #146043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146043/testReport)** for PR 34838 at commit [`405cf44`](https://github.com/apache/spark/commit/405cf44e4b8244efa08f1ca7d0809ba4f5e864c5).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `class PythonHadoopUtilSuite extends SparkFunSuite `




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-988990580


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991391253


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50557/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991998648


   Merged to master.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989459508


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50494/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991398173


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50557/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990088378


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50518/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989984331


   **[Test build #146043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146043/testReport)** for PR 34838 at commit [`405cf44`](https://github.com/apache/spark/commit/405cf44e4b8244efa08f1ca7d0809ba4f5e864c5).




[GitHub] [spark] AmplabJenkins commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-988990580


   Can one of the admins verify this patch?




[GitHub] [spark] masseyke commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991293683








[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989483809


   **[Test build #146018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146018/testReport)** for PR 34838 at commit [`c32f6f3`](https://github.com/apache/spark/commit/c32f6f3278b66e6577e1660b08659765f04000d3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989486030


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146018/
   




[GitHub] [spark] SparkQA commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-989984331


   **[Test build #146043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146043/testReport)** for PR 34838 at commit [`405cf44`](https://github.com/apache/spark/commit/405cf44e4b8244efa08f1ca7d0809ba4f5e864c5).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-990713499








[GitHub] [spark] srowen commented on pull request #34838: [SPARK-37598][PYTHON] Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-991683923


   Maybe a question for @dongjoon-hyun or others more familiar with the GA builds: this fails because it takes more than 6 hours. Are we facing that problem in general with GA tests, and is there anything we can do other than try to shorten the tests?




[GitHub] [spark] masseyke commented on pull request #34838: Adding support for ShortWriables to pyspark's newAPIHadoopRDD method

Posted by GitBox <gi...@apache.org>.
masseyke commented on pull request #34838:
URL: https://github.com/apache/spark/pull/34838#issuecomment-988964444


   This was originally reported here: https://discuss.elastic.co/t/es-hadoop-serialize-org-apache-hadoop-io-shortwritable-failed/276894

