Posted to reviews@spark.apache.org by ueshin <gi...@git.apache.org> on 2014/10/03 01:42:09 UTC

[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

GitHub user ueshin opened a pull request:

    https://github.com/apache/spark/pull/2638

    [SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent breaking binary-compatibility.

    Original problem is [SPARK-3764](https://issues.apache.org/jira/browse/SPARK-3764).
    
    `AppendingParquetOutputFormat` calls `context.getTaskAttemptID`, a method that is not binary-compatible between Hadoop versions.
    This makes Spark itself binary-incompatible, i.e. if Spark is built against hadoop-1, the artifact works only with hadoop-1, and vice versa.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-3771

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2638.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2638
    
----
commit ec213c160393698fa01c62469263f050f3668453
Author: Takuya UESHIN <ue...@happy-camper.st>
Date:   2014-10-02T14:22:46Z

    Use reflection to prevent breaking binary-compatibility.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-57746994
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/259/consoleFull) for   PR 2638 at commit [`ec213c1`](https://github.com/apache/spark/commit/ec213c160393698fa01c62469263f050f3668453).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `          println(s"Failed to load main class $childMainClass.")`
      * `  case class GetPeers(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`





[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-57791903
  
    A particular instance of Spark will be built for a particular version of Hadoop and/or YARN. It is not at this point a universal binary anyway, and so, I do not think it is necessary to add this indirection via reflection. That is, if you are deploying on Hadoop 1, you need to build Spark for Hadoop 1, and similarly for Hadoop 2. 




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-57800537
  
    @srowen, Thank you for your comment.
    Indeed, when a finished app is deployed to a Spark cluster, it runs on a particular build of Spark.
    But Spark app developers use the artifacts in Maven Central while developing and unit-testing. Those artifacts seem to be built for Hadoop 2, so if developers want to test with Hadoop 1, it won't work.
    What do you think?




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58596020
  
    @marmbrus, Thank you for your comment.
    The `TaskAttemptContext` is a class in [hadoop-1](https://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/TaskAttemptContext.html) but is an interface in [hadoop-2](http://hadoop.apache.org/docs/r2.5.1/api/org/apache/hadoop/mapreduce/TaskAttemptContext.html).
    The signature of `TaskAttemptContext.getTaskAttemptID` is the same in both versions, so calls to it are source-compatible but NOT binary-compatible: the opcode for calling a method on a class is [`INVOKEVIRTUAL`](http://cs.au.dk/~mis/dOvs/jvmspec/ref--35.html), while for an interface it is [`INVOKEINTERFACE`](http://cs.au.dk/~mis/dOvs/jvmspec/ref--32.html).
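
To illustrate why reflection sidesteps the opcode mismatch described above, here is a minimal Java sketch. It is not the code from this patch. `ClassStyleContext`, `InterfaceStyleContext`, `InterfaceBackedContext` and `invokeNoArg` are hypothetical names that only mimic the class-versus-interface shapes of `TaskAttemptContext`; no Hadoop types are used. Because the method is looked up by name at run time, the compiled caller contains neither an `INVOKEVIRTUAL` nor an `INVOKEINTERFACE` instruction that targets the context type.

```java
import java.lang.reflect.Method;

final class ReflectiveCallSketch {

    /** Hypothetical stand-in for a context that is a concrete class (the hadoop-1 shape). */
    public static class ClassStyleContext {
        public String getTaskAttemptID() { return "attempt_from_class"; }
    }

    /** Hypothetical stand-in for a context declared as an interface (the hadoop-2 shape). */
    public interface InterfaceStyleContext {
        String getTaskAttemptID();
    }

    /** Public implementation of the interface-shaped context. */
    public static class InterfaceBackedContext implements InterfaceStyleContext {
        @Override
        public String getTaskAttemptID() { return "attempt_from_interface"; }
    }

    /**
     * Resolves a public no-argument method by name on the target's runtime class
     * and invokes it, so the caller is not compiled against the target's type.
     */
    static Object invokeNoArg(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + methodName + " reflectively", e);
        }
    }

    public static void main(String[] args) {
        // The same call site works for both shapes.
        System.out.println(invokeNoArg(new ClassStyleContext(), "getTaskAttemptID"));
        System.out.println(invokeNoArg(new InterfaceBackedContext(), "getTaskAttemptID"));
    }
}
```

The trade-off is that the method name is no longer checked at compile time, and each call pays for a reflective lookup unless the `Method` object is cached.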




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58565195
  
    @ueshin I'm not sure I fully understand. What are the two method signatures in question, such that the code compiles but then fails at runtime? Can you perhaps include these details in a comment?
    
    @srowen are you satisfied with that explanation?




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-57741192
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/259/consoleFull) for   PR 2638 at commit [`ec213c1`](https://github.com/apache/spark/commit/ec213c160393698fa01c62469263f050f3668453).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-57729963
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21231/




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58606003
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21566/consoleFull) for   PR 2638 at commit [`efd3784`](https://github.com/apache/spark/commit/efd3784a756bd1a1b239496ed5a1c1b662c04ffa).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58603463
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21566/consoleFull) for   PR 2638 at commit [`efd3784`](https://github.com/apache/spark/commit/efd3784a756bd1a1b239496ed5a1c1b662c04ffa).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58572880
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2638




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58952072
  
    Thanks! Merged.




[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2638#issuecomment-58606008
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21566/

