Posted to reviews@spark.apache.org by zjffdu <gi...@git.apache.org> on 2015/12/15 11:03:15 UTC

[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

GitHub user zjffdu opened a pull request:

    https://github.com/apache/spark/pull/10307

    [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc
    
    Besides the issue in the Spark API, this also fixes 2 minor issues in PySpark:
    * support read from multiple input paths for orc
    * support read from multiple input paths for text

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zjffdu/spark SPARK-12334

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10307.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10307
    
----
commit c0f90bf92b5143b557c223da7b9bb799972b8655
Author: Jeff Zhang <zj...@apache.org>
Date:   2015-12-15T10:00:47Z

    [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by 3ourroom <gi...@git.apache.org>.
Github user 3ourroom commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164712854
  
    
    NAVER - http://www.naver.com/
    --------------------------------------------
    
    Your mail <Re: [spark] [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths fo… (#10307)> sent to 3ourroom@naver.com could not be delivered for the following reason.
    
    --------------------------------------------
    
    The recipient has blocked your mail.
    
    
    --------------------------------------------





[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r47727150
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -322,11 +322,11 @@ class DataFrameReader private[sql](sqlContext: SQLContext) extends Logging {
       /**
        * Loads an ORC file and returns the result as a [[DataFrame]].
    --- End diff --
    
    Thanks @bomeng, I also updated the docs for the other formats.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66591 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66591/consoleFull)** for PR 10307 at commit [`727b35a`](https://github.com/apache/spark/commit/727b35a6024adad61d89d2d515c3a1561df51cd2).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    @Donderia it seems like some of the files have changed since 1.6, so this won't apply cleanly against 1.6.
    @zjffdu if you're still working on this, can you update it against the latest master? Also, it seems like part of this was fixed in bcaa799cb01289f73e9f48526e94653a07628983, but not the ORC part.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    ping @holdenk @JoshRosen @davies 




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73647/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211831358
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56216/
    Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74265/
    Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66592/consoleFull)** for PR 10307 at commit [`6ac0580`](https://github.com/apache/spark/commit/6ac05805391f13dcd0530f1ecedbd837befcfb20).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73647 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73647/testReport)** for PR 10307 at commit [`0686453`](https://github.com/apache/spark/commit/0686453da59a6502a7c44c02210b7367c1379e4f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211854050
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged to master, thank you @zjffdu 




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by bomeng <gi...@git.apache.org>.
Github user bomeng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r47713140
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -322,11 +322,11 @@ class DataFrameReader private[sql](sqlContext: SQLContext) extends Logging {
       /**
        * Loads an ORC file and returns the result as a [[DataFrame]].
    --- End diff --
    
    The comments should be updated as well.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r104874993
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -282,6 +282,23 @@ def parquet(self, *paths):
             """
             return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
     
    +    @since(2.2)
    +    def parquet(self, path):
    --- End diff --
    
    Thanks @holdenk, I learned something new about Python. I reverted the changes to parquet. It would be very odd to change it to `def parquet(self, *paths, path=None):`, and `def parquet(self, **kwargs):` would break code that does not use keyword arguments, e.g. `parquet("p_file")`.
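
The signature trade-off described in this comment can be illustrated with a minimal sketch. The three functions below are hypothetical stand-ins written for this illustration, not the actual PySpark `DataFrameReader` methods. They only show how Python binds positional and keyword arguments under each signature. Note that a named parameter after `*paths` is keyword-only and is valid syntax in Python 3 only (PEP 3102).

```python
# Hypothetical stand-in functions; these are NOT the real PySpark methods.

def parquet_varargs(*paths):
    """Accepts any number of positional paths, like `parquet(self, *paths)`."""
    return list(paths)


def parquet_kwargs_only(**kwargs):
    """Accepts keyword arguments only, like `parquet(self, **kwargs)`."""
    return kwargs.get("path")


def parquet_mixed(*paths, path=None):
    """`path` after `*paths` is keyword-only (Python 3 syntax)."""
    return list(paths) if path is None else list(paths) + [path]


# Existing positional call sites keep working with *paths.
single = parquet_varargs("p_file")
several = parquet_varargs("a.parquet", "b.parquet")

# The **kwargs-only signature rejects the same positional call.
try:
    parquet_kwargs_only("p_file")
    positional_call_failed = False
except TypeError:
    positional_call_failed = True

# With the mixed form, `path` can only ever be supplied by keyword.
mixed = parquet_mixed("a.parquet", path="b.parquet")
```

Under these assumptions, keeping the plain `*paths` form leaves every existing positional caller untouched, which is consistent with the revert described above.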




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74096/
    Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74097/
    Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by Donderia <gi...@git.apache.org>.
Github user Donderia commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Thanks for your kind reply, and sorry for the delay in replying.
    
    I have fixed this and it is working fine.
    
    
    
    On Sat, Feb 18, 2017 at 12:41 AM, Holden Karau <no...@github.com>
    wrote:
    
    > Gentle ping @zjffdu <https://github.com/zjffdu>
    
    
    
    -- 
    
    Thanking You
    With Regards
    
    Vishal Donderia
    vishaldonderia@gmail.com
    +91-9711556310





[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66773/consoleFull)** for PR 10307 at commit [`b9e6481`](https://github.com/apache/spark/commit/b9e64815890db81d8168e4aa350b939b9b83c94e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66592/
    Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66592/consoleFull)** for PR 10307 at commit [`6ac0580`](https://github.com/apache/spark/commit/6ac05805391f13dcd0530f1ecedbd837befcfb20).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74198/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164962297
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r48002910
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
    @@ -164,6 +165,14 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
         }
       }
     
    +  test("read from multiple input paths") {
    +    val path1 = Utils.createTempDir()
    +    val path2 = Utils.createTempDir()
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path1)
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path2)
    +    assertResult(20)(read.orc(path1.getCanonicalPath, path2.getCanonicalPath).count())
    --- End diff --
    
    It will be deleted automatically when the program exits:
    ```
      /**
       * Create a temporary directory inside the given parent directory. The directory will be
       * automatically deleted when the VM shuts down.
       */
      def createTempDir(
          root: String = System.getProperty("java.io.tmpdir"),
          namePrefix: String = "spark"): File = {
        val dir = createDirectory(root, namePrefix)
        ShutdownHookManager.registerShutdownDeleteDir(dir)
        dir
      }
    ```
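
For readers less familiar with shutdown hooks, the same idea can be sketched in Python. This is a rough analogue under stated assumptions, not Spark's implementation: `create_temp_dir` and its parameters are hypothetical names chosen to mirror the Scala snippet, and the standard-library `atexit` module plays the role that `ShutdownHookManager` plays above.

```python
import atexit
import os
import shutil
import tempfile


def create_temp_dir(root=None, name_prefix="spark"):
    """Create a temp directory and register it for deletion at interpreter exit."""
    # root=None lets tempfile pick the platform default temp location.
    path = tempfile.mkdtemp(prefix=name_prefix + "-", dir=root)
    # ignore_errors=True so a directory the caller already removed is not an error.
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path


tmp = create_temp_dir()
created = os.path.isdir(tmp)
```

As with the Scala version, a test that calls such a helper does not need its own cleanup step, because the directory is removed when the process shuts down normally.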




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu closed the pull request at:

    https://github.com/apache/spark/pull/10307




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211831350
  
    **[Test build #56216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56216/consoleFull)** for PR 10307 at commit [`a8690e1`](https://github.com/apache/spark/commit/a8690e14ed54eaa239ba30dd25c328f54d955ed7).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211854014
  
    **[Test build #56219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56219/consoleFull)** for PR 10307 at commit [`2109a94`](https://github.com/apache/spark/commit/2109a94a44ce63ccf3eefa3b25938505ee37037c).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74096/testReport)** for PR 10307 at commit [`a2d35dc`](https://github.com/apache/spark/commit/a2d35dcec73f22f8dce6d57158c5eaa8de7724fe).




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164740563
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47725/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211672664
  
    @zjffdu would you want to update this against master so Jenkins can give it a run?




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73647/testReport)** for PR 10307 at commit [`0686453`](https://github.com/apache/spark/commit/0686453da59a6502a7c44c02210b7367c1379e4f).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74265/testReport)** for PR 10307 at commit [`2a5c3c6`](https://github.com/apache/spark/commit/2a5c3c60d4fc7f5b979e1ea2bcfbc71eda4da0c2).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73577/
    Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73577/testReport)** for PR 10307 at commit [`01501fa`](https://github.com/apache/spark/commit/01501fad0d11b50381852fce5a3c8739b9983d87).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73567/
    Test FAILed.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r104695304
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -282,6 +282,23 @@ def parquet(self, *paths):
             """
             return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
     
    +    @since(2.2)
    +    def parquet(self, path):
    --- End diff --
    
    Having two functions with the same name and different args doesn't behave like it does in Scala: the second definition replaces the first, so this won't work. Please use kwargs or similar and add a test for both paths and path.
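
The point about same-named functions can be seen in a minimal sketch. The class `Reader` below is a hypothetical stand-in written for illustration, not PySpark's `DataFrameReader`, and the return values are placeholders. It only shows that a second `def` with the same name rebinds the method, so the varargs version is no longer reachable.

```python
class Reader:
    """Hypothetical stand-in for a reader class; not PySpark's DataFrameReader."""

    def parquet(self, *paths):
        # First definition: accepts any number of paths.
        return ("varargs", paths)

    def parquet(self, path):
        # Second definition: rebinds the name `parquet`, discarding the first.
        return ("single", path)


reader = Reader()

# One positional argument matches the surviving single-path definition.
single = reader.parquet("a.parquet")

# Two arguments no longer work, because the varargs definition was overwritten.
try:
    reader.parquet("a.parquet", "b.parquet")
    multi_error = None
except TypeError as exc:
    multi_error = exc
```

Python resolves a method by name alone, so there is no dispatch on argument count as there is with Scala overloads.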




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-165704298
  
    +1, I have been wanting this feature for a while.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164712640
  
    **[Test build #47725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47725/consoleFull)** for PR 10307 at commit [`c0f90bf`](https://github.com/apache/spark/commit/c0f90bf92b5143b557c223da7b9bb799972b8655).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74097/testReport)** for PR 10307 at commit [`fb883c8`](https://github.com/apache/spark/commit/fb883c8cdef99dc77aac20a10242f2cb34fc1a20).




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by 3ourroom <gi...@git.apache.org>.
Github user 3ourroom commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164710262
  
    
    NAVER - http://www.naver.com/
    --------------------------------------------
    
    The mail you sent to 3ourroom@naver.com, <[spark] [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths fo… (#10307)>, could not be delivered for the following reason.
    
    --------------------------------------------
    
    The recipient has blocked your mail.
    
    
    --------------------------------------------





[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-165314680
  
    please build it again (if it works)




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Sorry for the delay in getting to this. Do you have time to update this to the latest master branch? It would be a nice small fix/improvement to get in :)




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by yanboliang <gi...@git.apache.org>.
Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r48001396
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
    @@ -164,6 +165,14 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
         }
       }
     
    +  test("read from multiple input paths") {
    +    val path1 = Utils.createTempDir()
    +    val path2 = Utils.createTempDir()
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path1)
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path2)
    +    assertResult(20)(read.orc(path1.getCanonicalPath, path2.getCanonicalPath).count())
    --- End diff --
    
    We need to remove the generated temporary files automatically; use ```withOrcFile``` or ```withTempDir```.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r103530578
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -388,16 +388,18 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
             return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
     
         @since(1.5)
    -    def orc(self, path):
    -        """Loads an ORC file, returning the result as a :class:`DataFrame`.
    +    def orc(self, paths):
    --- End diff --
    
    So if someone's been calling orc with a named param of path, this could cause them problems when they upgrade. I might be being overly cautious, but it seems like we should avoid breaking that since we don't have to until the next major version change.
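
The compatibility concern can be illustrated with a small sketch. Both functions below are hypothetical stand-ins, not the PySpark implementation: `orc_renamed` models the signature after the parameter is renamed to `paths`, and `orc_compatible` is one possible way (an assumption, not necessarily what the PR ends up doing) to keep the name `path` while accepting either a single path or a list of paths.

```python
def orc_renamed(paths):
    """Hypothetical signature after renaming the parameter to `paths`."""
    return list(paths)


def orc_compatible(path):
    """Hypothetical alternative: keep the name `path`, accept one path or several."""
    if isinstance(path, str):
        return [path]
    return list(path)


# Existing callers that pass the argument by keyword break after a rename.
try:
    orc_renamed(path="data/a.orc")
    rename_error = None
except TypeError as exc:
    rename_error = exc

# Keeping the parameter name leaves keyword callers working,
# while a list of paths is accepted as well.
single = orc_compatible(path="data/a.orc")
several = orc_compatible(["data/a.orc", "data/b.orc"])
```

Positional callers are unaffected either way; only calls of the form `orc(path=...)` depend on the parameter name.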




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r104454307
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -388,16 +388,18 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
             return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
     
         @since(1.5)
    -    def orc(self, path):
    -        """Loads an ORC file, returning the result as a :class:`DataFrame`.
    +    def orc(self, paths):
    --- End diff --
    
    We might as well make it consistent in this PR if we can do it without breaking anything.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Gentle ping @zjffdu :)




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73567/testReport)** for PR 10307 at commit [`401f682`](https://github.com/apache/spark/commit/401f6829dcadb6d0f2ce51c99520cc55dbc28995).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73569/
    Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71521/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211831056
  
    **[Test build #56216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56216/consoleFull)** for PR 10307 at commit [`a8690e1`](https://github.com/apache/spark/commit/a8690e14ed54eaa239ba30dd25c328f54d955ed7).




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164962302
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47772/
    Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73567/testReport)** for PR 10307 at commit [`401f682`](https://github.com/apache/spark/commit/401f6829dcadb6d0f2ce51c99520cc55dbc28995).




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211854056
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56219/
    Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66773/consoleFull)** for PR 10307 at commit [`b9e6481`](https://github.com/apache/spark/commit/b9e64815890db81d8168e4aa350b939b9b83c94e).




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164740433
  
    **[Test build #47725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47725/consoleFull)** for PR 10307 at commit [`c0f90bf`](https://github.com/apache/spark/commit/c0f90bf92b5143b557c223da7b9bb799972b8655).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66773/
    Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74097/testReport)** for PR 10307 at commit [`fb883c8`](https://github.com/apache/spark/commit/fb883c8cdef99dc77aac20a10242f2cb34fc1a20).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #71521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71521/consoleFull)** for PR 10307 at commit [`b9e6481`](https://github.com/apache/spark/commit/b9e64815890db81d8168e4aa350b939b9b83c94e).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    So right now we've got a mix of `path` & `paths` as the name for the arguments to the different file loading methods. This is annoying to fix in Python, but we should maybe make a JIRA so we follow up on the reader/writer interfaces next time we have a major release. Can you do that @zjffdu?
    
    Also, thank you for working on this for over a year. I'm so sorry it's taken so long to get to this.
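
A minimal sketch of why the argument name is awkward to change in Python, assuming nothing about PySpark itself: `orc_old` and `orc_renamed` are hypothetical stand-ins, not the real reader methods. They show that renaming a parameter from `path` to `paths` leaves positional callers alone but breaks anyone who passes the argument by keyword.

```python
def orc_old(path):
    """Hypothetical stand-in for a reader whose parameter is named `path`."""
    return [path] if isinstance(path, str) else list(path)


def orc_renamed(paths):
    """Hypothetical stand-in for the same reader after renaming the parameter."""
    return [paths] if isinstance(paths, str) else list(paths)


# Positional callers see no difference after the rename.
assert orc_old("/data/a.orc") == orc_renamed("/data/a.orc")

# Keyword callers written against the old name raise TypeError.
try:
    orc_renamed(path="/data/a.orc")  # made-up example path
    keyword_call_survived = True
except TypeError:
    keyword_call_survived = False

assert keyword_call_survived is False
```

This is why a rename of this kind is usually deferred to a major release, where a signature change is expected.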




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73577/testReport)** for PR 10307 at commit [`01501fa`](https://github.com/apache/spark/commit/01501fad0d11b50381852fce5a3c8739b9983d87).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74198/testReport)** for PR 10307 at commit [`6f366bf`](https://github.com/apache/spark/commit/6f366bf796fcdf9fb82745216487401da6dcb7a7).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74096/testReport)** for PR 10307 at commit [`a2d35dc`](https://github.com/apache/spark/commit/a2d35dcec73f22f8dce6d57158c5eaa8de7724fe).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r104829036
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -407,15 +424,17 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
     
         @since(1.5)
         def orc(self, path):
    -        """Loads an ORC file, returning the result as a :class:`DataFrame`.
    +        """Loads ORC files, returning the result as a :class:`DataFrame`.
    --- End diff --
    
    It is in `python/pyspark/sql/tests.py`




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r57751258
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
    @@ -164,6 +165,14 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
         }
       }
     
    +  test("read from multiple input paths") {
    +    val path1 = Utils.createTempDir()
    +    val path2 = Utils.createTempDir()
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path1)
    +    makeOrcFile((1 to 10).map(Tuple1.apply), path2)
    +    assertResult(20)(read.orc(path1.getCanonicalPath, path2.getCanonicalPath).count())
    --- End diff --
    
    Files created with `withOrcFile` will get cleaned up sooner (i.e. as soon as the test ends rather than at program exit).
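
The difference between the two cleanup scopes can be illustrated with a small Python analogue. This is not Spark's `withOrcFile` helper; it only uses the standard library's `tempfile.TemporaryDirectory` to show a directory that is removed as soon as the enclosing block exits, which is the behaviour the suggestion is after. The file name `part-00000` is a made-up example.

```python
import os
import tempfile

# Scoped cleanup: the directory exists only inside the `with` block.
with tempfile.TemporaryDirectory() as scoped_dir:
    data_file = os.path.join(scoped_dir, "part-00000")
    with open(data_file, "w") as f:
        f.write("1\n")
    existed_inside = os.path.exists(data_file)

# Once the block has exited, the directory and its contents are gone.
removed_after = not os.path.exists(scoped_dir)

assert existed_inside
assert removed_after
```

A directory that is only registered for deletion at process exit would still be on disk at the point where `removed_after` is computed.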




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #73569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73569/testReport)** for PR 10307 at commit [`e425438`](https://github.com/apache/spark/commit/e4254389a46e297bf89a45a07d85cd565ba6343e).




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r104695461
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -407,15 +424,17 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
     
         @since(1.5)
         def orc(self, path):
    -        """Loads an ORC file, returning the result as a :class:`DataFrame`.
    +        """Loads ORC files, returning the result as a :class:`DataFrame`.
    --- End diff --
    
    Maybe add a test for loading with a list of orc files.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211850792
  
    **[Test build #56219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56219/consoleFull)** for PR 10307 at commit [`2109a94`](https://github.com/apache/spark/commit/2109a94a44ce63ccf3eefa3b25938505ee37037c).




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #71521 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71521/consoleFull)** for PR 10307 at commit [`b9e6481`](https://github.com/apache/spark/commit/b9e64815890db81d8168e4aa350b939b9b83c94e).
     * This patch **fails PySpark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74198/testReport)** for PR 10307 at commit [`6f366bf`](https://github.com/apache/spark/commit/6f366bf796fcdf9fb82745216487401da6dcb7a7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10307




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-211831355
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10307#discussion_r103600310
  
    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -388,16 +388,18 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
             return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
     
         @since(1.5)
    -    def orc(self, path):
    -        """Loads an ORC file, returning the result as a :class:`DataFrame`.
    +    def orc(self, paths):
    --- End diff --
    
    Good catch, I should not break compatibility. BTW, I found that `DataFrameReader.parquet` uses a variable-length argument, which is inconsistent with the other file formats such as text, json and orc, which take a string or a list of strings. I can fix this in this PR, or do it in another PR, to make them consistent. What do you think?
    
    ```
    @since(1.4)
    def parquet(self, *paths):
        """Loads Parquet files, returning the result as a :class:`DataFrame`."""
    ```
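
To make the inconsistency concrete, here is a minimal sketch of the two calling conventions in plain Python. `read_varargs` and `read_str_or_list` are hypothetical stand-ins rather than PySpark functions, and the paths are made up; the sketch only shows how each signature turns its input into a list of paths.

```python
def read_varargs(*paths):
    """Parquet-style signature: every path is a separate positional argument."""
    return list(paths)


def read_str_or_list(path):
    """text/json/orc-style signature: one string or a list of strings."""
    if isinstance(path, str):
        # Wrap a single path so the result is always a list.
        return [path]
    return list(path)


files = ["/data/a.orc", "/data/b.orc"]  # made-up example paths

# Both conventions can express the same request, but a list has to be
# unpacked with * for the varargs form.
assert read_varargs(*files) == read_str_or_list(files)

# Passing the list without unpacking nests it instead of spreading it.
assert read_varargs(files) == [files]
```

The last assertion is the practical cost of the mismatch: code that works for one reader method silently builds a different argument for another.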




[GitHub] spark pull request #10307: [SPARK-12334][SQL][PYSPARK] Support read from mul...

Posted by zjffdu <gi...@git.apache.org>.
GitHub user zjffdu reopened a pull request:

    https://github.com/apache/spark/pull/10307

    [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc

    
    
    Besides the issue in the Spark API, this also fixes 2 minor issues in PySpark:
    * support reading from multiple input paths for orc
    * support reading from multiple input paths for text

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zjffdu/spark SPARK-12334

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10307.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10307
    
----
commit 3dd3452236156bb7ef36e9d290217e23556f6b6e
Author: Jeff Zhang <zj...@apache.org>
Date:   2015-12-15T10:00:47Z

    [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc

commit b6a26e946fcf4331fb62382537ce2b0964a5b90e
Author: Jeff Zhang <zj...@apache.org>
Date:   2015-12-16T01:41:29Z

    Update doc

commit 24a8f4f70cb9da2d83a39836e8517b55a9238e70
Author: Jeff Zhang <zj...@apache.org>
Date:   2016-04-19T10:36:24Z

    address code style

commit 6ac05805391f13dcd0530f1ecedbd837befcfb20
Author: Jeff Zhang <zj...@apache.org>
Date:   2016-10-09T03:53:41Z

    resolve conflicts

----




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    So this still doesn't merge with master; if you want to update it, it would be good to take a look :)




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #66591 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66591/consoleFull)** for PR 10307 at commit [`727b35a`](https://github.com/apache/spark/commit/727b35a6024adad61d89d2d515c3a1561df51cd2).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12334][SQL][PYSPARK] Support read from ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10307#issuecomment-164740560
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Gentle ping @zjffdu




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    **[Test build #74265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74265/testReport)** for PR 10307 at commit [`2a5c3c6`](https://github.com/apache/spark/commit/2a5c3c60d4fc7f5b979e1ea2bcfbc71eda4da0c2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by Donderia <gi...@git.apache.org>.
Github user Donderia commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    I am trying to apply this patch on the 1.6 branch, but the patch failed:
    {code}
    Applying: Support read from multiple input paths for orc file in DataFrameReader.orc
    error: patch failed: python/pyspark/sql/readwriter.py:240
    error: python/pyspark/sql/readwriter.py: patch does not apply
    error: patch failed: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala:388
    error: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala: patch does not apply
    Patch failed at 0001 Support read from multiple input paths for orc file in DataFrameReader.orc
    The copy of the patch that failed is found in:
       /Users/vishaldonderia/Mobileum/Spark/spark/.git/rebase-apply/patch
    When you have resolved this problem, run "git am --continue".
    If you prefer to skip this patch, run "git am --skip" instead.
    To restore the original branch and stop patching, run "git am --abort".
    {code}
    Am I missing something?




[GitHub] spark issue #10307: [SPARK-12334][SQL][PYSPARK] Support read from multiple i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/10307
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66591/

