Posted to reviews@spark.apache.org by markgrover <gi...@git.apache.org> on 2015/12/10 08:27:19 UTC

[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

GitHub user markgrover opened a pull request:

    https://github.com/apache/spark/pull/10248

    Small doc change on how to use dynamic partitioning in Spark SQL

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markgrover/spark spark_sql_doc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10248
    
----
commit 86847064a8ac17cbc00a59b7b022fc1aa9c74a6d
Author: Mark Grover <ma...@apache.org>
Date:   2015-12-10T07:25:26Z

    Small doc change on how to use dynamic partitioning in Spark SQL

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163526432
  
    **[Test build #47492 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47492/consoleFull)** for PR 10248 at commit [`8684706`](https://github.com/apache/spark/commit/86847064a8ac17cbc00a59b7b022fc1aa9c74a6d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `case class LambdaVariable(value: String, isNull: String, dataType: DataType) extends LeafExpression`




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-164137388
  
    Does this (also) belong in the scaladoc somewhere? To make this fit better: it's implicit that you use the same Hive syntax, so that part can be skipped. Maybe just parenthetically mention the setting you must make to use it? I know this is trivial, so I don't mind merging as is either; I'm just trying to make it fit a bit better.




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163526528
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47492/




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-164943903
  
    There isn't really an API where this gets called from; it's simply how you'd write that SQL query, so I don't know of a good place to add it in the scaladoc.
    
    Thanks again for reviewing, Sean.




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-165335580
  
    Ok, thanks Yin. I was just trying to be helpful. I feel like we have spent
    enough collective cycles on this, and if it's not a no-brainer, it's
    probably not worth it :-)
    Thanks Yin and Sean for your feedback, I will close the PR.
    
    On Wed, Dec 16, 2015 at 8:27 PM, Yin Huai <no...@github.com> wrote:
    
    > I feel it can be confusing to users because users can insert into a Hive
    > partitioned table and they can also create partitioned tables backed by
    > data source API. For data source API backed tables, there is no need to use
    > this conf. For inserting into a Hive table, users can set it to either
    > strict or nonstrict mode. I am not sure we need to add this change.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/10248#issuecomment-165334174>.
    >





[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-164579371
  
    @yhuai @liancheng 




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-165296868
  
    What do you mean by mentioning `SET hive.exec.dynamic.partition.mode=nonstrict`?




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-165334174
  
    I feel it can be confusing to users, because users can insert into a Hive partitioned table and they can also create partitioned tables backed by the data source API. For data source API backed tables, there is no need to use this conf. For inserting into a Hive table, users can set it to either strict or nonstrict mode. I am not sure we need to add this change.
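
To make the strict/nonstrict distinction concrete, here is a small Python sketch of the rule as Hive documents it: in strict mode an insert that uses dynamic partitions needs at least one static partition value, while nonstrict mode lets every partition column be dynamic. The function `insert_allowed` and the partition specs are hypothetical and only model that rule; they are not Spark or Hive code, and the rule is only relevant to Hive tables, not to tables backed by the data source API.

```python
def insert_allowed(partition_spec, mode="strict"):
    """Model the dynamic partition mode check for an INSERT into a Hive table.

    partition_spec maps each partition column to its static value, or to
    None when the column is dynamic (its value comes from the SELECT).
    """
    if mode not in ("strict", "nonstrict"):
        raise ValueError("mode must be 'strict' or 'nonstrict'")
    dynamic = [col for col, value in partition_spec.items() if value is None]
    if not dynamic:
        return True  # no dynamic partitions, so the mode does not matter
    if mode == "nonstrict":
        return True  # every partition column may be dynamic
    # strict: at least one partition column must carry a static value
    return len(dynamic) < len(partition_spec)


# PARTITION (country='US', dt): one static column, accepted in strict mode.
print(insert_allowed({"country": "US", "dt": None}))
# PARTITION (country, dt): fully dynamic, rejected unless mode is nonstrict.
print(insert_allowed({"country": None, "dt": None}))
print(insert_allowed({"country": None, "dt": None}, mode="nonstrict"))
```

The sketch leaves out other constraints (for example, ordering of static and dynamic columns) and is meant only to show why the setting matters for some Hive inserts and not others.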




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover closed the pull request at:

    https://github.com/apache/spark/pull/10248




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163805479
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47558/




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163801659
  
    **[Test build #47558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47558/consoleFull)** for PR 10248 at commit [`d299637`](https://github.com/apache/spark/commit/d29963755a0a434872364089206820df7f123bc2).




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-165333670
  
    In Hive or in Spark SQL, as I understand it, the default partitioning mode
    is strict, which means at least one of the partitions being inserted into
    has to be statically specified. To insert with only dynamic partitions,
    this mode has to be changed beforehand. In Spark SQL, that means you have
    to issue a query like
    sqlContext.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    before running the actual SQL statement.
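
A minimal sketch of the sequence described above, written in Python since `sqlContext.sql(...)` reads the same there. The table and column names (`events`, `staging_events`, `id`, `payload`, `dt`) and the helper `dynamic_partition_statements` are hypothetical; only the `SET` statement comes from this thread. The helper merely builds the statements in the order they would be issued, so it can be checked without a Spark installation.

```python
def dynamic_partition_statements(target, source, data_cols, partition_cols):
    """Return the SQL statements for a fully dynamic partition insert, in order.

    All table and column names passed in are placeholders for illustration.
    """
    # Hive expects the dynamic partition columns last in the SELECT list,
    # in the same order as they appear in the PARTITION clause.
    select_list = ", ".join(list(data_cols) + list(partition_cols))
    partition_spec = ", ".join(partition_cols)
    return [
        # Must run first: strict mode rejects an insert with no static partition.
        "SET hive.exec.dynamic.partition.mode=nonstrict",
        "INSERT INTO TABLE {0} PARTITION ({1}) SELECT {2} FROM {3}".format(
            target, partition_spec, select_list, source),
    ]


statements = dynamic_partition_statements(
    target="events", source="staging_events",
    data_cols=["id", "payload"], partition_cols=["dt"])

for statement in statements:
    print(statement)
    # With a Hive-enabled context available (an assumption here), each
    # statement would be issued as: sqlContext.sql(statement)
```

Keeping the `SET` statement in the same list as the insert makes the ordering explicit, which is the point of the note being proposed for the docs.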
    
    Does that answer your question?
    
    On Wed, Dec 16, 2015 at 4:16 PM, Yin Huai <no...@github.com> wrote:
    
    > What do you mean by mentioning SET
    > hive.exec.dynamic.partition.mode=nonstrict?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/10248#issuecomment-165296868>.
    >





[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163799469
  
    Sure thing, I have updated it to be a part of the previous bullet. I'm open to making a separate heading too, but I think it may be overkill since there would be only one sentence under that heading. Thanks for reviewing, Sean!




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163805397
  
    **[Test build #47558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47558/consoleFull)** for PR 10248 at commit [`d299637`](https://github.com/apache/spark/commit/d29963755a0a434872364089206820df7f123bc2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163805475
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163524116
  
    **[Test build #47492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47492/consoleFull)** for PR 10248 at commit [`8684706`](https://github.com/apache/spark/commit/86847064a8ac17cbc00a59b7b022fc1aa9c74a6d).




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10248#issuecomment-163526526
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: Small doc change on how to use dynamic partiti...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10248#discussion_r47221312
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -2430,6 +2430,7 @@ Spark SQL supports the vast majority of Hive features, such as:
     * Sampling
     * Explain
     * Partitioned tables including dynamic partition insertion
    +  * Simply use the same syntax in Hive, after running `SET hive.exec.dynamic.partition.mode=nonstrict`
    --- End diff --
    
    This isn't quite a feature like the other elements in the list. Can it be a shorter elaboration on the previous point? Or is this better documented elsewhere, rather than in what is effectively the table of contents?

