Posted to reviews@spark.apache.org by szalai1 <gi...@git.apache.org> on 2017/03/26 18:16:33 UTC

[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

GitHub user szalai1 opened a pull request:

    https://github.com/apache/spark/pull/17435

    [SPARK-20098][PYSPARK] dataType's typeName fix 

    ## What changes were proposed in this pull request?
    The `typeName` classmethod has been fixed by using a type -> typeName map.
    
    ## How was this patch tested?
    local build
    
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/szalai1/spark datatype-gettype-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17435.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17435
    
----
commit 00994e922be77d686c303d3d6bc3a484478d95d7
Author: Peter Szalai <sz...@gmail.com>
Date:   2017-03-26T17:51:10Z

    getType fix

commit 933f3cb7f0b97f3bd79ea55bbd15dc27d0946c4b
Author: Peter Szalai <sz...@gmail.com>
Date:   2017-03-26T18:10:29Z

    type

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108593342
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    Btw, I don't think `i.typeName()` is a valid usage. We'd better let it throw an exception when `typeName` is called on a `StructField`.
    
    `i.dataType.typeName()` is a more reasonable call to me.
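
For illustration, a minimal sketch of the call suggested above. The classes here are hypothetical stand-ins written for this example, not the real `pyspark.sql.types` implementations; only the slicing rule in `typeName` is taken from the diff. In this sketch `StructField` is deliberately not a `DataType`, as on the Scala side.

```python
# Hypothetical stand-ins for the PySpark type classes (assumption, not the real code).
class DataType:
    @classmethod
    def typeName(cls):
        # Same slicing rule as the line removed in the quoted diff.
        return cls.__name__[:-4].lower()


class StringType(DataType):
    pass


class LongType(DataType):
    pass


class StructField:
    """A named field that holds a data type but is not itself a DataType."""

    def __init__(self, name, dataType):
        self.name = name
        self.dataType = dataType


schema = [StructField("id", LongType()), StructField("name", StringType())]

# Ask each field's data type for its name rather than the field itself.
columns = [(field.name, field.dataType.typeName()) for field in schema]
print(columns)
```

Going through `dataType` keeps the name lookup on classes whose names actually end in `Type`, which is what the slicing rule assumes.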
    





[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81511/
    Test PASSed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81584 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81584/testReport)** for PR 17435 at commit [`28f142c`](https://github.com/apache/spark/commit/28f142c13b7eeeb4f14b0fe06af423d737dd75e3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17435




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #80013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80013/testReport)** for PR 17435 at commit [`8872e19`](https://github.com/apache/spark/commit/8872e190b16b328205e0df569d5f5bc3af6c5610).




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108103801
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    It is a valid call for `DataType`s, but I don't think it is valid against `StructField`. If you look at the Scala-side code, `StructField` is not a `DataType`, but it seems its parent became `DataType` in Python for some reason in the PR I pointed out. In any case, it is described as ["`A field inside a StructType`"](https://github.com/apache/spark/blob/3694ba48f0db0f47baea4b005cdeef3f454b7329/sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructField.scala#L26).
    
    If you want to know the schema, you could simply call `simpleString()`:
    
    ```python
    >>> spark.range(1).schema[0].simpleString()
    'id:bigint'
    >>> spark.range(1).schema.simpleString()
    'struct<id:bigint>'
    ```
    
    Could you elaborate on your use case and why `simpleString` is not enough?





[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81556/testReport)** for PR 17435 at commit [`a451901`](https://github.com/apache/spark/commit/a45190160dba7f75ea26d18f2b16c703649dc541).




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108348472
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    The reason I called `typeName` is that I wanted to generate a Hive table dynamically from data, and to do this I need the type of each column.
    
    ```
    >>> sqlContext = SQLContext(sc)
    >>> df = sqlContext.read.json('path_to_a_json_doc')
    >>> cols = []
    >>> for i in df.schema:
    ...   cols.append("`" + i.name + "`" + "\t" +  i.typeName())
    >>> ",\n".join(cols)
    ```
    
    In the real case, I converted the type name to a Hive-compatible form.
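
A rough sketch of the kind of conversion described above, assuming the column names and Spark type names have already been collected as plain `(name, type_name)` pairs. The helper `hive_columns` and the `SPARK_TO_HIVE` table are hypothetical names introduced for this example; the mapping covers only a few integral types and passes everything else through unchanged, so nested types and decimal precision are not handled.

```python
# Assumed mapping from Spark type names to Hive column types (integral types only).
SPARK_TO_HIVE = {
    "byte": "tinyint",
    "short": "smallint",
    "integer": "int",
    "long": "bigint",
}


def hive_columns(fields):
    """Render (column_name, spark_type_name) pairs as Hive column definitions."""
    lines = []
    for name, type_name in fields:
        # Fall back to the Spark name when no translation is listed.
        hive_type = SPARK_TO_HIVE.get(type_name, type_name)
        lines.append("`{}`\t{}".format(name, hive_type))
    return ",\n".join(lines)


ddl = hive_columns([("id", "long"), ("name", "string"), ("age", "integer")])
print(ddl)
```

The pairs would come from something like `(field.name, field.dataType.typeName())` for each field in the schema, which avoids calling `typeName` on the field itself.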




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81556/testReport)** for PR 17435 at commit [`a451901`](https://github.com/apache/spark/commit/a45190160dba7f75ea26d18f2b16c703649dc541).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    I think we need a test and @holdenk's review.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137685629
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    +            "StructField does not have typename. \
    +            You can use self.dataType.simpleString() instead.")
    --- End diff --
    
    I'd remove `self` here and just say something like ` use typeName() on its type explicitly ...`.
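
A minimal sketch of how such an override might look with the reworded message. The classes are hypothetical stand-ins for this example, not the real `pyspark.sql.types` code, and the exact message wording is an assumption. Adjacent string literals are used instead of a backslash continuation, since a backslash inside the literal would carry the next line's indentation into the message.

```python
# Hypothetical stand-ins for the PySpark type classes (assumption, not the real code).
class DataType:
    @classmethod
    def typeName(cls):
        return cls.__name__[:-4].lower()


class LongType(DataType):
    pass


class StructField(DataType):
    def __init__(self, name, dataType):
        self.name = name
        self.dataType = dataType

    def typeName(self):
        # This instance method shadows the classmethod inherited from DataType.
        raise TypeError(
            "StructField does not have typeName. "
            "Use typeName on its type explicitly instead.")


field = StructField("id", LongType())
try:
    field.typeName()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The supported route: ask the field's data type.
field_type = field.dataType.typeName()
print(field_type)
```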




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108685381
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    Yup, I think we should still fix it, and overriding it is good enough.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Gentle ping (going through old PySpark PRs) :)




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108082197
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    Yeah, I don't think it is valid to call `typeName` against a `StructField`. Actually, `StructField` is not a data type, strictly speaking...
    
    I don't know why `StructField` inherits from `DataType` in PySpark. In Scala, it does not.
    
    Throwing an exception when calling `typeName` on `StructField` seems good enough, instead of a map like this.
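
To make the underlying problem concrete, here is a small sketch with hypothetical stand-in classes (not the real `pyspark.sql.types` code) that reuses the slicing rule from the removed line. The rule drops the last four characters of the class name, which works for names ending in `Type` but not for `StructField` once it inherits the method.

```python
# Hypothetical stand-ins for the PySpark type classes (assumption, not the real code).
class DataType:
    @classmethod
    def typeName(cls):
        # Strips the last four characters, assuming the name ends in "Type".
        return cls.__name__[:-4].lower()


class IntegerType(DataType):
    pass


class StructField(DataType):
    # Mirrors the Python-side inheritance questioned in the comment above.
    pass


print(IntegerType.typeName())  # integer
print(StructField.typeName())  # structf
```

The second result is the truncated class name rather than a meaningful type name, which is why an explicit error on `StructField` is arguably clearer than returning a string.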




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81584 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81584/testReport)** for PR 17435 at commit [`28f142c`](https://github.com/apache/spark/commit/28f142c13b7eeeb4f14b0fe06af423d737dd75e3).




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81511/testReport)** for PR 17435 at commit [`a18436d`](https://github.com/apache/spark/commit/a18436d290f7a924d1d46b9e0f14ba3ec86174c6).




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81559/testReport)** for PR 17435 at commit [`98bac12`](https://github.com/apache/spark/commit/98bac1272e79fb5114bbdf604ef0c5668136f4f9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137749545
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    +            "StructField does not have typename. \
    +            You can use self.dataType.simpleString() instead.")
    --- End diff --
    
    You mean `simpleString()` and not `typeName()`, right?




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Can one of the admins verify this patch?




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @szalai1 Could you fix tests if you're still working on this please?




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108100937
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    @viirya The way I found this bug: 
    I wanted to figure out the schema of a dataset. I loaded it into a data frame and asked for its schema. Then I called `typeName` on each column. I do not know whether this is/was the best way to do this, but I think it is valid to call `typeName` against a `dataType` to get its real type.
    





[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #78097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78097/testReport)** for PR 17435 at commit [`6bf11f0`](https://github.com/apache/spark/commit/6bf11f027a5117b82936f88918e13e89993f2108).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81556/
    Test PASSed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @holdenk I am happy to contribute to this project. I changed the error message and added a test case.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @szalai1, could you update this PR please? Otherwise, we should close this as a stale PR.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108593127
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    @szalai1 I think @HyukjinKwon's code snippets should address your request, don't they?




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137773078
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    +            "StructField does not have typeName."
    --- End diff --
    
    I am sorry. Could you add a space at the end: `...eName."` -> `...eName. "`?




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Sorry for my delay finishing the review. This looks good pending Jenkins and if it passes I'll try and merge it when I get home. (so Jenkins looks good to test. Jenkins test this please. ) :)




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Hey @szalai1 if you've got time to address the style issues, it's looking good otherwise :)




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged to master and branch-2.2.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80013/
    Test FAILed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @ueshin Sure, I'll do it next week. 




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @szalai1 I think @HyukjinKwon wanted the space on the previous line/sentence.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @szalai1, could you update this PR please? Otherwise, we should close this as a stale PR.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108437512
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    I guess it would not produce correct fields if `struct` is printed from `StructField`.
    
    ```python
    >>> from pyspark.sql import Row
    >>>
    >>> df = spark.createDataFrame([[Row(a=1), 2]])
    >>> cols = []
    >>> for i in df.schema:
    ...     cols.append("`" + i.name + "`" + "\t" + i.typeName())
    ...
    >>> print(",\n".join(cols))
    `_1`	struct,
    `_2`	struct
    ```
    
    because actual types are as below:
    
    ```python
    >>> df.schema.simpleString()
    'struct<_1:struct<a:bigint>,_2:bigint>'
    ```
    
    How about the one as below?
    
    ```python
    from pyspark.sql import Row
    
    df = spark.createDataFrame([[Row(a=1), 2]])
    cols = []
    for i in df.schema:
        cols.append("`" + i.name + "`" + "\t" + i.dataType.simpleString())
    
    print(",\n".join(cols))
    ```
    
    prints
    
    ```
    `_1`	struct<a:bigint>,
    `_2`	bigint
    ```




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Thanks for working on this. I feel like the error message could maybe be improved to suggest what the user should be doing? It would be nicer to eventually not have this depend on DataType since we don't have this in the Scala version as @HyukjinKwon pointed out, but I think this could be a good improvement for now.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108079001
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    cc @holdenk and @viirya WDYT?




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81584/
    Test PASSed.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137685731
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    +            "StructField does not have typename. \
    --- End diff --
    
    Little nit: looks like a typo, typename -> typeName.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @HyukjinKwon sure, I will do it this week. I totally forgot this. Sorry.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    ok to test




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    LGTM




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137684263
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    --- End diff --
    
    Could we do like ...
    
    ```python
    raise TypeError(
        "..."
        "...")
    ```
    if it doesn't bother you much?
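
The difference between the two styles can be seen in a small sketch. This is only an illustration: the message wording and the variable names `msg_backslash` and `msg_adjacent` are assumptions made for the example, not text taken from the patch.

```python
# Backslash continuation inside a string literal: the indentation of the
# continued source line becomes part of the message.
msg_backslash = "StructField does not have typeName. \
    Use typeName on its type explicitly instead."

# Adjacent string literals are joined by the Python compiler, so source
# indentation does not leak into the message. Nothing is inserted between
# the pieces, which is why the first literal ends with a space.
msg_adjacent = (
    "StructField does not have typeName. "
    "Use typeName on its type explicitly instead.")

print(repr(msg_backslash))
print(repr(msg_adjacent))
```

With adjacent literals the message contains exactly one space between the two sentences, provided the first literal ends with one.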




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @holdenk thanks for the reminder. I fixed the problem.





[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r137752617
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -438,6 +438,11 @@ def toInternal(self, obj):
         def fromInternal(self, obj):
             return self.dataType.fromInternal(obj)
     
    +    def typeName(self):
    +        raise TypeError(
    +            "StructField does not have typename. \
    +            You can use self.dataType.simpleString() instead.")
    --- End diff --
    
    I think it should be `typeName` on its datatype because `typeName` was called.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108683954
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    @viirya You are right. I will use it that way. Thanks @HyukjinKwon!
    
    But my bug report is still valid, I think. Can I override the `typeName` function?




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #78097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78097/testReport)** for PR 17435 at commit [`6bf11f0`](https://github.com/apache/spark/commit/6bf11f027a5117b82936f88918e13e89993f2108).




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78097/
    Test FAILed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #80013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80013/testReport)** for PR 17435 at commit [`8872e19`](https://github.com/apache/spark/commit/8872e190b16b328205e0df569d5f5bc3af6c5610).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81559/testReport)** for PR 17435 at commit [`98bac12`](https://github.com/apache/spark/commit/98bac1272e79fb5114bbdf604ef0c5668136f4f9).




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108691360
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    +1




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    **[Test build #81511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81511/testReport)** for PR 17435 at commit [`a18436d`](https://github.com/apache/spark/commit/a18436d290f7a924d1d46b9e0f14ba3ec86174c6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17435#discussion_r108078597
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -57,7 +57,25 @@ def __ne__(self, other):
     
         @classmethod
         def typeName(cls):
    -        return cls.__name__[:-4].lower()
    +        typeTypeNameMap = {"DataType": "data",
    +                           "NullType": "null",
    +                           "StringType": "string",
    +                           "BinaryType": "binary",
    +                           "BooleanType": "boolean",
    +                           "DateType": "date",
    +                           "TimestampType": "timestamp",
    +                           "DecimalType": "decimal",
    +                           "DoubleType": "double",
    +                           "FloatType": "float",
    +                           "ByteType": "byte",
    +                           "IntegerType": "integer",
    +                           "LongType": "long",
    +                           "ShortType": "short",
    +                           "ArrayType": "array",
    +                           "MapType": "map",
    +                           "StructField": "struct",
    --- End diff --
    
    It seems this problem only applies to `StructField`. Could we just override `typeName` so that it simply throws an exception? I think users are not supposed to call `typeName` on a `StructField`, but rather `simpleString` on the type instance.
    
    BTW, it seems a bit odd that it extends `DataType` at all. I guess some tests would break if we changed the parent, since, judging from https://github.com/apache/spark/pull/1598, it seems to depend on the parent. So I guess the minimal fix would be just to override.
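
A minimal sketch of the override idea, using simplified stand-in classes rather than the real `pyspark.sql.types` module. The stand-ins, the field name `_2`, and the error message wording are assumptions for illustration only; the one line taken from the diff is the `cls.__name__[:-4].lower()` slice.

```python
class DataType(object):
    """Simplified stand-in for pyspark.sql.types.DataType."""

    @classmethod
    def typeName(cls):
        # Drops the last four characters, assuming the name ends in "Type".
        return cls.__name__[:-4].lower()


class LongType(DataType):
    """Stand-in for a concrete type whose name ends in "Type"."""


class StructField(DataType):
    """Stand-in for a field: a name plus the DataType of its values."""

    def __init__(self, name, dataType):
        self.name = name
        self.dataType = dataType

    def typeName(self):
        # A field is not a type, so refuse instead of returning a mangled
        # name. The message wording here is illustrative.
        raise TypeError(
            "StructField does not have typeName. "
            "Use typeName on its type explicitly instead.")


field = StructField("_2", LongType())

# The inherited slice would strip "ield" from "StructField".
print("StructField"[:-4].lower())   # structf
# Asking the field's type works as expected.
print(field.dataType.typeName())    # long
```

In this sketch, calling `field.typeName()` raises `TypeError`, which points the caller at `field.dataType` instead of silently returning a misleading string.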





[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81559/




[GitHub] spark issue #17435: [SPARK-20098][PYSPARK] dataType's typeName fix

Posted by szalai1 <gi...@git.apache.org>.
Github user szalai1 commented on the issue:

    https://github.com/apache/spark/pull/17435
  
    @holdenk Thanks for reviewing it. I fixed the style issues. 

