Posted to reviews@spark.apache.org by mengxr <gi...@git.apache.org> on 2014/10/08 02:24:38 UTC

[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/2701

    [WIP][SPARK-3569][SQL] Add metadata field to StructField

    Add `metadata` to `StructField`, which is a map that can store information about the column. The metadata is preserved through simple operations like `SELECT`. This PR doesn't handle Ser/De of schema.
    
    Questions:
    1. Should we use a mutable map or immutable map for metadata?
    2. I put the tests in `SQLQuerySuite`. Is it the right place?
    
    @marmbrus @liancheng
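
As a rough illustration of the behavior described above, the sketch below models a schema as a list of fields, each carrying an immutable metadata map, and shows a projection that reuses the selected fields so their metadata is kept. `Field`, `Schema`, and `select` are hypothetical stand-ins written for this example; they are not Spark's `StructField`, `StructType`, or query planner.

```scala
// Hypothetical stand-ins for illustration only; not Spark classes.
final case class Field(
    name: String,
    dataType: String,
    nullable: Boolean,
    metadata: Map[String, Any] = Map.empty)

final case class Schema(fields: Seq[Field]) {
  // A projection that only picks columns reuses the existing Field values,
  // so whatever metadata they carry comes along unchanged.
  def select(names: String*): Schema =
    Schema(names.map { n =>
      fields.find(_.name == n).getOrElse(
        throw new NoSuchElementException(s"no such column: $n"))
    })
}

object SelectKeepsMetadata {
  val people: Schema = Schema(Seq(
    Field("id", "long", nullable = false),
    Field("age", "int", nullable = true, metadata = Map("doc" -> "age in years"))))

  val projected: Schema = people.select("age")

  def main(args: Array[String]): Unit =
    println(projected.fields.head.metadata)
}
```

An expression that changes the column's content (for example an arithmetic expression) would produce a new field, and whether metadata should survive in that case is a separate design decision that this sketch does not model.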

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark structfield-metadata

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2701.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2701
    
----
commit c194d5e4827955427b1ffac5bb582b994ae20cac
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-09-17T21:41:49Z

    add metadata field to StructField and Attribute

commit 367d237b3d5e445a67e6a8b9c9ae79abff26a045
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-10-07T20:17:02Z

    add test

commit d65072e483da9fd2dbd4999a4976befe2ce054d1
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-10-07T21:46:48Z

    remove Map.empty

commit 67fdebb7412484c6674052d2ac6c94573e846ce5
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-10-07T22:06:12Z

    add test on join

commit d8af0edc105e767e22d3d2696587a502741b9416
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-10-08T00:19:33Z

    move tests to SQLQuerySuite

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59277023
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21772/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59859828
  
    I looked this over and only had a few minor comments.  Also, can you update the PR description?  I think it's pretty out of date.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r18868688
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/Metadata.scala ---
    @@ -0,0 +1,252 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.util
    +
    +import scala.collection.mutable
    +
    +import org.json4s._
    +import org.json4s.jackson.JsonMethods._
    +
    +/**
    + * Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean,
    + * Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and
    + * Array[Metadata]. JSON is used for serialization.
    + *
    + * The default constructor is private. User should use either [[MetadataBuilder]] or
    + * [[Metadata$#fromJson]] to create Metadata instances.
    + *
    + * @param map an immutable map that stores the data
    + */
    +sealed class Metadata private[util] (private[util] val map: Map[String, Any]) extends Serializable {
    +
    +  /** Gets a Long. */
    +  def getLong(key: String): Long = get(key)
    +
    +  /** Gets a Double. */
    +  def getDouble(key: String): Double = get(key)
    +
    +  /** Gets a Boolean. */
    +  def getBoolean(key: String): Boolean = get(key)
    +
    +  /** Gets a String. */
    +  def getString(key: String): String = get(key)
    +
    +  /** Gets a Metadata. */
    +  def getMetadata(key: String): Metadata = get(key)
    +
    +  /** Gets a Long array. */
    +  def getLongArray(key: String): Array[Long] = get(key)
    +
    +  /** Gets a Double array. */
    +  def getDoubleArray(key: String): Array[Double] = get(key)
    +
    +  /** Gets a Boolean array. */
    +  def getBooleanArray(key: String): Array[Boolean] = get(key)
    +
    +  /** Gets a String array. */
    +  def getStringArray(key: String): Array[String] = get(key)
    +
    +  /** Gets a Metadata array. */
    +  def getMetadataArray(key: String): Array[Metadata] = get(key)
    +
    +  /** Converts to its JSON representation. */
    +  def json: String = compact(render(jsonValue))
    +
    +  override def toString: String = json
    +
    +  override def equals(obj: Any): Boolean = {
    +    obj match {
    +      case that: Metadata =>
    +        if (map.keySet == that.map.keySet) {
    +          map.keys.forall { k =>
    +            (map(k), that.map(k)) match {
    +              case (v0: Array[_], v1: Array[_]) =>
    +                v0.view == v1.view
    +              case (v0, v1) =>
    +                v0 == v1
    +            }
    +          }
    +        } else {
    +          false
    +        }
    +      case other =>
    +        false
    +    }
    +  }
    +
    +  override def hashCode: Int = Metadata.hash(this)
    +
    +  private def get[T](key: String): T = {
    +    map(key).asInstanceOf[T]
    +  }
    +
    +  private[sql] def jsonValue: JValue = Metadata.toJsonValue(this)
    +}
    +
    +object Metadata {
    +
    +  /** Returns an empty Metadata. */
    +  def empty: Metadata = new Metadata(Map.empty)
    +
    +  /** Creates a Metadata instance from JSON. */
    +  def fromJson(json: String): Metadata = {
    +    fromJObject(parse(json).asInstanceOf[JObject])
    +  }
    +
    +  /** Creates a Metadata instance from JSON AST. */
    +  private[sql] def fromJObject(jObj: JObject): Metadata = {
    +    val builder = new MetadataBuilder
    +    jObj.obj.foreach {
    +      case (key, JInt(value)) =>
    +        builder.putLong(key, value.toLong)
    +      case (key, JDouble(value)) =>
    +        builder.putDouble(key, value)
    +      case (key, JBool(value)) =>
    +        builder.putBoolean(key, value)
    +      case (key, JString(value)) =>
    +        builder.putString(key, value)
    +      case (key, o: JObject) =>
    +        builder.putMetadata(key, fromJObject(o))
    +      case (key, JArray(value)) =>
    +        if (value.isEmpty) {
    +          // If it is an empty array, we cannot infer its element type. We put an empty Array[Long].
    +          builder.putLongArray(key, Array.empty)
    +        } else {
    +          value.head match {
    +            case _: JInt =>
    +              builder.putLongArray(key, value.asInstanceOf[List[JInt]].map(_.num.toLong).toArray)
    +            case _: JDouble =>
    +              builder.putDoubleArray(key, value.asInstanceOf[List[JDouble]].map(_.num).toArray)
    +            case _: JBool =>
    +              builder.putBooleanArray(key, value.asInstanceOf[List[JBool]].map(_.value).toArray)
    +            case _: JString =>
    +              builder.putStringArray(key, value.asInstanceOf[List[JString]].map(_.s).toArray)
    +            case _: JObject =>
    +              builder.putMetadataArray(
    +                key, value.asInstanceOf[List[JObject]].map(fromJObject).toArray)
    +            case other =>
    +              throw new RuntimeException(s"Do not support array of type ${other.getClass}.")
    +          }
    +        }
    +      case other =>
    +        throw new RuntimeException(s"Do not support type ${other.getClass}.")
    +    }
    +    builder.build()
    +  }
    +
    +  /** Converts to JSON AST. */
    +  private def toJsonValue(obj: Any): JValue = {
    +    obj match {
    +      case map: Map[_, _] =>
    +        val fields = map.toList.map { case (k: String, v) => (k, toJsonValue(v))}
    --- End diff --
    
    Space before `}`
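
One detail in the quoted `equals` is worth spelling out: Scala arrays are JVM arrays, so `==` on two arrays compares references rather than contents, which is why array values get their own case. The sketch below shows the same idea with a hypothetical `Meta` wrapper. It compares arrays as sequences (`toSeq`) rather than through `view` as the diff does, and it derives `hashCode` from the same contents so that equal instances hash alike. It is an illustration under those assumptions, not the code from the PR.

```scala
// Hypothetical miniature of a map-backed metadata wrapper; not the PR's class.
final class Meta(val map: Map[String, Any]) {

  override def equals(obj: Any): Boolean = obj match {
    case that: Meta =>
      map.keySet == that.map.keySet && map.keys.forall { k =>
        (map(k), that.map(k)) match {
          // Arrays need an element-wise comparison; == would compare references.
          case (a: Array[_], b: Array[_]) => a.toSeq == b.toSeq
          case (a, b) => a == b
        }
      }
    case _ => false
  }

  // Hash array values by content so hashCode stays consistent with equals.
  override def hashCode: Int = map.map {
    case (k, v: Array[_]) => (k, v.toSeq)
    case (k, v) => (k, v)
  }.hashCode
}

object ArrayEqualityDemo {
  val a = new Meta(Map("tags" -> Array("x", "y"), "n" -> 1L))
  val b = new Meta(Map("tags" -> Array("x", "y"), "n" -> 1L))
  val c = new Meta(Map("tags" -> Array("x", "z"), "n" -> 1L))

  def main(args: Array[String]): Unit = {
    println(a == b) // same keys, same array contents
    println(a == c) // array contents differ
  }
}
```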




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61162403
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22543/consoleFull) for   PR 2701 at commit [`c35203f`](https://github.com/apache/spark/commit/c35203f2d7ec25b148aa9800f96486911d46227c).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `
      * `class Metadata extends org.apache.spark.sql.catalyst.util.Metadata `
      * `public class MetadataBuilder extends org.apache.spark.sql.catalyst.util.MetadataBuilder `





[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58291342
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21435/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60535338
  
      [Test build #465 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/465/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60333798
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22098/
    Test PASSed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59453468
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21825/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61150699
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22543/consoleFull) for   PR 2701 at commit [`c35203f`](https://github.com/apache/spark/commit/c35203f2d7ec25b148aa9800f96486911d46227c).
     * This patch **does not merge cleanly**.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59888060
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21967/consoleFull) for   PR 2701 at commit [`611d3c2`](https://github.com/apache/spark/commit/611d3c20cf4aed9927b596d89b9ac96b2cbbcdec).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60538710
  
      [Test build #465 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/465/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59277013
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21772/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59446362
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21825/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61384627
  
    Thanks!  Merged to master.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60475420
  
      [Test build #443 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/443/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59316897
  
    test this please




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59317274
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21793/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch merges cleanly.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58291338
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21435/consoleFull) for   PR 2701 at commit [`c41a664`](https://github.com/apache/spark/commit/c41a664f902d686c4c9ff5007dd080f51523f484).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r18679001
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -377,24 +378,37 @@ case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataT
      * @param name The name of this field.
      * @param dataType The data type of this field.
      * @param nullable Indicates if values of this field can be `null` values.
    + * @param metadata The metadata of this field, which is a map from string to simple type that can be
    + *                 serialized to JSON automatically. The metadata should be preserved during
    + *                 transformation if the content of the column is not modified, e.g, in selection.
      */
    -case class StructField(name: String, dataType: DataType, nullable: Boolean) {
    +case class StructField(
    +    name: String,
    +    dataType: DataType,
    +    nullable: Boolean,
    +    metadata: Map[String, Any] = Map.empty) {
     
       private[sql] def buildFormattedString(prefix: String, builder: StringBuilder): Unit = {
         builder.append(s"$prefix-- $name: ${dataType.typeName} (nullable = $nullable)\n")
         DataType.buildFormattedString(dataType, s"$prefix    |", builder)
       }
     
    +  override def toString: String = {
    +    // Do not add metadata to be consistent with CaseClassStringParser.
    --- End diff --
    
    I'm not sure if we need to override here. The CaseClassStringParser is only for reading legacy Parquet files.  I think it is okay if old versions of Spark don't handle types exactly when reading data written by new versions.
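
For context on why the diff overrides `toString` at all: a case class's generated `toString` prints every constructor field, so adding a fourth field changes the string form even when the new field is left at its default. The sketch below demonstrates this with hypothetical `FieldV1`, `FieldV2`, and `FieldV2Compat` classes, with plain strings standing in for `DataType`. It does not reproduce the parser or the classes from the PR.

```scala
// Hypothetical before/after shapes of a field definition; not Spark classes.
final case class FieldV1(name: String, dataType: String, nullable: Boolean)

final case class FieldV2(
    name: String,
    dataType: String,
    nullable: Boolean,
    metadata: Map[String, Any] = Map.empty)

// Same shape as FieldV2, but toString keeps a three-field form.
final case class FieldV2Compat(
    name: String,
    dataType: String,
    nullable: Boolean,
    metadata: Map[String, Any] = Map.empty) {
  override def toString: String = s"FieldV2Compat($name,$dataType,$nullable)"
}

object ToStringDemo {
  def main(args: Array[String]): Unit = {
    println(FieldV1("a", "int", nullable = true))
    println(FieldV2("a", "int", nullable = true))       // also prints the metadata map
    println(FieldV2Compat("a", "int", nullable = true)) // metadata omitted
  }
}
```

Dropping the override, as suggested in the comment, would correspond to the `FieldV2` behavior: simpler code, at the cost of a string form that a parser written for three fields may not accept.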




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60538403
  
    @mengxr, thanks for working on this!  Overall LGTM.  One minor thing: I think we should expose Metadata as a type variable in the `org.apache.spark.sql` package and as a subclass in `org.apache.spark.sql.api.java`.  Catalyst is considered a Private API so we explicitly expose the parts of it in sql that we expect users to use.
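
A minimal sketch of the kind of re-export suggested here, assuming it follows the common Scala idiom of a type alias plus a value alias for the companion object. The nested objects `internal` and `api` are hypothetical stand-ins for the private Catalyst package and the public `org.apache.spark.sql` package; the Java-side subclass mentioned in the comment is not shown.

```scala
// `internal` stands in for a private implementation package (hypothetical).
object internal {
  final class Metadata(val map: Map[String, Any]) {
    def getString(key: String): String = map(key).asInstanceOf[String]
  }
  object Metadata {
    val empty: Metadata = new Metadata(Map.empty)
  }
}

// `api` stands in for the public package that users import from (hypothetical).
object api {
  // The type alias lets users write `api.Metadata` in type positions,
  type Metadata = internal.Metadata
  // and the value alias exposes the companion's members under the same name.
  val Metadata: internal.Metadata.type = internal.Metadata
}

object AliasDemo {
  // User code mentions only `api`, never `internal`.
  val m: api.Metadata = api.Metadata.empty

  def main(args: Array[String]): Unit =
    println(m.map.isEmpty)
}
```

Because both aliases point at the same class and companion, values created through either name are interchangeable, and the implementation can stay in the private package.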




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58302975
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21440/consoleFull) for   PR 2701 at commit [`7e5a322`](https://github.com/apache/spark/commit/7e5a322eec77b3228f60f74ff8f573324b82c67c).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`





[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58295008
  
    I think using an immutable Map for metadata is enough. We can add an API like `.transformMetadata(f: Map[String, Any] => Map[String, Any])` to alter metadata as needed.
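
A minimal sketch of the immutable-map idea, written in Python for illustration only (the PR itself is Scala). `Field` and `transform_metadata` are hypothetical names, not the PR's API, and `types.MappingProxyType` is assumed here as a stand-in for an immutable `Map`.

```python
from dataclasses import dataclass, field, replace
from types import MappingProxyType
from typing import Any, Callable, Mapping


def _empty() -> Mapping[str, Any]:
    # Read-only view over a fresh dict; stands in for an immutable Map.empty.
    return MappingProxyType({})


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for StructField with read-only metadata."""
    name: str
    metadata: Mapping[str, Any] = field(default_factory=_empty)

    def transform_metadata(
        self, f: Callable[[Mapping[str, Any]], Mapping[str, Any]]
    ) -> "Field":
        # Build a new Field from f's result; the receiver is left untouched.
        return replace(self, metadata=MappingProxyType(dict(f(self.metadata))))


age = Field("age")
tagged = age.transform_metadata(lambda m: {**m, "unit": "years"})
```

Here `age` keeps its empty metadata while `tagged` carries the new entry, and assigning into `tagged.metadata` raises `TypeError`, so all changes have to go through the transform function.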




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59121002
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21738/
    Test FAILed.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r18560809
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -348,8 +356,8 @@ case class StructType(fields: Seq[StructField]) extends DataType {
        * have a name matching the given name, `null` will be returned.
        */
       def apply(name: String): StructField = {
    -    nameToField.get(name).getOrElse(
    -      throw new IllegalArgumentException(s"Field ${name} does not exist."))
    +    nameToField.getOrElse(name,
    +      throw new IllegalArgumentException(s"Field $name does not exist."))
    --- End diff --
    
    Nit: no need to wrap




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59134110
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21748/consoleFull) for   PR 2701 at commit [`c9d7301`](https://github.com/apache/spark/commit/c9d7301b5e395b8e51030f1be17e35c05de23b7a).
     * This patch **does not merge cleanly**.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61356068
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22671/
    Test PASSed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60537553
  
      [Test build #458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/458/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60863050
  
    Here's a PR to fix the package visibility.  If that looks good to you I think this is ready to merge: https://github.com/mengxr/spark/pull/1




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r19314368
  
    --- Diff: python/pyspark/sql.py ---
    @@ -305,12 +305,15 @@ class StructField(DataType):
     
         """
     
    -    def __init__(self, name, dataType, nullable):
    +    def __init__(self, name, dataType, nullable, metadata={}):
    --- End diff --
    
    Using `{}` as the default value has side effects, such as:
    ```
    >>> a = StructField('a', StringType(), True)
    >>> b = StructField('b', StringType(), True)
    >>> a.metadata['name'] = 'a'
    >>> b.metadata['name']
    'a'
    ```
    
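    So if the metadata could be modified somewhere, you should use `None` as the default value here.
    ```
    def __init__(self, name, dataType, nullable, metadata=None):
        ...
        # `or {}` gives each instance its own dict when no metadata is passed.
        self.metadata = metadata or {}
    ```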
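
The pitfall can be reproduced without Spark. The sketch below uses two hypothetical classes, `SharedDefault` and `FreshDefault` (neither is PySpark's `StructField`), to contrast a `{}` default with a `None` default.

```python
class SharedDefault:
    # The {} literal is evaluated once, when the function is defined,
    # so every call that omits `metadata` receives the same dict object.
    def __init__(self, name, metadata={}):
        self.name = name
        self.metadata = metadata


class FreshDefault:
    # None is a sentinel; a new dict is created per instance when needed.
    def __init__(self, name, metadata=None):
        self.name = name
        self.metadata = {} if metadata is None else metadata


a, b = SharedDefault("a"), SharedDefault("b")
a.metadata["name"] = "a"
shared_leak = b.metadata.get("name")  # "a": b sees a's write

c, d = FreshDefault("c"), FreshDefault("d")
c.metadata["name"] = "c"
fresh_leak = d.metadata.get("name")  # None: d has its own dict
```

The `is None` test and the `metadata or {}` form differ in one case: with `or`, an empty dict passed in by the caller is replaced by a new one, so later writes are not visible through the caller's reference.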




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59322320
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21793/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58304563
  
    #2563 has already replaced `str(...)` with `.json()`. You can add Python metadata ser/de by modifying `.jsonValue()`.
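
A rough sketch of what metadata-aware ser/de could look like on the Python side. `Field`, `json_value`, and `from_json` are illustrative names, and the data type is assumed to be carried as a plain string; this is not the code from #2563 or from this PR.

```python
import json


class Field:
    """Hypothetical, simplified stand-in for pyspark's StructField."""

    def __init__(self, name, data_type, nullable, metadata=None):
        self.name = name
        self.data_type = data_type  # assumed to be a plain type-name string
        self.nullable = nullable
        self.metadata = dict(metadata or {})

    def json_value(self):
        # Dict form of the field; metadata is emitted alongside the other keys.
        return {
            "name": self.name,
            "type": self.data_type,
            "nullable": self.nullable,
            "metadata": self.metadata,
        }

    def json(self):
        # Compact, key-sorted encoding so equal fields serialize identically.
        return json.dumps(self.json_value(), separators=(",", ":"), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        obj = json.loads(text)
        # Treat a missing "metadata" key as empty so older schemas still load.
        return cls(obj["name"], obj["type"], obj["nullable"], obj.get("metadata"))


f = Field("age", "integer", True, {"unit": "years", "bins": [0, 18, 65]})
restored = Field.from_json(f.json())
```

Keeping the dict-producing method separate from the string encoder means a nested schema can embed the field's dict directly and encode only once at the top level.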




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58299197
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21440/consoleFull) for   PR 2701 at commit [`7e5a322`](https://github.com/apache/spark/commit/7e5a322eec77b3228f60f74ff8f573324b82c67c).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60333792
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22098/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r18872439
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/Metadata.scala ---
    @@ -0,0 +1,252 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.util
    +
    +import scala.collection.mutable
    +
    +import org.json4s._
    +import org.json4s.jackson.JsonMethods._
    +
    +/**
    + * Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean,
    + * Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and
    + * Array[Metadata]. JSON is used for serialization.
    + *
    + * The default constructor is private. User should use either [[MetadataBuilder]] or
    + * [[Metadata$#fromJson]] to create Metadata instances.
    + *
    + * @param map an immutable map that stores the data
    + */
    +sealed class Metadata private[util] (private[util] val map: Map[String, Any]) extends Serializable {
    +
    +  /** Gets a Long. */
    +  def getLong(key: String): Long = get(key)
    +
    +  /** Gets a Double. */
    +  def getDouble(key: String): Double = get(key)
    +
    +  /** Gets a Boolean. */
    +  def getBoolean(key: String): Boolean = get(key)
    +
    +  /** Gets a String. */
    +  def getString(key: String): String = get(key)
    +
    +  /** Gets a Metadata. */
    +  def getMetadata(key: String): Metadata = get(key)
    +
    +  /** Gets a Long array. */
    +  def getLongArray(key: String): Array[Long] = get(key)
    +
    +  /** Gets a Double array. */
    +  def getDoubleArray(key: String): Array[Double] = get(key)
    +
    +  /** Gets a Boolean array. */
    +  def getBooleanArray(key: String): Array[Boolean] = get(key)
    +
    +  /** Gets a String array. */
    +  def getStringArray(key: String): Array[String] = get(key)
    +
    +  /** Gets a Metadata array. */
    +  def getMetadataArray(key: String): Array[Metadata] = get(key)
    +
    +  /** Converts to its JSON representation. */
    +  def json: String = compact(render(jsonValue))
    +
    +  override def toString: String = json
    +
    +  override def equals(obj: Any): Boolean = {
    +    obj match {
    +      case that: Metadata =>
    +        if (map.keySet == that.map.keySet) {
    +          map.keys.forall { k =>
    +            (map(k), that.map(k)) match {
    +              case (v0: Array[_], v1: Array[_]) =>
    +                v0.view == v1.view
    +              case (v0, v1) =>
    +                v0 == v1
    +            }
    +          }
    +        } else {
    +          false
    +        }
    +      case other =>
    +        false
    +    }
    +  }
    +
    +  override def hashCode: Int = Metadata.hash(this)
    +
    +  private def get[T](key: String): T = {
    +    map(key).asInstanceOf[T]
    +  }
    +
    +  private[sql] def jsonValue: JValue = Metadata.toJsonValue(this)
    +}
    +
    +object Metadata {
    +
    +  /** Returns an empty Metadata. */
    +  def empty: Metadata = new Metadata(Map.empty)
    +
    +  /** Creates a Metadata instance from JSON. */
    +  def fromJson(json: String): Metadata = {
    +    fromJObject(parse(json).asInstanceOf[JObject])
    +  }
    +
    +  /** Creates a Metadata instance from JSON AST. */
    +  private[sql] def fromJObject(jObj: JObject): Metadata = {
    +    val builder = new MetadataBuilder
    +    jObj.obj.foreach {
    +      case (key, JInt(value)) =>
    +        builder.putLong(key, value.toLong)
    +      case (key, JDouble(value)) =>
    +        builder.putDouble(key, value)
    +      case (key, JBool(value)) =>
    +        builder.putBoolean(key, value)
    +      case (key, JString(value)) =>
    +        builder.putString(key, value)
    +      case (key, o: JObject) =>
    +        builder.putMetadata(key, fromJObject(o))
    +      case (key, JArray(value)) =>
    +        if (value.isEmpty) {
    +          // If it is an empty array, we cannot infer its element type. We put an empty Array[Long].
    +          builder.putLongArray(key, Array.empty)
    +        } else {
    +          value.head match {
    +            case _: JInt =>
    +              builder.putLongArray(key, value.asInstanceOf[List[JInt]].map(_.num.toLong).toArray)
    +            case _: JDouble =>
    +              builder.putDoubleArray(key, value.asInstanceOf[List[JDouble]].map(_.num).toArray)
    +            case _: JBool =>
    +              builder.putBooleanArray(key, value.asInstanceOf[List[JBool]].map(_.value).toArray)
    +            case _: JString =>
    +              builder.putStringArray(key, value.asInstanceOf[List[JString]].map(_.s).toArray)
    +            case _: JObject =>
    +              builder.putMetadataArray(
    +                key, value.asInstanceOf[List[JObject]].map(fromJObject).toArray)
    +            case other =>
    +              throw new RuntimeException(s"Do not support array of type ${other.getClass}.")
    +          }
    +        }
    +      case other =>
    +        throw new RuntimeException(s"Do not support type ${other.getClass}.")
    +    }
    +    builder.build()
    +  }
    +
    +  /** Converts to JSON AST. */
    +  private def toJsonValue(obj: Any): JValue = {
    +    obj match {
    +      case map: Map[_, _] =>
    +        val fields = map.toList.map { case (k: String, v) => (k, toJsonValue(v))}
    --- End diff --
    
    This `case` branch can be simplified to:
    
    ```scala
    case map: Map[String, _] =>
      JObject(map.mapValues(toJsonValue).toList)
    ```
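
The recursive conversion in the diff above can be sketched in Python as follows. `to_json_value` and `infer_array_type` are hypothetical helpers, nested metadata is modelled as a plain `dict`, and the empty-array fallback mirrors the comment in `fromJObject`.

```python
import json


def to_json_value(obj):
    """Recursively convert metadata (dict / list / scalar) to JSON-ready data."""
    if isinstance(obj, dict):
        # Nested metadata: convert every value, keep string keys as-is.
        return {key: to_json_value(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_json_value(item) for item in obj]
    if isinstance(obj, (bool, int, float, str)):
        return obj
    raise TypeError(f"Do not support type {type(obj).__name__}.")


def infer_array_type(values):
    """Name the element type of a parsed JSON array from its first element."""
    if not values:
        # An empty array carries no element type; fall back to "long".
        return "long"
    head = values[0]
    # bool must be tested before int because bool is a subclass of int.
    for name, kind in (("boolean", bool), ("long", int), ("double", float),
                       ("string", str), ("metadata", dict)):
        if isinstance(head, kind):
            return name
    raise TypeError(f"Do not support array of type {type(head).__name__}.")


meta = {"max": 10, "tags": ["a", "b"], "stats": {"mean": 0.5}, "empty": []}
text = json.dumps(to_json_value(meta), sort_keys=True)
parsed = json.loads(text)
```

One consequence of inferring the type from the first element is that an empty array of strings does not round-trip with its type: after parsing it is indistinguishable from an empty array of longs.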




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58292916
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21429/consoleFull) for   PR 2701 at commit [`d8af0ed`](https://github.com/apache/spark/commit/d8af0edc105e767e22d3d2696587a502741b9416).
     * This patch **fails** unit tests.
     * This patch **does not** merge cleanly!





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r19121733
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala ---
    @@ -43,6 +44,9 @@ abstract class Expression extends TreeNode[Expression] {
       def nullable: Boolean
       def references: AttributeSet = AttributeSet(children.flatMap(_.references.iterator))
     
    +  /** Returns the metadata when an expression is a reference to another expression with metadata. */
    +  def metadata: Metadata = Metadata.empty
    --- End diff --
    
    Should this be on `NamedExpression` only?
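
One way to picture the suggestion, as a Python sketch with hypothetical classes (`Expression`, `NamedExpression`, `AttributeReference`, `Add`, and `Alias` only borrow Catalyst's names; this is not Catalyst code): metadata lives on named expressions, and an alias forwards it only when its child is itself named.

```python
class Expression:
    """Base class: carries no metadata of its own in this sketch."""


class NamedExpression(Expression):
    def __init__(self, name):
        self.name = name

    @property
    def metadata(self):
        return {}


class AttributeReference(NamedExpression):
    def __init__(self, name, metadata=None):
        super().__init__(name)
        self._metadata = dict(metadata or {})

    @property
    def metadata(self):
        return self._metadata


class Add(Expression):
    def __init__(self, left, right):
        self.left, self.right = left, right


class Alias(NamedExpression):
    def __init__(self, child, name):
        super().__init__(name)
        self.child = child

    @property
    def metadata(self):
        # Forward metadata only when the child is a named expression;
        # a computed child such as Add has changed the column's content.
        if isinstance(self.child, NamedExpression):
            return self.child.metadata
        return {}


age = AttributeReference("age", {"unit": "years"})
renamed = Alias(age, "a")
shifted = Alias(Add(age, age), "a2")
```

In this arrangement a plain rename keeps the column's metadata, while an alias over a computed expression starts from empty metadata, and unnamed expressions expose no `metadata` attribute at all.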




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59888064
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21967/
    Test PASSed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59118273
  
    @liancheng @marmbrus I added a new Metadata class to restrict the value types to: Boolean, Long, Double, String, Metadata, and arrays of them. The Python side still uses `dict` for metadata. I can update the Python implementation once you feel okay with the Scala/Java implementation.
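
To illustrate the value-type restriction, here is a sketch of a validating builder in Python. This `MetadataBuilder` is a hypothetical analogue, not the Scala class from the PR and not PySpark API; nested metadata is modelled as a `dict` and arrays as a `list`.

```python
class MetadataBuilder:
    """Hypothetical Python analogue of the Scala builder; not PySpark API."""

    _SCALARS = (bool, int, float, str)

    def __init__(self):
        self._map = {}

    def put(self, key, value):
        self._check(value)
        self._map[key] = value
        return self  # allow chaining

    def _check(self, value):
        if isinstance(value, self._SCALARS):
            return
        if isinstance(value, dict):
            # Nested metadata: every key must be a string, every value valid.
            for k, v in value.items():
                if not isinstance(k, str):
                    raise TypeError(f"Metadata key must be str, got {type(k).__name__}.")
                self._check(v)
            return
        if isinstance(value, list):
            for item in value:
                # Arrays hold scalars or metadata only, never other arrays.
                if isinstance(item, list):
                    raise TypeError("Nested arrays are not supported.")
                self._check(item)
                # Arrays must be homogeneous: match the first element's type.
                if type(item) is not type(value[0]):
                    raise TypeError("Array elements must share one type.")
            return
        raise TypeError(f"Do not support type {type(value).__name__}.")

    def build(self):
        # Hand back a copy so later puts cannot alter a built result.
        return dict(self._map)


meta = (MetadataBuilder()
        .put("max", 10)
        .put("tags", ["a", "b"])
        .put("stats", {"mean": 0.5})
        .build())
```

Rejecting unsupported values at `put` time means everything that reaches `build()` is known to be JSON-serializable, which is the property the restricted type list is meant to guarantee.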




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r18867403
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala ---
    @@ -377,24 +378,37 @@ case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataT
      * @param name The name of this field.
      * @param dataType The data type of this field.
      * @param nullable Indicates if values of this field can be `null` values.
    + * @param metadata The metadata of this field, which is a map from string to simple type that can be
    + *                 serialized to JSON automatically. The metadata should be preserved during
    + *                 transformation if the content of the column is not modified, e.g, in selection.
      */
    -case class StructField(name: String, dataType: DataType, nullable: Boolean) {
    +case class StructField(
    +    name: String,
    +    dataType: DataType,
    +    nullable: Boolean,
    +    metadata: Map[String, Any] = Map.empty) {
     
       private[sql] def buildFormattedString(prefix: String, builder: StringBuilder): Unit = {
         builder.append(s"$prefix-- $name: ${dataType.typeName} (nullable = $nullable)\n")
         DataType.buildFormattedString(dataType, s"$prefix    |", builder)
       }
     
    +  override def toString: String = {
    +    // Do not add metadata to be consistent with CaseClassStringParser.
    --- End diff --
    
    Yes, for newer versions, data type information written in Parquet file is in JSON format only.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58302991
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21440/
    Test PASSed.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58298735
  
    PySpark also needs to be updated.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60327497
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22098/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58481254
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21529/consoleFull) for   PR 2701 at commit [`93518fb`](https://github.com/apache/spark/commit/93518fbfcef06621b81ea33439833a6e2c158bc7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61356064
  
      [Test build #22671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22671/consoleFull) for   PR 2701 at commit [`dedda56`](https://github.com/apache/spark/commit/dedda56fce0f50f7d7b4f2579e279306833d6c92).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `
      * `class Metadata extends org.apache.spark.sql.catalyst.util.Metadata `
      * `public class MetadataBuilder extends org.apache.spark.sql.catalyst.util.MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61351820
  
      [Test build #22671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22671/consoleFull) for   PR 2701 at commit [`dedda56`](https://github.com/apache/spark/commit/dedda56fce0f50f7d7b4f2579e279306833d6c92).
     * This patch merges cleanly.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58290117
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21435/consoleFull) for   PR 2701 at commit [`c41a664`](https://github.com/apache/spark/commit/c41a664f902d686c4c9ff5007dd080f51523f484).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60477183
  
      [Test build #443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/443/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58474390
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21529/consoleFull) for   PR 2701 at commit [`93518fb`](https://github.com/apache/spark/commit/93518fbfcef06621b81ea33439833a6e2c158bc7).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-61162411
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22543/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r19121837
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/Metadata.scala ---
    @@ -0,0 +1,252 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.util
    +
    +import scala.collection.mutable
    +
    +import org.json4s._
    +import org.json4s.jackson.JsonMethods._
    +
    +/**
    + * Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean,
    + * Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and
    --- End diff --
    
    Do we want to use `Array` or `Seq` here?  I always find the variance semantics of using `Array` to be kind of confusing and so usually prefer `Seq` unless the space/performance is critical.
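
A minimal sketch of the variance difference raised in this comment, assuming plain Scala with no Spark dependency. The object and method names (`VarianceSketch`, `describeSeq`, `describeArray`) are hypothetical and not part of the patch; the point is only that `Seq` is covariant while `Array` is invariant and compares by reference.

```scala
object VarianceSketch {
  // Seq is covariant: a Seq[String] is accepted where a Seq[Any] is expected.
  def describeSeq(values: Seq[Any]): String = values.mkString(",")

  // Array is invariant: an Array[String] is not an Array[Any].
  def describeArray(values: Array[Any]): String = values.mkString(",")

  def main(args: Array[String]): Unit = {
    val names: Seq[String] = Seq("a", "b")
    println(describeSeq(names))

    val arr: Array[String] = Array("a", "b")
    // describeArray(arr) would not compile (type mismatch), so the caller
    // has to widen the element type explicitly, which copies the array.
    println(describeArray(arr.toArray[Any]))

    // Arrays also use reference equality, unlike Seq.
    println(Array(1L, 2L) == Array(1L, 2L))            // false
    println(Array(1L, 2L).sameElements(Array(1L, 2L))) // true
    println(Seq(1L, 2L) == Seq(1L, 2L))                // true
  }
}
```

The reference-equality behaviour matters for any class that stores arrays and wants value-based `equals`, since it would need to compare elements explicitly.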




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59881165
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21967/consoleFull) for   PR 2701 at commit [`611d3c2`](https://github.com/apache/spark/commit/611d3c20cf4aed9927b596d89b9ac96b2cbbcdec).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59133929
  
    test this please




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2701#discussion_r19125755
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/Metadata.scala ---
    @@ -0,0 +1,252 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.util
    +
    +import scala.collection.mutable
    +
    +import org.json4s._
    +import org.json4s.jackson.JsonMethods._
    +
    +/**
    + * Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean,
    + * Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and
    --- End diff --
    
    This is easier for Java users and for mirroring the API in Python. An array of primitive types is also more memory-efficient than a `Seq`.
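
A small sketch of the memory argument, assuming plain Scala on the JVM and using hypothetical names (`PrimitiveArraySketch` and its members are for illustration only). It shows that an `Array[Long]` is a JVM `long[]` with unboxed elements, which is also what Java callers see, whereas a generic `Seq[Long]` holds boxed `java.lang.Long` objects.

```scala
object PrimitiveArraySketch {
  // Array[Long] is represented as a JVM long[]: elements are stored unboxed.
  val asArray: Array[Long] = Array(1L, 2L, 3L)

  // A generic Seq[Long] (a List here) stores each element as a boxed object.
  val asSeq: Seq[Long] = Seq(1L, 2L, 3L)

  // Component type of the array: the primitive `long` class.
  def arrayElementClass: Class[_] = asArray.getClass.getComponentType

  // Runtime class of an element taken from the Seq: java.lang.Long.
  def seqElementClass: Class[_] = asSeq.head.asInstanceOf[AnyRef].getClass

  def main(args: Array[String]): Unit = {
    println(arrayElementClass)
    println(seqElementClass)
  }
}
```

For the short arrays typical of column metadata the saving is likely small, so this is more an illustration of the representation than a measurement.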




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59165099
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21762/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59141591
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21748/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59322329
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21793/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59141584
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21748/consoleFull) for   PR 2701 at commit [`c9d7301`](https://github.com/apache/spark/commit/c9d7301b5e395b8e51030f1be17e35c05de23b7a).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59445740
  
    test this please




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59453472
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21825/
    Test PASSed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-60534403
  
      [Test build #458 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/458/consoleFull) for   PR 2701 at commit [`589f314`](https://github.com/apache/spark/commit/589f3141c168f4c3565f4a1f9c92ca222cc8b048).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59160388
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21762/consoleFull) for   PR 2701 at commit [`3f49aab`](https://github.com/apache/spark/commit/3f49aab1342fd2d877749c7cce5cfa6bc8ac1fa7).
     * This patch merges cleanly.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58286159
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21429/consoleFull) for   PR 2701 at commit [`d8af0ed`](https://github.com/apache/spark/commit/d8af0edc105e767e22d3d2696587a502741b9416).
     * This patch **does not** merge cleanly!




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58481258
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21529/
    Test FAILed.




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59266259
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21772/consoleFull) for   PR 2701 at commit [`4266f4d`](https://github.com/apache/spark/commit/4266f4dd4df4b006d3a54144558cb92bf46003a7).
     * This patch merges cleanly.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58299514
  
    @liancheng The Python API is hard to update at this time because the schema SerDe is via `str(..)`:
    
    https://github.com/apache/spark/blob/master/python/pyspark/sql.py#L1131




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2701




[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59165094
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21762/consoleFull) for   PR 2701 at commit [`3f49aab`](https://github.com/apache/spark/commit/3f49aab1342fd2d877749c7cce5cfa6bc8ac1fa7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class AttributeReference(`
      * `case class StructField(`
      * `class MetadataBuilder `





[GitHub] spark pull request: [SPARK-3569][SQL] Add metadata field to Struct...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-59140835
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21751/
    Test FAILed.




[GitHub] spark pull request: [WIP][SPARK-3569][SQL] Add metadata field to S...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2701#issuecomment-58292922
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21429/
    Test FAILed.

