Posted to reviews@spark.apache.org by vofque <gi...@git.apache.org> on 2018/10/12 11:58:34 UTC

[GitHub] spark pull request #22708: [Spark 21402] Fix java array/map of structs deser...

GitHub user vofque opened a pull request:

    https://github.com/apache/spark/pull/22708

    [Spark 21402] Fix java array/map of structs deserialization

    When deserializing values of ArrayType with struct elements or MapType with struct keys/values into Java beans, the fields of the structs get mixed up.
    I suggest using the struct data types retrieved from the resolved input data instead of inferring them from the Java beans.
    
    ## What changes were proposed in this pull request?
    
    _In case of array:_
    
    The MapObjects expression is used to map array elements to Java beans. The struct type of the elements is inferred from the Java bean structure and ends up with a mixed-up field order.
    I used UnresolvedMapObjects instead of MapObjects, which allows the element type for MapObjects to be provided during analysis based on the resolved input data rather than on the Java bean.
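    
    For illustration, here is a minimal, hypothetical repro sketch of the array case (the class names, schema, and JSON path are assumptions for this example, not code from this PR). The bug shows up when the field order of the supplied schema differs from the alphabetical property order produced by bean introspection:
    
    import java.util.List;
    
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructType;
    
    public class ArrayOfStructsRepro {
    
      public static class Interval {
        private long startTime;
        private long endTime;
        public long getStartTime() { return startTime; }
        public void setStartTime(long startTime) { this.startTime = startTime; }
        public long getEndTime() { return endTime; }
        public void setEndTime(long endTime) { this.endTime = endTime; }
      }
    
      public static class Record {
        private int id;
        private List<Interval> intervals;
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public List<Interval> getIntervals() { return intervals; }
        public void setIntervals(List<Interval> intervals) { this.intervals = intervals; }
      }
    
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().master("local[1]").appName("repro").getOrCreate();
    
        // Schema field order (startTime, endTime) intentionally differs from the
        // alphabetical order (endTime, startTime) yielded by bean introspection.
        StructType intervalType = new StructType()
          .add("startTime", DataTypes.LongType)
          .add("endTime", DataTypes.LongType);
        StructType schema = new StructType()
          .add("id", DataTypes.IntegerType)
          .add("intervals", DataTypes.createArrayType(intervalType, true));
    
        // Hypothetical input file, e.g. {"id":1,"intervals":[{"startTime":111,"endTime":211}]}
        Dataset<Record> ds = spark.read().schema(schema)
          .json("/tmp/with-array-fields.json")
          .as(Encoders.bean(Record.class));
    
        // Before this fix, startTime and endTime could come back swapped, because the
        // element struct type was taken from the bean instead of the resolved input.
        ds.collectAsList().forEach(r -> System.out.println(r.getIntervals().get(0).getStartTime()));
    
        spark.stop();
      }
    }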
    
    _In case of map:_
    
    Invocations of the "keyArray" and "valueArray" functions are used to extract the arrays of keys and values. The struct type of the keys or values is also inferred from the Java bean structure and ends up with a mixed-up field order.
    I created a new UnresolvedInvoke expression as a temporary substitute for the Invoke expression while no actual data is available. It allows the resulting data type to be provided during analysis based on the resolved input data rather than on the Java bean (similar to UnresolvedMapObjects).

    The key and value arrays are then fed to the MapObjects expression, which I replaced with UnresolvedMapObjects, just as in the ArrayType case.

    Finally, I added resolution of UnresolvedInvoke expressions to the Analyzer.resolveExpression method as an additional pattern-matching case.
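    
    A corresponding hypothetical sketch for the map case (again, the names, schema, and path are illustrative assumptions, not code from this PR); a bean with a map of structs hits the same inferred-field-order problem for the map values:
    
    import java.util.Map;
    
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructType;
    
    public class MapOfStructsRepro {
    
      public static class Interval {
        private long startTime;
        private long endTime;
        public long getStartTime() { return startTime; }
        public void setStartTime(long startTime) { this.startTime = startTime; }
        public long getEndTime() { return endTime; }
        public void setEndTime(long endTime) { this.endTime = endTime; }
      }
    
      public static class Record {
        private int id;
        private Map<String, Interval> intervals;
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public Map<String, Interval> getIntervals() { return intervals; }
        public void setIntervals(Map<String, Interval> intervals) { this.intervals = intervals; }
      }
    
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().master("local[1]").appName("repro").getOrCreate();
    
        StructType intervalType = new StructType()
          .add("startTime", DataTypes.LongType)
          .add("endTime", DataTypes.LongType);
        StructType schema = new StructType()
          .add("id", DataTypes.IntegerType)
          .add("intervals", DataTypes.createMapType(DataTypes.StringType, intervalType, true));
    
        // Hypothetical input file, e.g. {"id":1,"intervals":{"i1":{"startTime":111,"endTime":211}}}
        Dataset<Record> ds = spark.read().schema(schema)
          .json("/tmp/with-map-fields.json")
          .as(Encoders.bean(Record.class));
    
        // Before this fix, the struct fields of the map values could come back mixed up.
        ds.show(false);
    
        spark.stop();
      }
    }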
    
    ## How was this patch tested?
    
    Created and ran unit tests for the two described cases.
    Built the complete project on Travis.
    
    @michalsenkyr @cloud-fan @marmbrus @liancheng 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vofque/spark SPARK-21402

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22708
    
----
commit d3578bb3776a79b97f2713f752127849e7910368
Author: Vladimir Kuriatkov <vo...@...>
Date:   2018-10-12T10:49:18Z

    Merge pull request #2 from apache/master
    
    Synch with apache:master

commit 7576ba7c9b830e45b454a21978fbb0a4275064a6
Author: Vladimir Kuriatkov <vl...@...>
Date:   2018-10-11T12:56:15Z

    Java array/map of structs deserialization fixed

commit c9ffcd4de5738dfabb124aae28ef7b79b36e48d6
Author: Vladimir Kuriatkov <vo...@...>
Date:   2018-10-12T11:10:38Z

    Merge pull request #3 from vofque/bugfix
    
    Java array/map of structs deserialization fixed

----


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225840666
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithMapSuite.java ---
    @@ -0,0 +1,257 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.HashMap;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.MapType;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithMapSuite {
    --- End diff --
    
    Yeah, that's my fault..


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    LGTM


---



[GitHub] spark issue #22708: [SPARK-21402] Fix java array/map of structs deserializat...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Can you explain how this happens? Why do the fields of structs get mixed up?


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    lgtm


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    ok to test


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225751459
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    --- End diff --
    
    indentation?


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Fixed all issues in the test class. Thanks a lot for your help and patience.
    Rebasing was probably a bad idea, sorry about that.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97489/testReport)** for PR 22708 at commit [`f6c40b6`](https://github.com/apache/spark/commit/f6c40b675656a75561be84377198a473025a6bd0).


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Thanks!


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225820031
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    +        }
    +    }
    +
    +    public static class Interval {
    --- End diff --
    
    We can have this merged first and rebase the other PR to add its test to the same file too.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225800282
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    --- End diff --
    
    OK. Thank you for advice.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    @cloud-fan and @vofque .
    Can we have this fix in `branch-2.3` and `branch-2.2`, too?
    It seems that we need separate backport PRs for them.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array of structs deseriali...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225452879
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala ---
    @@ -26,6 +26,7 @@ import scala.language.existentials
     
     import com.google.common.reflect.TypeToken
     
    +import org.apache.spark.sql.AnalysisException
    --- End diff --
    
    This is also unused now.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97490/testReport)** for PR 22708 at commit [`b1f74ac`](https://github.com/apache/spark/commit/b1f74ac87bb1949a99afd65a97224c1a3abdd189).


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97458/testReport)** for PR 22708 at commit [`8bc66e6`](https://github.com/apache/spark/commit/8bc66e6882413a55e90847bb89bfe8b69941384f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    cc @dongjoon-hyun here is another instance of the FileBasedDataSourceSuite flaky test.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225813918
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    +        }
    +    }
    +
    +    public static class Interval {
    --- End diff --
    
    But, I guess, to do that we would need to have both changes in one PR?
    Or update the other PR later to add the second test to the same Java file.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Corrected all issues.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225752208
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    --- End diff --
    
    If we remove `createSchema`, we can remove lines 35 ~ 40, too.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97488/testReport)** for PR 22708 at commit [`811e45a`](https://github.com/apache/spark/commit/811e45aa3a0148bd7a75c86b1cb6f835345684cd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array of structs deseriali...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225454136
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,195 @@
    +package test.org.apache.spark.sql;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.*;
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import java.util.*;
    --- End diff --
    
    The import order here is not compliant with the Spark codebase. You can follow the style in other tests like `JavaApplySchemaSuite`.
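    
    For reference, a minimal sketch of that grouping (java.* first, then third-party packages such as org.junit, then org.apache.spark.*, with a blank line between the groups); the class body is only a placeholder:
    
    package test.org.apache.spark.sql;
    
    import java.util.ArrayList;
    import java.util.List;
    
    import org.junit.Assert;
    import org.junit.Test;
    
    import org.apache.spark.sql.Encoder;
    import org.apache.spark.sql.Encoders;
    
    public class ImportOrderExample {
    
      // Placeholder test; the point of this sketch is the import grouping above.
      @Test
      public void placeholder() {
        List<String> values = new ArrayList<>();
        values.add("ok");
        Encoder<String> encoder = Encoders.STRING();
        Assert.assertNotNull(encoder);
        Assert.assertEquals(1, values.size());
      }
    }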


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225751513
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    +        }
    +    }
    +
    +    public static class Interval {
    +
    +        private long startTime;
    +        private long endTime;
    +
    +        public Interval() { }
    +
    +        Interval(long startTime, long endTime) {
    +            this.startTime = startTime;
    +            this.endTime = endTime;
    +        }
    +
    +        public long getStartTime() {
    +            return startTime;
    +        }
    +
    +        public void setStartTime(long startTime) {
    +            this.startTime = startTime;
    +        }
    +
    +        public long getEndTime() {
    +            return endTime;
    +        }
    +
    +        public void setEndTime(long endTime) {
    +            this.endTime = endTime;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Interval)) return false;
    +            Interval other = (Interval) obj;
    +            return
    --- End diff --
    
    indentation?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225799822
  
    --- Diff: sql/core/src/test/resources/test-data/with-array-fields ---
    @@ -0,0 +1,3 @@
    +{ "id": 1, "intervals": [{ "startTime": 111, "endTime": 211 }, { "startTime": 121, "endTime": 221 }], "values": [11, 21, 31, 41]}
    --- End diff --
    
    Sure.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    @dongjoon-hyun, sure, I'll create the corresponding backport pull requests.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225767103
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    +        }
    +    }
    +
    +    public static class Interval {
    +
    +        private long startTime;
    +        private long endTime;
    +
    +        public Interval() { }
    +
    +        Interval(long startTime, long endTime) {
    +            this.startTime = startTime;
    +            this.endTime = endTime;
    +        }
    +
    +        public long getStartTime() {
    +            return startTime;
    +        }
    +
    +        public void setStartTime(long startTime) {
    +            this.startTime = startTime;
    +        }
    +
    +        public long getEndTime() {
    +            return endTime;
    +        }
    +
    +        public void setEndTime(long endTime) {
    +            this.endTime = endTime;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Interval)) return false;
    +            Interval other = (Interval) obj;
    +            return
    --- End diff --
    
    ah yes, in Spark we should use 2-space indentation for Java code as well
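    
    For illustration, the `Interval.equals` above rewritten with 2-space indentation (the comparison body is an assumption here, since the quoted diff is cut off at `return`):
    
    public class Interval {
    
      private long startTime;
      private long endTime;
    
      @Override
      public boolean equals(Object obj) {
        if (!(obj instanceof Interval)) return false;
        Interval other = (Interval) obj;
        // 2-space indentation, matching the rest of Spark's Java sources.
        return other.startTime == this.startTime && other.endTime == this.endTime;
      }
    
      @Override
      public int hashCode() {
        return Long.hashCode(startTime) * 31 + Long.hashCode(endTime);
      }
    }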


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array of structs deseriali...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225453267
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,195 @@
    +package test.org.apache.spark.sql;
    --- End diff --
    
    You need to add the license headers.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225121081
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala ---
    @@ -271,32 +272,41 @@ object JavaTypeInference {
     
           case c if listType.isAssignableFrom(typeToken) =>
             val et = elementType(typeToken)
    -        MapObjects(
    +        UnresolvedMapObjects(
    --- End diff --
    
    Sure. Should I create another pull request with this change only?


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97454/
    Test FAILed.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225810177
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    --- End diff --
    
    The intention was to test a non-struct case too. But I think it's really not critical, so we can get rid of it.


---



[GitHub] spark issue #22708: [Spark 21402] Fix java array/map of structs deserializat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22708: [Spark 21402] Fix java array/map of structs deserializat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Can one of the admins verify this patch?


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225749733
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    --- End diff --
    
    @vofque Please note the `startTime` and `endTime` field names. They should be case-sensitive.


---



[GitHub] spark issue #22708: [SPARK-21402] Fix java array/map of structs deserializat...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    In a nutshell:
    First we calculate the field ordinals of the array elements from Java bean property information (which doesn't guarantee any particular field order).
    Then we apply these ordinals to the actual data to retrieve the values.
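
    To make the field-order mismatch concrete, here is a small, hypothetical sketch (not part of this PR). It assumes an `Interval` bean shaped like the one in the test suite above and the (`startTime`, `endTime`) schema from the tests; bean introspection gives no guaranteed property order (in practice it is usually alphabetical), while the input schema declares `startTime` first:
    ```
    import java.beans.Introspector;
    import java.beans.PropertyDescriptor;

    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructType;

    public class OrdinalMismatchSketch {

      // Same shape as the Interval bean from the test suite above.
      public static class Interval {
        private long startTime;
        private long endTime;
        public long getStartTime() { return startTime; }
        public void setStartTime(long startTime) { this.startTime = startTime; }
        public long getEndTime() { return endTime; }
        public void setEndTime(long endTime) { this.endTime = endTime; }
      }

      public static void main(String[] args) throws Exception {
        // Property order as seen through bean introspection: unspecified by the
        // JavaBeans spec, typically alphabetical (class, endTime, startTime).
        for (PropertyDescriptor pd :
            Introspector.getBeanInfo(Interval.class).getPropertyDescriptors()) {
          System.out.println(pd.getName());
        }

        // Field order in the resolved input schema: startTime = 0, endTime = 1.
        StructType intervalType = new StructType()
            .add("startTime", DataTypes.LongType)
            .add("endTime", DataTypes.LongType);
        System.out.println(intervalType.fieldIndex("startTime")); // 0
        System.out.println(intervalType.fieldIndex("endTime"));   // 1

        // Ordinals derived from the bean property order (endTime first) would
        // therefore point at the wrong columns of the actual data.
      }
    }
    ```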


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    retest this please


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test FAILed.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225182746
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala ---
    @@ -271,32 +272,41 @@ object JavaTypeInference {
     
           case c if listType.isAssignableFrom(typeToken) =>
             val et = elementType(typeToken)
    -        MapObjects(
    +        UnresolvedMapObjects(
    --- End diff --
    
    yes please, thanks!


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225834919
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithMapSuite.java ---
    @@ -0,0 +1,257 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.HashMap;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.MapType;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithMapSuite {
    --- End diff --
    
    Why include this change here? Don't you want to have it in another PR?


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97491/testReport)** for PR 22708 at commit [`46e942d`](https://github.com/apache/spark/commit/46e942d96861670515330ae95d5067d287d86ba0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225118613
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala ---
    @@ -271,32 +272,41 @@ object JavaTypeInference {
     
           case c if listType.isAssignableFrom(typeToken) =>
             val et = elementType(typeToken)
    -        MapObjects(
    +        UnresolvedMapObjects(
    --- End diff --
    
    Can we exclude the other changes and keep only this one? This one is very easy to reason about. We did the same thing in `ScalaReflection`.

    We need more time to think about the map case, and to fix it in `ScalaReflection` as well.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test FAILed.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225769471
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    +        }
    +    }
    +
    +    public static class Interval {
    --- End diff --
    
    This is a duplicate of your other PR. Maybe we can consider putting the two tests in one Java file so we don't need to have two `Interval` classes.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    thanks, merging to master/2.4!


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97489/
    Test FAILed.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97458/
    Test PASSed.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225505036
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala ---
    @@ -271,32 +272,41 @@ object JavaTypeInference {
     
           case c if listType.isAssignableFrom(typeToken) =>
             val et = elementType(typeToken)
    -        MapObjects(
    +        UnresolvedMapObjects(
    --- End diff --
    
    Removed the other changes from this PR and created a new one with only the map case.
    As far as I see, everything works fine with Scala classes, because StructTypes are generated from constructor parameters, which are available in the correct order with the correct names. That is hardly achievable with Java beans.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Thank you for pinging me, @cloud-fan .


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97454/testReport)** for PR 22708 at commit [`8bc66e6`](https://github.com/apache/spark/commit/8bc66e6882413a55e90847bb89bfe8b69941384f).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97488/testReport)** for PR 22708 at commit [`811e45a`](https://github.com/apache/spark/commit/811e45aa3a0148bd7a75c86b1cb6f835345684cd).


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97490/testReport)** for PR 22708 at commit [`b1f74ac`](https://github.com/apache/spark/commit/b1f74ac87bb1949a99afd65a97224c1a3abdd189).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97491/testReport)** for PR 22708 at commit [`46e942d`](https://github.com/apache/spark/commit/46e942d96861670515330ae95d5067d287d86ba0).


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97488/
    Test PASSed.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    ok to test


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225768857
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    --- End diff --
    
    Will this list of ints affect the test? If not, maybe we can get rid of it to simplify the test.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225391643
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -282,6 +283,27 @@ case class StaticInvoke(
       }
     }
     
    +/**
    + * When constructing [[Invoke]], the data type must be given, which may be not possible to define
    + * before analysis. This class acts like a placeholder for [[Invoke]], and will be replaced by
    + * [[Invoke]] during analysis after the input data is resolved. Data type passed to [[Invoke]]
    + * will be defined by applying [[dataTypeFunction]] to the data type of the input data.
    + */
    +case class UnresolvedInvoke(
    --- End diff --
    
    Should we move this to `unresolved.scala`? cc @cloud-fan 


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225751969
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s }", id, intervals );
    --- End diff --
    
    It seems that `values` is missing here. Is that intentional?


---



[GitHub] spark issue #22708: [SPARK-21402] Fix java array/map of structs deserializat...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    The original problem is described here: https://issues.apache.org/jira/browse/SPARK-21402
    
    I'll try to explain what happens in detail.
    
    Let's consider this data structure:
    root
     |-- intervals: array
     |    |-- element: struct
     |    |    |-- startTime: long
     |    |    |-- endTime: long
    
    And let's say we have a Java bean class with the corresponding structure.
    
    When building a deserializer for the field _intervals_ in _JavaTypeInference.deserializerFor_, we construct a _MapObjects_ expression to convert structs to Java beans:
    ```
    case c if listType.isAssignableFrom(typeToken) =>
      val et = elementType(typeToken)
      MapObjects(
        p => deserializerFor(et, Some(p)),
        getPath,
        inferDataType(et)._1,
        customCollectionCls = Some(c))
    ```
    
    _MapObjects_ requires the _DataType_ of the array elements. It is inferred from the Java element type using _JavaTypeInference.inferDataType_, which reads the Java bean properties and maps them to _StructFields_.
    ```
    case other =>
      // some more code goes here
      val properties = getJavaBeanReadableProperties(other)
      val fields = properties.map { property =>
        val returnType = typeToken.method(property.getReadMethod).getReturnType
        val (dataType, nullable) = inferDataType(returnType, seenTypeSet + other)
        new StructField(property.getName, dataType, nullable)
      }
    ```
    
    The order of properties in the resulting _StructType_ may not correspond to their declaration order, since the declaration order is simply unknown. So the resulting element _StructType_ may look like this:
    root
     |-- endTime: long
     |-- startTime: long
    
    This _StructType_ is passed to _MapObjects_ and then to its loop variable _LambdaVariable_.
    
    For the deserialization of individual array elements an _InitializeJavaBean_ expression is created. It contains an _UnresolvedExtractValue_ expression for each field, and these expressions have the _LambdaVariable_ as a child. They are resolved during analysis:
    ```
    case UnresolvedExtractValue(child, fieldName) if child.resolved =>
      ExtractValue(child, fieldName, resolver)
    ```  
    
    For each of the fields _startTime_ and _endTime_ an ordinal is calculated. For that, the child's _DataType_ is used, and in our case this is the _StructType_ of the _LambdaVariable_ with the incorrect field order.
    As a result we get _GetStructField_ expressions with ordinal = 0 for _endTime_ and ordinal = 1 for _startTime_.
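
    To illustrate the effect at the data level, here is a schematic, hypothetical snippet (not actual generated code): the row is laid out according to the input schema (startTime, endTime), but the ordinals were computed against the bean-inferred struct (endTime, startTime), so the two values come back swapped:
    ```
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;

    public class SwappedOrdinalsSketch {
      public static void main(String[] args) {
        // A row laid out according to the input schema:
        // column 0 = startTime (111), column 1 = endTime (211).
        Row interval = RowFactory.create(111L, 211L);

        // The deserializer derived ordinal 0 for endTime and ordinal 1 for
        // startTime from the bean-inferred struct, so it effectively reads:
        long endTimeAsRead = interval.getLong(0);   // 111, which is really startTime
        long startTimeAsRead = interval.getLong(1); // 211, which is really endTime

        System.out.println(startTimeAsRead + ", " + endTimeAsRead); // prints 211, 111
      }
    }
    ```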


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97490/
    Test FAILed.


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97491/
    Test PASSed.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225826366
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    +                .load("src/test/resources/test-data/with-array-fields")
    +                .as(encoder);
    +
    +        List<Record> records = dataset.collectAsList();
    +
    +        Assert.assertTrue(Util.equals(records, RECORDS));
    +    }
    +
    +    private static StructType createSchema() {
    +        StructField[] intervalFields = {
    +                new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
    +                new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
    +        };
    +        DataType intervalType = new StructType(intervalFields);
    +
    +        DataType intervalsType = new ArrayType(intervalType, true);
    +
    +        DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
    +
    +        StructField[] fields = {
    +                new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
    +                new StructField("intervals", intervalsType, true, Metadata.empty()),
    +                new StructField("values", valuesType, true, Metadata.empty())
    +        };
    +        return new StructType(fields);
    +    }
    +
    +    public static class Record {
    +
    +        private int id;
    +        private List<Interval> intervals;
    +        private List<Integer> values;
    +
    +        public Record() { }
    +
    +        Record(int id, List<Interval> intervals, List<Integer> values) {
    +            this.id = id;
    +            this.intervals = intervals;
    +            this.values = values;
    +        }
    +
    +        public int getId() {
    +            return id;
    +        }
    +
    +        public void setId(int id) {
    +            this.id = id;
    +        }
    +
    +        public List<Interval> getIntervals() {
    +            return intervals;
    +        }
    +
    +        public void setIntervals(List<Interval> intervals) {
    +            this.intervals = intervals;
    +        }
    +
    +        public List<Integer> getValues() {
    +            return values;
    +        }
    +
    +        public void setValues(List<Integer> values) {
    +            this.values = values;
    +        }
    +
    +        @Override
    +        public boolean equals(Object obj) {
    +            if (!(obj instanceof Record)) return false;
    +            Record other = (Record) obj;
    +            return
    +                    (other.id == this.id) &&
    +                    Util.equals(other.intervals, this.intervals) &&
    +                    Util.equals(other.values, this.values);
    +        }
    +
    +        @Override
    +        public String toString() {
    +            return String.format("{ id: %d, intervals: %s, values: %s }", id, intervals, values);
    +        }
    +    }
    +
    +    public static class Interval {
    --- End diff --
    
    OK, sure.


---



[GitHub] spark issue #22708: [SPARK-21402] Fix java array of structs deserialization

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    And please add [SQL] to the PR title. Like `[SPARK-21402][SQL]`


---



[GitHub] spark issue #22708: [SPARK-21402] Fix java array/map of structs deserializat...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Please modify the PR title and description accordingly. Thanks.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array of structs deseriali...

Posted by vofque <gi...@git.apache.org>.
Github user vofque commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225451652
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -30,6 +30,7 @@ import org.apache.spark.serializer._
     import org.apache.spark.sql.Row
     import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow, ScalaReflection}
     import org.apache.spark.sql.catalyst.ScalaReflection.universe.TermName
    +import org.apache.spark.sql.catalyst.analysis.UnresolvedException
    --- End diff --
    
    Yep, removed it.


---



[GitHub] spark pull request #22708: [SPARK-21402] Fix java array/map of structs deser...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225440522
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -30,6 +30,7 @@ import org.apache.spark.serializer._
     import org.apache.spark.sql.Row
     import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow, ScalaReflection}
     import org.apache.spark.sql.catalyst.ScalaReflection.universe.TermName
    +import org.apache.spark.sql.catalyst.analysis.UnresolvedException
    --- End diff --
    
    nit: it does not seem to be necessary.


---



[GitHub] spark issue #22708: [Spark 21402] Fix java array/map of structs deserializat...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Can one of the admins verify this patch?


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225740504
  
    --- Diff: sql/core/src/test/resources/test-data/with-array-fields ---
    @@ -0,0 +1,3 @@
    +{ "id": 1, "intervals": [{ "startTime": 111, "endTime": 211 }, { "startTime": 121, "endTime": 221 }], "values": [11, 21, 31, 41]}
    --- End diff --
    
    Could you rename this to `with-array-fields.json`?


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97489/testReport)** for PR 22708 at commit [`f6c40b6`](https://github.com/apache/spark/commit/f6c40b675656a75561be84377198a473025a6bd0).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class UnresolvedInvoke(`


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22708


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    Merged build finished. Test FAILed.


---



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22708#discussion_r225749174
  
    --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
    @@ -0,0 +1,222 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package test.org.apache.spark.sql;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.Collection;
    +import java.util.Iterator;
    +import java.util.List;
    +
    +import org.junit.After;
    +import org.junit.Assert;
    +import org.junit.Before;
    +import org.junit.Test;
    +
    +import org.apache.spark.sql.Dataset;
    +import org.apache.spark.sql.Encoder;
    +import org.apache.spark.sql.Encoders;
    +import org.apache.spark.sql.test.TestSparkSession;
    +import org.apache.spark.sql.types.ArrayType;
    +import org.apache.spark.sql.types.DataType;
    +import org.apache.spark.sql.types.DataTypes;
    +import org.apache.spark.sql.types.Metadata;
    +import org.apache.spark.sql.types.StructField;
    +import org.apache.spark.sql.types.StructType;
    +
    +public class JavaBeanWithArraySuite {
    +
    +    private static final List<Record> RECORDS = new ArrayList<>();
    +
    +    static {
    +        RECORDS.add(new Record(1,
    +                Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
    +                Arrays.asList(11, 21, 31, 41)
    +        ));
    +        RECORDS.add(new Record(2,
    +                Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
    +                Arrays.asList(12, 22, 32, 42)
    +        ));
    +        RECORDS.add(new Record(3,
    +                Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
    +                Arrays.asList(13, 23, 33, 43)
    +        ));
    +    }
    +
    +    private TestSparkSession spark;
    +
    +    @Before
    +    public void setUp() {
    +        spark = new TestSparkSession();
    +    }
    +
    +    @After
    +    public void tearDown() {
    +        spark.stop();
    +        spark = null;
    +    }
    +
    +    @Test
    +    public void testBeanWithArrayFieldsDeserialization() {
    +
    +        StructType schema = createSchema();
    +        Encoder<Record> encoder = Encoders.bean(Record.class);
    +
    +        Dataset<Record> dataset = spark
    +                .read()
    +                .format("json")
    +                .schema(schema)
    --- End diff --
    
    I'm wondering if we can use the newer, more concise DDL schema-string approach in this PR. Then we can remove `createSchema()` here:
    ```java
    - .schema(schema)
    + .schema("id int, intervals array<struct<starttime: bigint, endtime: bigint>>, values array<int>")
    ```
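    
    For reference, a rough sketch of how the test body might look with that DDL-format schema string in place of `createSchema()` (purely illustrative, not the actual PR code; the field names are kept camelCase here so they match the JSON keys in the test data, and the data path is the one from the diff above):
    ```java
    @Test
    public void testBeanWithArrayFieldsDeserialization() {
        Encoder<Record> encoder = Encoders.bean(Record.class);

        Dataset<Record> dataset = spark
                .read()
                .format("json")
                // DDL-format schema string instead of a programmatically built StructType
                .schema("id int, intervals array<struct<startTime: bigint, endTime: bigint>>, values array<int>")
                .load("src/test/resources/test-data/with-array-fields")
                .as(encoder);

        List<Record> records = dataset.collectAsList();
        Assert.assertTrue(Util.equals(records, RECORDS));
    }
    ```
    The `DataFrameReader.schema(String)` overload parses the DDL string into the equivalent `StructType`, so the rest of the test stays unchanged.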


---



[GitHub] spark issue #22708: [SPARK-21402][SQL] Fix java array of structs deserializa...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22708
  
    **[Test build #97454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97454/testReport)** for PR 22708 at commit [`8bc66e6`](https://github.com/apache/spark/commit/8bc66e6882413a55e90847bb89bfe8b69941384f).


---
