Posted to user@phoenix.apache.org by Dawid Wysakowicz <wy...@gmail.com> on 2015/12/01 09:42:43 UTC

Re: Problem with arrays in phoenix-spark

Sure, I have done that.

https://issues.apache.org/jira/browse/PHOENIX-2469

2015-11-30 22:22 GMT+01:00 Josh Mahonin <jm...@gmail.com>:

> Hi David,
>
> Thanks for the bug report and the proposed patch. Please file a JIRA and
> we'll take the discussion there.
>
> Josh
>
> On Mon, Nov 30, 2015 at 1:01 PM, Dawid Wysakowicz <
> wysakowicz.dawid@gmail.com> wrote:
>
>> Hi,
>>
>> I've recently run into behaviour that looks like a bug when working with
>> phoenix-spark and arrays.
>>
>> Take a look at these unit tests:
>>
>>   test("Can save arrays from custom dataframes back to phoenix") {
>>     val dataSet = List(Row(2L, Array("String1", "String2", "String3")))
>>
>>     val sqlContext = new SQLContext(sc)
>>
>>     val schema = StructType(
>>         Seq(StructField("ID", LongType, nullable = false),
>>             StructField("VCARRAY", ArrayType(StringType))))
>>
>>     val rowRDD = sc.parallelize(dataSet)
>>
>>     // Apply the schema to the RDD.
>>     val df = sqlContext.createDataFrame(rowRDD, schema)
>>
>>     df.write
>>       .format("org.apache.phoenix.spark")
>>       .options(Map("table" -> "ARRAY_TEST_TABLE", "zkUrl" -> quorumAddress))
>>       .mode(SaveMode.Overwrite)
>>       .save()
>>   }
>>
>>   test("Can save arrays of AnyVal type back to phoenix") {
>>     val dataSet = List((2L, Array(1, 2, 3), Array(1L, 2L, 3L)))
>>
>>     sc
>>       .parallelize(dataSet)
>>       .saveToPhoenix(
>>         "ARRAY_ANYVAL_TEST_TABLE",
>>         Seq("ID", "INTARRAY", "BIGINTARRAY"),
>>         zkUrl = Some(quorumAddress)
>>       )
>>
>>     // Load the results back
>>     val stmt = conn.createStatement()
>>     val rs = stmt.executeQuery(
>>       "SELECT INTARRAY, BIGINTARRAY FROM ARRAY_ANYVAL_TEST_TABLE WHERE ID = 2")
>>     rs.next()
>>     val intArray = rs.getArray(1).getArray().asInstanceOf[Array[Int]]
>>     val longArray = rs.getArray(2).getArray().asInstanceOf[Array[Long]]
>>
>>     // Verify the arrays are equal
>>     intArray shouldEqual dataSet(0)._2
>>     longArray shouldEqual dataSet(0)._3
>>   }
>>
>> Both fail with a ClassCastException.
>>
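The thread does not include the stack trace, so the exact failing cast is not known here. As a minimal sketch of one common way arrays produce a ClassCastException on the JVM: a primitive array such as Array[Int] is an int[] and cannot be cast to Array[AnyRef] (Object[]), while Array[String] can. The object name PrimitiveArrayCast and the helper boxElements are hypothetical and are not taken from phoenix-spark or from the patch.

```scala
import scala.util.Try

object PrimitiveArrayCast {
  // Hypothetical helper (not part of phoenix-spark): copies any JVM array
  // into an Array[AnyRef]. java.lang.reflect.Array.get boxes primitive
  // elements, so an int[] becomes an Object[] of java.lang.Integer.
  def boxElements(array: AnyRef): Array[AnyRef] = {
    val length = java.lang.reflect.Array.getLength(array)
    Array.tabulate[AnyRef](length)(i => java.lang.reflect.Array.get(array, i))
  }

  // Attempts the direct cast and reports whether it succeeded.
  def directCastSucceeds(value: Any): Boolean =
    Try(value.asInstanceOf[Array[AnyRef]]).isSuccess

  def main(args: Array[String]): Unit = {
    val strings: Any = Array("String1", "String2", "String3")
    val ints: Any = Array(1, 2, 3)

    // A String[] is an Object[] on the JVM, so the cast is accepted.
    println(directCastSucceeds(strings))
    // An int[] is not an Object[], so the cast throws ClassCastException.
    println(directCastSucceeds(ints))
    // Boxing element by element avoids the cast entirely.
    println(boxElements(Array(1, 2, 3)).toSeq)
  }
}
```

Whether the failures in the two tests above come from this particular cast is an assumption. The sketch only shows why arrays of AnyVal elements need separate handling from arrays of reference types.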
>> In the attached patch I've proposed a solution. The tricky part is
>> Array[Byte], as it would be the same for both VARBINARY and TINYINT[].
>>
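To illustrate the Array[Byte] ambiguity mentioned above, here is a minimal sketch under the assumption that the declared column type is available at write time. The names ByteArrayAmbiguity, TargetType, VarBinary, TinyIntArray and convert are all hypothetical stand-ins. They are not the phoenix-spark API and do not describe how the attached patch resolves the problem.

```scala
object ByteArrayAmbiguity {
  // Hypothetical stand-ins for the two column types named in the thread.
  sealed trait TargetType
  case object VarBinary extends TargetType
  case object TinyIntArray extends TargetType

  // The runtime value is an Array[Byte] in both cases, so the value alone
  // cannot decide the conversion. In this sketch the declared column type does.
  def convert(value: Array[Byte], target: TargetType): Any = target match {
    // VARBINARY: pass the raw bytes through unchanged.
    case VarBinary => value
    // TINYINT[]: box each element so the result is an array of objects.
    case TinyIntArray => value.map(b => Byte.box(b): AnyRef)
  }

  def main(args: Array[String]): Unit = {
    val bytes = Array[Byte](1, 2, 3)
    println(convert(bytes, VarBinary).getClass.getSimpleName)
    println(convert(bytes, TinyIntArray).getClass.getSimpleName)
  }
}
```

The design point is only that some information beyond the Scala type is needed to tell the two cases apart. Where that information would come from in phoenix-spark is left open here.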
>> Let me know if I should create an issue for this, and whether my solution
>> works for you.
>>
>> Regards
>> Dawid Wysakowicz
>>
>>
>>
>