You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Francis Chuang <fr...@apache.org> on 2019/01/07 03:38:25 UTC

Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

This is a heads up regarding a breaking change that is currently in 
avatica-go master and will be released as the next major version, 4.0.0.

In Apache Phoenix, string columns set to null or an empty string ("") 
are considered to be equivalent. For more details on why this is the 
case see [1].

While fixing a bug to correctly work with null values in avatica-go [2], 
I had to break existing behavior.

Previous behavior: A string column set to null or an empty string will 
be returned as an empty string.

New behavior: A string column set to null or an empty string will be 
returned as a null.

The reason for this change is to take advantage of Go's database/sql 
package's builtin NullString type [3]. This type allows userland code to 
scan nullable columns into a variable without any errors.

Note: This breaking change will be part of 4.0.0 and will not affect 
users using 3.x.x. However, to take advantage of database/sql's null 
types, you will need to upgrade to 4.0.0 (when it is released) and 
upgrade your import paths to github.com/apache/calcite-avatica-go/v4
This change is only applicable for Apache Phoenix and will not affect 
HSQLDB.

[1] https://issues.apache.org/jira/browse/PHOENIX-947
[2] https://issues.apache.org/jira/browse/CALCITE-2763
[3] https://golang.org/pkg/database/sql/#NullString

Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Francis Chuang <fr...@apache.org>.
The sql.NullString [1] type exists for this. There are a few other ones: 
NullFloat64, NullInt64 and NullBool.

However, while it does exist, we're at the mercy of Phoenix. Since 
Phoenix only returns a Rep_Type of Null and null = true regardless of 
whether the string is empty or an actual null, there isn't much the 
client can do as we only see nulls.

HSQLDB on the other hand does not do this, so the client is able to see 
whether the returned value is a null or an empty string, so if an empty 
string was stored, the client will return an empty string. If a null was 
stored, the client will return a null (nil in Go).

[1] https://golang.org/pkg/database/sql/#NullString

On 8/01/2019 9:34 am, Julian Hyde wrote:
> OK, it seems you’ve given this plenty of thought.
> 
> I don’t know Go, but it seems that string values can never be null, in the same way that int values cannot be null in Java. But SQL INTEGER columns can have NULL values, so JDBC needs a way to represent the NULL values on the client. In JDBC, if you call ‘int getColumn(int columnId)’ on a column whose value is NULL, then you will receive the value 0. You can then call ‘boolean wasNull()’ to find out whether that 0 was really zero or was actually a NULL.
> 
> Maybe Go could use something similar to the ‘wasNull’ method to distinguish between null strings and empty strings.
> 
> Julian
> 
> 
>> On Jan 7, 2019, at 1:56 PM, Francis Chuang <fr...@apache.org> wrote:
>>
>> This behavior is the default for Phoenix and will only apply to Phoenix. Unfortunately, I don't think there it would be possible to turn this behavior on or off in the Go client as there is no way to distinguish a null and an empty string that is being returned by Avatica.
>>
>> This is what the Go client sees:
>> - We insert an empty string ("") into the column, we query it and avatica returns a TypedValue with Rep_Type set to NULL and null set to true. The string field on the TypedValue is set to "".
>>
>> - We insert a null (nil or sql.NullString in Go) into the column, we query it and Avatica returns a TypedValue with Rep_Type set to NULL and null set to true. The string field on the TypedValue is set to "".
>>
>> In the previous behavior, the driver did not look at the TypedValue's Rep type of the returned value. Rather, it looked at the column's type of the returned result. If a string column contained an empty string or null, the "string" field in the TypedValue is an empty string. Since we were ignoring the Null Rep_Type, we simply return the empty string.
>>
>> This was not ideal as it didn't work with sql.NullString, which provides the ability to scan nullable columns into a sql.NullString and avoid errors during runtime.
>>
>> For the Go client, there is simply no way to distinguish between a null or a "" in the returned type. I don't think the Avatica server even sees this information, as Phoenix is simply sending a Rep_Type of Null and Null =  true to Avatica when the column is set to null or "".
>>
>>
>> On 8/01/2019 6:25 am, Julian Hyde wrote:
>>> How this behavior enabled?
>>> Apache Phoenix’s behavior contravenes the SQL standard (which states that empty strings and null strings are different) and so clients must not get this behavior unless they ask for it. It would be OK if they get it by default for Phoenix, but even then, there should be a way to disable it.
>>> Related: https://issues.apache.org/jira/browse/CALCITE-815 <https://issues.apache.org/jira/browse/CALCITE-815>
>>> Julian
>>>> On Jan 6, 2019, at 7:38 PM, Francis Chuang <fr...@apache.org> wrote:
>>>>
>>>> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
>>>>
>>>> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
>>>>
>>>> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
>>>>
>>>> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
>>>>
>>>> New behavior: A string column set to null or an empty string will be returned as a null.
>>>>
>>>> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
>>>>
>>>> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
>>>> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/PHOENIX-947
>>>> [2] https://issues.apache.org/jira/browse/CALCITE-2763
>>>> [3] https://golang.org/pkg/database/sql/#NullString
>>
> 


Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Julian Hyde <jh...@apache.org>.
OK, it seems you’ve given this plenty of thought.

I don’t know Go, but it seems that string values can never be null, in the same way that int values cannot be null in Java. But SQL INTEGER columns can have NULL values, so JDBC needs a way to represent the NULL values on the client. In JDBC, if you call ‘int getColumn(int columnId)’ on a column whose value is NULL, then you will receive the value 0. You can then call ‘boolean wasNull()’ to find out whether that 0 was really zero or was actually a NULL.

Maybe Go could use something similar to the ‘wasNull’ method to distinguish between null strings and empty strings.

Julian


> On Jan 7, 2019, at 1:56 PM, Francis Chuang <fr...@apache.org> wrote:
> 
> This behavior is the default for Phoenix and will only apply to Phoenix. Unfortunately, I don't think there it would be possible to turn this behavior on or off in the Go client as there is no way to distinguish a null and an empty string that is being returned by Avatica.
> 
> This is what the Go client sees:
> - We insert an empty string ("") into the column, we query it and avatica returns a TypedValue with Rep_Type set to NULL and null set to true. The string field on the TypedValue is set to "".
> 
> - We insert a null (nil or sql.NullString in Go) into the column, we query it and Avatica returns a TypedValue with Rep_Type set to NULL and null set to true. The string field on the TypedValue is set to "".
> 
> In the previous behavior, the driver did not look at the TypedValue's Rep type of the returned value. Rather, it looked at the column's type of the returned result. If a string column contained an empty string or null, the "string" field in the TypedValue is an empty string. Since we were ignoring the Null Rep_Type, we simply return the empty string.
> 
> This was not ideal as it didn't work with sql.NullString, which provides the ability to scan nullable columns into a sql.NullString and avoid errors during runtime.
> 
> For the Go client, there is simply no way to distinguish between a null or a "" in the returned type. I don't think the Avatica server even sees this information, as Phoenix is simply sending a Rep_Type of Null and Null =  true to Avatica when the column is set to null or "".
> 
> 
> On 8/01/2019 6:25 am, Julian Hyde wrote:
>> How this behavior enabled?
>> Apache Phoenix’s behavior contravenes the SQL standard (which states that empty strings and null strings are different) and so clients must not get this behavior unless they ask for it. It would be OK if they get it by default for Phoenix, but even then, there should be a way to disable it.
>> Related: https://issues.apache.org/jira/browse/CALCITE-815 <https://issues.apache.org/jira/browse/CALCITE-815>
>> Julian
>>> On Jan 6, 2019, at 7:38 PM, Francis Chuang <fr...@apache.org> wrote:
>>> 
>>> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
>>> 
>>> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
>>> 
>>> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
>>> 
>>> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
>>> 
>>> New behavior: A string column set to null or an empty string will be returned as a null.
>>> 
>>> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
>>> 
>>> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
>>> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
>>> 
>>> [1] https://issues.apache.org/jira/browse/PHOENIX-947
>>> [2] https://issues.apache.org/jira/browse/CALCITE-2763
>>> [3] https://golang.org/pkg/database/sql/#NullString
> 


Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Francis Chuang <fr...@apache.org>.
This behavior is the default for Phoenix and will only apply to Phoenix. 
Unfortunately, I don't think there it would be possible to turn this 
behavior on or off in the Go client as there is no way to distinguish a 
null and an empty string that is being returned by Avatica.

This is what the Go client sees:
- We insert an empty string ("") into the column, we query it and 
avatica returns a TypedValue with Rep_Type set to NULL and null set to 
true. The string field on the TypedValue is set to "".

- We insert a null (nil or sql.NullString in Go) into the column, we 
query it and Avatica returns a TypedValue with Rep_Type set to NULL and 
null set to true. The string field on the TypedValue is set to "".

In the previous behavior, the driver did not look at the TypedValue's 
Rep type of the returned value. Rather, it looked at the column's type 
of the returned result. If a string column contained an empty string or 
null, the "string" field in the TypedValue is an empty string. Since we 
were ignoring the Null Rep_Type, we simply return the empty string.

This was not ideal as it didn't work with sql.NullString, which provides 
the ability to scan nullable columns into a sql.NullString and avoid 
errors during runtime.

For the Go client, there is simply no way to distinguish between a null 
or a "" in the returned type. I don't think the Avatica server even sees 
this information, as Phoenix is simply sending a Rep_Type of Null and 
Null =  true to Avatica when the column is set to null or "".


On 8/01/2019 6:25 am, Julian Hyde wrote:
> How this behavior enabled?
> 
> Apache Phoenix’s behavior contravenes the SQL standard (which states that empty strings and null strings are different) and so clients must not get this behavior unless they ask for it. It would be OK if they get it by default for Phoenix, but even then, there should be a way to disable it.
> 
> Related: https://issues.apache.org/jira/browse/CALCITE-815 <https://issues.apache.org/jira/browse/CALCITE-815>
> 
> Julian
> 
> 
>> On Jan 6, 2019, at 7:38 PM, Francis Chuang <fr...@apache.org> wrote:
>>
>> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
>>
>> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
>>
>> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
>>
>> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
>>
>> New behavior: A string column set to null or an empty string will be returned as a null.
>>
>> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
>>
>> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
>> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
>>
>> [1] https://issues.apache.org/jira/browse/PHOENIX-947
>> [2] https://issues.apache.org/jira/browse/CALCITE-2763
>> [3] https://golang.org/pkg/database/sql/#NullString
> 
> 


Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Julian Hyde <jh...@apache.org>.
How this behavior enabled?

Apache Phoenix’s behavior contravenes the SQL standard (which states that empty strings and null strings are different) and so clients must not get this behavior unless they ask for it. It would be OK if they get it by default for Phoenix, but even then, there should be a way to disable it.

Related: https://issues.apache.org/jira/browse/CALCITE-815 <https://issues.apache.org/jira/browse/CALCITE-815> 

Julian


> On Jan 6, 2019, at 7:38 PM, Francis Chuang <fr...@apache.org> wrote:
> 
> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
> 
> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
> 
> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
> 
> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
> 
> New behavior: A string column set to null or an empty string will be returned as a null.
> 
> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
> 
> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
> 
> [1] https://issues.apache.org/jira/browse/PHOENIX-947
> [2] https://issues.apache.org/jira/browse/CALCITE-2763
> [3] https://golang.org/pkg/database/sql/#NullString


Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Francis Chuang <fr...@apache.org>.
This is a heads up regarding a breaking change that is currently in 
avatica-go master and will be released as the next major version, 4.0.0.

In Apache Phoenix, string columns set to null or an empty string ("") 
are considered to be equivalent. For more details on why this is the 
case see [1].

While fixing a bug to correctly work with null values in avatica-go [2], 
I had to break existing behavior.

Previous behavior: A string column set to null or an empty string will 
be returned as an empty string.

New behavior: A string column set to null or an empty string will be 
returned as a null.

The reason for this change is to take advantage of Go's database/sql 
package's builtin NullString type [3]. This type allows userland code to 
scan nullable columns into a variable without any errors.

Note: This breaking change will be part of 4.0.0 and will not affect 
users using 3.x.x. However, to take advantage of database/sql's null 
types, you will need to upgrade to 4.0.0 (when it is released) and 
upgrade your import paths to github.com/apache/calcite-avatica-go/v4
This change is only applicable for Apache Phoenix and will not affect 
HSQLDB.

[1] https://issues.apache.org/jira/browse/PHOENIX-947
[2] https://issues.apache.org/jira/browse/CALCITE-2763
[3] https://golang.org/pkg/database/sql/#NullString

Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Francis Chuang <fr...@apache.org>.
I will put a note in the release note and the news item for the next 
release as I have done for previous releases where there are breaking 
changes or important things to note.

I have a WIP copy of the release notes on a separate branch, so that we 
don't accidentally publish them if we deploy the Avatica site before the 
next release. See 
https://github.com/Boostport/calcite-avatica-go/blob/4.0.0-release-notes/site/_docs/go_history.md

Francis

On 10/01/2019 8:12 am, Julian Hyde wrote:
> Breaking changes should be part of release notes, i.e. https://calcite.apache.org/avatica/docs/go_history.html <https://calcite.apache.org/avatica/docs/go_history.html>. We’ve not been very disciplined in doing this, but let’s do better.
> 
> I see no reason why we could not start writing the release notes for the next release, and add breaking changes to that page as we commit them.
> 
> Julian
>   
> 
>> On Jan 9, 2019, at 1:04 PM, Josh Elser <el...@apache.org> wrote:
>>
>> Nice write up, Francis.
>>
>> Do we have a corner of the Avatica website for the Go-driver yet which you could use to memorialize this? Thinking that it might be more readily found than via mailing list archives.
>>
>> On 1/6/19 10:38 PM, Francis Chuang wrote:
>>> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
>>> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
>>> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
>>> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
>>> New behavior: A string column set to null or an empty string will be returned as a null.
>>> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
>>> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
>>> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
>>> [1] https://issues.apache.org/jira/browse/PHOENIX-947
>>> [2] https://issues.apache.org/jira/browse/CALCITE-2763
>>> [3] https://golang.org/pkg/database/sql/#NullString
> 
> 


Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Julian Hyde <jh...@apache.org>.
Breaking changes should be part of release notes, i.e. https://calcite.apache.org/avatica/docs/go_history.html <https://calcite.apache.org/avatica/docs/go_history.html>. We’ve not been very disciplined in doing this, but let’s do better.

I see no reason why we could not start writing the release notes for the next release, and add breaking changes to that page as we commit them.

Julian
 

> On Jan 9, 2019, at 1:04 PM, Josh Elser <el...@apache.org> wrote:
> 
> Nice write up, Francis.
> 
> Do we have a corner of the Avatica website for the Go-driver yet which you could use to memorialize this? Thinking that it might be more readily found than via mailing list archives.
> 
> On 1/6/19 10:38 PM, Francis Chuang wrote:
>> This is a heads up regarding a breaking change that is currently in avatica-go master and will be released as the next major version, 4.0.0.
>> In Apache Phoenix, string columns set to null or an empty string ("") are considered to be equivalent. For more details on why this is the case see [1].
>> While fixing a bug to correctly work with null values in avatica-go [2], I had to break existing behavior.
>> Previous behavior: A string column set to null or an empty string will be returned as an empty string.
>> New behavior: A string column set to null or an empty string will be returned as a null.
>> The reason for this change is to take advantage of Go's database/sql package's builtin NullString type [3]. This type allows userland code to scan nullable columns into a variable without any errors.
>> Note: This breaking change will be part of 4.0.0 and will not affect users using 3.x.x. However, to take advantage of database/sql's null types, you will need to upgrade to 4.0.0 (when it is released) and upgrade your import paths to github.com/apache/calcite-avatica-go/v4
>> This change is only applicable for Apache Phoenix and will not affect HSQLDB.
>> [1] https://issues.apache.org/jira/browse/PHOENIX-947
>> [2] https://issues.apache.org/jira/browse/CALCITE-2763
>> [3] https://golang.org/pkg/database/sql/#NullString


Re: Breaking change in Avatica-Go when handling null and empty strings for Apache Phoenix

Posted by Josh Elser <el...@apache.org>.
Nice write up, Francis.

Do we have a corner of the Avatica website for the Go-driver yet which 
you could use to memorialize this? Thinking that it might be more 
readily found than via mailing list archives.

On 1/6/19 10:38 PM, Francis Chuang wrote:
> This is a heads up regarding a breaking change that is currently in 
> avatica-go master and will be released as the next major version, 4.0.0.
> 
> In Apache Phoenix, string columns set to null or an empty string ("") 
> are considered to be equivalent. For more details on why this is the 
> case see [1].
> 
> While fixing a bug to correctly work with null values in avatica-go [2], 
> I had to break existing behavior.
> 
> Previous behavior: A string column set to null or an empty string will 
> be returned as an empty string.
> 
> New behavior: A string column set to null or an empty string will be 
> returned as a null.
> 
> The reason for this change is to take advantage of Go's database/sql 
> package's builtin NullString type [3]. This type allows userland code to 
> scan nullable columns into a variable without any errors.
> 
> Note: This breaking change will be part of 4.0.0 and will not affect 
> users using 3.x.x. However, to take advantage of database/sql's null 
> types, you will need to upgrade to 4.0.0 (when it is released) and 
> upgrade your import paths to github.com/apache/calcite-avatica-go/v4
> This change is only applicable for Apache Phoenix and will not affect 
> HSQLDB.
> 
> [1] https://issues.apache.org/jira/browse/PHOENIX-947
> [2] https://issues.apache.org/jira/browse/CALCITE-2763
> [3] https://golang.org/pkg/database/sql/#NullString