Posted to dev@spark.apache.org by Marco Gaido <ma...@gmail.com> on 2018/10/25 13:48:46 UTC

[DISCUSS] Support decimals with negative scale in decimal operation

Hi all,

a bit more than one month ago, I sent a proposal for properly handling
decimals with negative scales in our operations. This is a long-standing
problem in our codebase, as we derived our rules from Hive and SQL Server,
where negative scales are forbidden, while in Spark they are not.

The discussion has been stale for a while now, and there have been no more
comments on the design doc:
https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm

So I am writing this e-mail to check whether there are more comments on it, or
whether we can go ahead with the PR.

Thanks,
Marco

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Marco Gaido <ma...@gmail.com>.
Jörn, could you explain your proposal a bit more, please? We are not
modifying the existing decimal datatype: this is how it works now. If you
check the PR, the only difference is how we compute the result of the
division operation. The discussion about precision and scale is about
whether we should limit them more than we do now. Currently we support any
scale <= precision and any precision in the range [1, 38].
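
To make the convention concrete, here is a small plain-JVM sketch (just
java.math.BigDecimal, nothing Spark-specific) showing how a negative scale
naturally arises for a value like 1E+10; Spark's DecimalType follows the same
precision/scale convention:

    import java.math.BigDecimal

    // 1E+10 is stored as unscaled value 1 with scale -10,
    // i.e. precision 1 and scale -10.
    val d = new BigDecimal("1E+10")
    println(d.unscaledValue)  // 1
    println(d.precision)      // 1   (number of digits of the unscaled value)
    println(d.scale)          // -10 (negative: the value is 1 * 10^10)
    println(d.toPlainString)  // 10000000000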

On Wed, Jan 9, 2019 at 9:13 AM Jörn Franke <jo...@gmail.com> wrote:

> Maybe it is better to introduce a new datatype that supports negative
> scale; otherwise, the migration and testing effort for organizations
> running Spark applications becomes too large. Of course, the current decimal
> will be kept as it is.
>
> On Jan 7, 2019, at 3:08 PM, Marco Gaido <ma...@gmail.com> wrote:
>
> In general we can say that some datasources allow them, others fail. At
> the moment, we are doing no casting before writing (so we can state so in
> the doc). But since there is ongoing discussion for DSv2, we can maybe add
> a flag/interface there for "negative scale intolerant" DS and try to cast
> before writing to them. What do you think about this?
>
> On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:
>
>> AFAIK parquet spec says decimal scale can't be negative. If we want to
>> officially support negative-scale decimal, we should clearly define the
>> behavior when writing negative-scale decimals to parquet and other data
>> sources. The most straightforward way is to fail for this case, but maybe
>> we can do something better, like casting decimal(1, -20) to decimal(20, 0)
>> before writing.
>>
>> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com>
>> wrote:
>>
>>> Hi Wenchen,
>>>
>>> thanks for your email. I agree adding doc for decimal type, but I am not
>>> sure what you mean speaking of the behavior when writing: we are not
>>> performing any automatic casting before writing; if we want to do that, we
>>> need a design about it I think.
>>>
>>> I am not sure if it makes sense to set a min for it. That would break
>>> backward compatibility (for very weird use case), so I wouldn't do that.
>>>
>>> Thanks,
>>> Marco
>>>
>>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:
>>>
>>>> I think we need to do this for backward compatibility, and according to
>>>> the discussion in the doc, SQL standard allows negative scale.
>>>>
>>>> To do this, I think the PR should also include a doc for the decimal
>>>> type, like the definition of precision and scale(this one
>>>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>>>> looks pretty good), and the result type of decimal operations, and the
>>>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>>>> decimal(20, 0) before writing).
>>>>
>>>> Another question is, shall we set a min scale? e.g. shall we allow
>>>> decimal(1, -10000000)?
>>>>
>>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> a bit more than one month ago, I sent a proposal for handling properly
>>>>> decimals with negative scales in our operations. This is a long standing
>>>>> problem in our codebase as we derived our rules from Hive and SQLServer
>>>>> where negative scales are forbidden, while in Spark they are not.
>>>>>
>>>>> The discussion has been stale for a while now. No more comments on the
>>>>> design doc:
>>>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>>>> .
>>>>>
>>>>> So I am writing this e-mail in order to check whether there are more
>>>>> comments on it or we can go ahead with the PR.
>>>>>
>>>>> Thanks,
>>>>> Marco
>>>>>
>>>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Jörn Franke <jo...@gmail.com>.
Maybe it is better to introduce a new datatype that supports negative scale; otherwise, the migration and testing effort for organizations running Spark applications becomes too large. Of course, the current decimal will be kept as it is.

> On Jan 7, 2019, at 3:08 PM, Marco Gaido <ma...@gmail.com> wrote:
> 
> In general we can say that some datasources allow them, others fail. At the moment, we are doing no casting before writing (so we can state so in the doc). But since there is ongoing discussion for DSv2, we can maybe add a flag/interface there for "negative scale intolerant" DS and try to cast before writing to them. What do you think about this?
> 
>> On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:
>> AFAIK parquet spec says decimal scale can't be negative. If we want to officially support negative-scale decimal, we should clearly define the behavior when writing negative-scale decimals to parquet and other data sources. The most straightforward way is to fail for this case, but maybe we can do something better, like casting decimal(1, -20) to decimal(20, 0) before writing.
>> 
>>> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com> wrote:
>>> Hi Wenchen,
>>> 
>>> thanks for your email. I agree adding doc for decimal type, but I am not sure what you mean speaking of the behavior when writing: we are not performing any automatic casting before writing; if we want to do that, we need a design about it I think.
>>> 
>>> I am not sure if it makes sense to set a min for it. That would break backward compatibility (for very weird use case), so I wouldn't do that.
>>> 
>>> Thanks,
>>> Marco
>>> 
>>>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:
>>>> I think we need to do this for backward compatibility, and according to the discussion in the doc, SQL standard allows negative scale.
>>>> 
>>>> To do this, I think the PR should also include a doc for the decimal type, like the definition of precision and scale(this one looks pretty good), and the result type of decimal operations, and the behavior when writing out decimals(e.g. we can cast decimal(1, -20) to decimal(20, 0) before writing).
>>>> 
>>>> Another question is, shall we set a min scale? e.g. shall we allow decimal(1, -10000000)?
>>>> 
>>>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com> wrote:
>>>>> Hi all,
>>>>> 
>>>>> a bit more than one month ago, I sent a proposal for handling properly decimals with negative scales in our operations. This is a long standing problem in our codebase as we derived our rules from Hive and SQLServer where negative scales are forbidden, while in Spark they are not.
>>>>> 
>>>>> The discussion has been stale for a while now. No more comments on the design doc: https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm.
>>>>> 
>>>>> So I am writing this e-mail in order to check whether there are more comments on it or we can go ahead with the PR.
>>>>> 
>>>>> Thanks,
>>>>> Marco

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Marco Gaido <ma...@gmail.com>.
Oracle does the same: "The *scale* must be less than or equal to the
precision." (see
https://docs.oracle.com/javadb/10.6.2.1/ref/rrefsqlj15260.html).

On Wed, Jan 9, 2019 at 5:31 AM Wenchen Fan <cl...@gmail.com> wrote:

> Some more thoughts. If we support unlimited negative scale, why can't we
> support unlimited positive scale? e.g. 0.0001 can be decimal(1, 4) instead
> of (4, 4). I think we need more references here: how other databases deal
> with decimal type and parse decimal literals?
>
> On Mon, Jan 7, 2019 at 10:36 PM Wenchen Fan <cl...@gmail.com> wrote:
>
>> I'm OK with it, i.e. fail the write if there are negative-scale decimals
>> (we need to document it though). We can improve it later in data source v2.
>>
>> On Mon, Jan 7, 2019 at 10:09 PM Marco Gaido <ma...@gmail.com>
>> wrote:
>>
>>> In general we can say that some datasources allow them, others fail. At
>>> the moment, we are doing no casting before writing (so we can state so in
>>> the doc). But since there is ongoing discussion for DSv2, we can maybe add
>>> a flag/interface there for "negative scale intolerant" DS and try to cast
>>> before writing to them. What do you think about this?
>>>
>>> On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:
>>>
>>>> AFAIK parquet spec says decimal scale can't be negative. If we want to
>>>> officially support negative-scale decimal, we should clearly define the
>>>> behavior when writing negative-scale decimals to parquet and other data
>>>> sources. The most straightforward way is to fail for this case, but maybe
>>>> we can do something better, like casting decimal(1, -20) to decimal(20, 0)
>>>> before writing.
>>>>
>>>> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Wenchen,
>>>>>
>>>>> thanks for your email. I agree adding doc for decimal type, but I am
>>>>> not sure what you mean speaking of the behavior when writing: we are not
>>>>> performing any automatic casting before writing; if we want to do that, we
>>>>> need a design about it I think.
>>>>>
>>>>> I am not sure if it makes sense to set a min for it. That would break
>>>>> backward compatibility (for very weird use case), so I wouldn't do that.
>>>>>
>>>>> Thanks,
>>>>> Marco
>>>>>
>>>>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cloud0fan@gmail.com> wrote:
>>>>>
>>>>>> I think we need to do this for backward compatibility, and according
>>>>>> to the discussion in the doc, SQL standard allows negative scale.
>>>>>>
>>>>>> To do this, I think the PR should also include a doc for the decimal
>>>>>> type, like the definition of precision and scale(this one
>>>>>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>>>>>> looks pretty good), and the result type of decimal operations, and the
>>>>>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>>>>>> decimal(20, 0) before writing).
>>>>>>
>>>>>> Another question is, shall we set a min scale? e.g. shall we allow
>>>>>> decimal(1, -10000000)?
>>>>>>
>>>>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> a bit more than one month ago, I sent a proposal for handling
>>>>>>> properly decimals with negative scales in our operations. This is a long
>>>>>>> standing problem in our codebase as we derived our rules from Hive and
>>>>>>> SQLServer where negative scales are forbidden, while in Spark they are not.
>>>>>>>
>>>>>>> The discussion has been stale for a while now. No more comments on
>>>>>>> the design doc:
>>>>>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>>>>>> .
>>>>>>>
>>>>>>> So I am writing this e-mail in order to check whether there are more
>>>>>>> comments on it or we can go ahead with the PR.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marco
>>>>>>>
>>>>>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Wenchen Fan <cl...@gmail.com>.
Some more thoughts. If we support unlimited negative scale, why can't we
support unlimited positive scale? E.g., 0.0001 can be decimal(1, 4) instead
of decimal(4, 4). I think we need more references here: how do other
databases deal with the decimal type and parse decimal literals?
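
For reference, java.math.BigDecimal already gives such a literal the "minimal"
representation described above (a sketch just to show the convention; whether
Spark's parser should follow it is exactly the open question):

    import java.math.BigDecimal

    val d = new BigDecimal("0.0001")
    println(d.unscaledValue)  // 1
    println(d.precision)      // 1  -> decimal(1, 4), not decimal(4, 4)
    println(d.scale)          // 4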

On Mon, Jan 7, 2019 at 10:36 PM Wenchen Fan <cl...@gmail.com> wrote:

> I'm OK with it, i.e. fail the write if there are negative-scale decimals
> (we need to document it though). We can improve it later in data source v2.
>
> On Mon, Jan 7, 2019 at 10:09 PM Marco Gaido <ma...@gmail.com>
> wrote:
>
>> In general we can say that some datasources allow them, others fail. At
>> the moment, we are doing no casting before writing (so we can state so in
>> the doc). But since there is ongoing discussion for DSv2, we can maybe add
>> a flag/interface there for "negative scale intolerant" DS and try to cast
>> before writing to them. What do you think about this?
>>
>> On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:
>>
>>> AFAIK parquet spec says decimal scale can't be negative. If we want to
>>> officially support negative-scale decimal, we should clearly define the
>>> behavior when writing negative-scale decimals to parquet and other data
>>> sources. The most straightforward way is to fail for this case, but maybe
>>> we can do something better, like casting decimal(1, -20) to decimal(20, 0)
>>> before writing.
>>>
>>> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com>
>>> wrote:
>>>
>>>> Hi Wenchen,
>>>>
>>>> thanks for your email. I agree adding doc for decimal type, but I am
>>>> not sure what you mean speaking of the behavior when writing: we are not
>>>> performing any automatic casting before writing; if we want to do that, we
>>>> need a design about it I think.
>>>>
>>>> I am not sure if it makes sense to set a min for it. That would break
>>>> backward compatibility (for very weird use case), so I wouldn't do that.
>>>>
>>>> Thanks,
>>>> Marco
>>>>
>>>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cloud0fan@gmail.com> wrote:
>>>>
>>>>> I think we need to do this for backward compatibility, and according
>>>>> to the discussion in the doc, SQL standard allows negative scale.
>>>>>
>>>>> To do this, I think the PR should also include a doc for the decimal
>>>>> type, like the definition of precision and scale(this one
>>>>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>>>>> looks pretty good), and the result type of decimal operations, and the
>>>>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>>>>> decimal(20, 0) before writing).
>>>>>
>>>>> Another question is, shall we set a min scale? e.g. shall we allow
>>>>> decimal(1, -10000000)?
>>>>>
>>>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> a bit more than one month ago, I sent a proposal for handling
>>>>>> properly decimals with negative scales in our operations. This is a long
>>>>>> standing problem in our codebase as we derived our rules from Hive and
>>>>>> SQLServer where negative scales are forbidden, while in Spark they are not.
>>>>>>
>>>>>> The discussion has been stale for a while now. No more comments on
>>>>>> the design doc:
>>>>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>>>>> .
>>>>>>
>>>>>> So I am writing this e-mail in order to check whether there are more
>>>>>> comments on it or we can go ahead with the PR.
>>>>>>
>>>>>> Thanks,
>>>>>> Marco
>>>>>>
>>>>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Wenchen Fan <cl...@gmail.com>.
I'm OK with it, i.e. fail the write if there are negative-scale decimals
(we need to document it though). We can improve it later in data source v2.
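
To make the "fail the write" option concrete, here is a hedged sketch of what
such a pre-write check could look like (illustrative only: assertNoNegativeScale
is a hypothetical helper, not an existing Spark API, and it only walks top-level
and nested struct fields):

    import org.apache.spark.sql.types.{DecimalType, StructType}

    // Hypothetical validation: reject schemas containing negative-scale decimals
    // before handing the data to a source that cannot store them.
    def assertNoNegativeScale(schema: StructType): Unit =
      schema.fields.foreach { field =>
        field.dataType match {
          case d: DecimalType if d.scale < 0 =>
            throw new UnsupportedOperationException(
              s"Column '${field.name}' has type $d; this data source does not support negative scale")
          case nested: StructType => assertNoNegativeScale(nested)
          case _ => // other types (and non-negative scales) pass through
        }
      }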

On Mon, Jan 7, 2019 at 10:09 PM Marco Gaido <ma...@gmail.com> wrote:

> In general we can say that some datasources allow them, others fail. At
> the moment, we are doing no casting before writing (so we can state so in
> the doc). But since there is ongoing discussion for DSv2, we can maybe add
> a flag/interface there for "negative scale intolerant" DS and try to cast
> before writing to them. What do you think about this?
>
> On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:
>
>> AFAIK parquet spec says decimal scale can't be negative. If we want to
>> officially support negative-scale decimal, we should clearly define the
>> behavior when writing negative-scale decimals to parquet and other data
>> sources. The most straightforward way is to fail for this case, but maybe
>> we can do something better, like casting decimal(1, -20) to decimal(20, 0)
>> before writing.
>>
>> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com>
>> wrote:
>>
>>> Hi Wenchen,
>>>
>>> thanks for your email. I agree adding doc for decimal type, but I am not
>>> sure what you mean speaking of the behavior when writing: we are not
>>> performing any automatic casting before writing; if we want to do that, we
>>> need a design about it I think.
>>>
>>> I am not sure if it makes sense to set a min for it. That would break
>>> backward compatibility (for very weird use case), so I wouldn't do that.
>>>
>>> Thanks,
>>> Marco
>>>
>>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:
>>>
>>>> I think we need to do this for backward compatibility, and according to
>>>> the discussion in the doc, SQL standard allows negative scale.
>>>>
>>>> To do this, I think the PR should also include a doc for the decimal
>>>> type, like the definition of precision and scale(this one
>>>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>>>> looks pretty good), and the result type of decimal operations, and the
>>>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>>>> decimal(20, 0) before writing).
>>>>
>>>> Another question is, shall we set a min scale? e.g. shall we allow
>>>> decimal(1, -10000000)?
>>>>
>>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> a bit more than one month ago, I sent a proposal for handling properly
>>>>> decimals with negative scales in our operations. This is a long standing
>>>>> problem in our codebase as we derived our rules from Hive and SQLServer
>>>>> where negative scales are forbidden, while in Spark they are not.
>>>>>
>>>>> The discussion has been stale for a while now. No more comments on the
>>>>> design doc:
>>>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>>>> .
>>>>>
>>>>> So I am writing this e-mail in order to check whether there are more
>>>>> comments on it or we can go ahead with the PR.
>>>>>
>>>>> Thanks,
>>>>> Marco
>>>>>
>>>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Marco Gaido <ma...@gmail.com>.
In general, we can say that some datasources allow them while others fail. At
the moment, we are doing no casting before writing (so we can state that in
the doc). But since there is an ongoing discussion for DSv2, we could maybe
add a flag/interface there for "negative scale intolerant" datasources and try
to cast before writing to them. What do you think about this?

On Mon, Jan 7, 2019 at 3:03 PM Wenchen Fan <cl...@gmail.com> wrote:

> AFAIK parquet spec says decimal scale can't be negative. If we want to
> officially support negative-scale decimal, we should clearly define the
> behavior when writing negative-scale decimals to parquet and other data
> sources. The most straightforward way is to fail for this case, but maybe
> we can do something better, like casting decimal(1, -20) to decimal(20, 0)
> before writing.
>
> On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com> wrote:
>
>> Hi Wenchen,
>>
>> thanks for your email. I agree adding doc for decimal type, but I am not
>> sure what you mean speaking of the behavior when writing: we are not
>> performing any automatic casting before writing; if we want to do that, we
>> need a design about it I think.
>>
>> I am not sure if it makes sense to set a min for it. That would break
>> backward compatibility (for very weird use case), so I wouldn't do that.
>>
>> Thanks,
>> Marco
>>
>> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:
>>
>>> I think we need to do this for backward compatibility, and according to
>>> the discussion in the doc, SQL standard allows negative scale.
>>>
>>> To do this, I think the PR should also include a doc for the decimal
>>> type, like the definition of precision and scale(this one
>>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>>> looks pretty good), and the result type of decimal operations, and the
>>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>>> decimal(20, 0) before writing).
>>>
>>> Another question is, shall we set a min scale? e.g. shall we allow
>>> decimal(1, -10000000)?
>>>
>>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> a bit more than one month ago, I sent a proposal for handling properly
>>>> decimals with negative scales in our operations. This is a long standing
>>>> problem in our codebase as we derived our rules from Hive and SQLServer
>>>> where negative scales are forbidden, while in Spark they are not.
>>>>
>>>> The discussion has been stale for a while now. No more comments on the
>>>> design doc:
>>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>>> .
>>>>
>>>> So I am writing this e-mail in order to check whether there are more
>>>> comments on it or we can go ahead with the PR.
>>>>
>>>> Thanks,
>>>> Marco
>>>>
>>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Wenchen Fan <cl...@gmail.com>.
AFAIK the Parquet spec says a decimal's scale can't be negative. If we want to
officially support negative-scale decimals, we should clearly define the
behavior when writing negative-scale decimals to Parquet and other data
sources. The most straightforward way is to fail in this case, but maybe
we can do something better, like casting decimal(1, -20) to decimal(20, 0)
before writing.
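
To illustrate the "cast before writing" option with existing public APIs (a
sketch only: the column name "amount" and the exact target type are
assumptions, not agreed behavior), note that a decimal(1, -20) value can need
up to precision + |scale| = 21 digits once rewritten at scale 0:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col
    import org.apache.spark.sql.types.DecimalType

    // Sketch: widen a negative-scale column to a scale-0 decimal before writing
    // to a source (like Parquet) whose spec requires a non-negative scale.
    def widenBeforeWrite(df: DataFrame): DataFrame =
      df.withColumn("amount", col("amount").cast(DecimalType(21, 0)))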

On Mon, Jan 7, 2019 at 9:32 PM Marco Gaido <ma...@gmail.com> wrote:

> Hi Wenchen,
>
> thanks for your email. I agree adding doc for decimal type, but I am not
> sure what you mean speaking of the behavior when writing: we are not
> performing any automatic casting before writing; if we want to do that, we
> need a design about it I think.
>
> I am not sure if it makes sense to set a min for it. That would break
> backward compatibility (for very weird use case), so I wouldn't do that.
>
> Thanks,
> Marco
>
> On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:
>
>> I think we need to do this for backward compatibility, and according to
>> the discussion in the doc, SQL standard allows negative scale.
>>
>> To do this, I think the PR should also include a doc for the decimal
>> type, like the definition of precision and scale(this one
>> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
>> looks pretty good), and the result type of decimal operations, and the
>> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
>> decimal(20, 0) before writing).
>>
>> Another question is, shall we set a min scale? e.g. shall we allow
>> decimal(1, -10000000)?
>>
>> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> a bit more than one month ago, I sent a proposal for handling properly
>>> decimals with negative scales in our operations. This is a long standing
>>> problem in our codebase as we derived our rules from Hive and SQLServer
>>> where negative scales are forbidden, while in Spark they are not.
>>>
>>> The discussion has been stale for a while now. No more comments on the
>>> design doc:
>>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>>> .
>>>
>>> So I am writing this e-mail in order to check whether there are more
>>> comments on it or we can go ahead with the PR.
>>>
>>> Thanks,
>>> Marco
>>>
>>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Marco Gaido <ma...@gmail.com>.
Hi Wenchen,

thanks for your email. I agree on adding a doc for the decimal type, but I am
not sure what you mean regarding the behavior when writing: we are not
performing any automatic casting before writing; if we want to do that, I
think we need a design for it.

I am not sure it makes sense to set a min for it. That would break backward
compatibility (for some very weird use cases), so I wouldn't do that.

Thanks,
Marco

On Mon, Jan 7, 2019 at 5:53 AM Wenchen Fan <cl...@gmail.com> wrote:

> I think we need to do this for backward compatibility, and according to
> the discussion in the doc, SQL standard allows negative scale.
>
> To do this, I think the PR should also include a doc for the decimal type,
> like the definition of precision and scale(this one
> <https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
> looks pretty good), and the result type of decimal operations, and the
> behavior when writing out decimals(e.g. we can cast decimal(1, -20) to
> decimal(20, 0) before writing).
>
> Another question is, shall we set a min scale? e.g. shall we allow
> decimal(1, -10000000)?
>
> On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> a bit more than one month ago, I sent a proposal for handling properly
>> decimals with negative scales in our operations. This is a long standing
>> problem in our codebase as we derived our rules from Hive and SQLServer
>> where negative scales are forbidden, while in Spark they are not.
>>
>> The discussion has been stale for a while now. No more comments on the
>> design doc:
>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
>> .
>>
>> So I am writing this e-mail in order to check whether there are more
>> comments on it or we can go ahead with the PR.
>>
>> Thanks,
>> Marco
>>
>

Re: [DISCUSS] Support decimals with negative scale in decimal operation

Posted by Wenchen Fan <cl...@gmail.com>.
I think we need to do this for backward compatibility, and according to the
discussion in the doc, the SQL standard allows negative scale.

To do this, I think the PR should also include a doc for the decimal type:
the definition of precision and scale (this one
<https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale>
looks pretty good), the result type of decimal operations, and the
behavior when writing out decimals (e.g. we can cast decimal(1, -20) to
decimal(20, 0) before writing).

Another question is: shall we set a min scale? E.g., shall we allow
decimal(1, -10000000)?
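
As a small sketch of what that cast means at the BigDecimal level (plain JVM,
nothing Spark-specific): rescaling 1E+20 from scale -20 to scale 0 is exact,
but the unscaled value then needs precision + |scale| digits, so the scale-0
target type has to be at least that wide:

    import java.math.{BigDecimal, BigInteger}

    // decimal(1, -20): unscaled value 1 at scale -20, i.e. the value 1E+20
    val negScale = new BigDecimal(BigInteger.ONE, -20)
    println(negScale.precision)   // 1
    println(negScale.scale)       // -20

    // The same numeric value rewritten at scale 0 (exact, no rounding needed):
    val scaleZero = negScale.setScale(0)
    println(scaleZero.precision)  // 21 -> it takes a decimal(21, 0) to hold it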

On Thu, Oct 25, 2018 at 9:49 PM Marco Gaido <ma...@gmail.com> wrote:

> Hi all,
>
> a bit more than one month ago, I sent a proposal for handling properly
> decimals with negative scales in our operations. This is a long standing
> problem in our codebase as we derived our rules from Hive and SQLServer
> where negative scales are forbidden, while in Spark they are not.
>
> The discussion has been stale for a while now. No more comments on the
> design doc:
> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit#heading=h.x7062zmkubwm
> .
>
> So I am writing this e-mail in order to check whether there are more
> comments on it or we can go ahead with the PR.
>
> Thanks,
> Marco
>