Posted to dev@beam.apache.org by Vachan Shetty <va...@google.com> on 2021/03/18 18:47:21 UTC

[Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Hello, I am currently trying to add support for BigQuery's new BIGNUMERIC
datatype [1] in Beam's BigQueryIO. I am currently following the steps that
were used for adding the NUMERIC datatype [2]. AFAICT Beam's DECIMAL is the
most appropriate datatype to map to BIGNUMERIC in BQ. However, the Beam
DECIMAL datatype is already mapped to NUMERIC in BQ [2, 3]. Given this,
should I simply map all Beam DECIMAL to BQ BIGNUMERIC? Or should this
conversion be done based on other information?

[1]
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types
[2] https://issues.apache.org/jira/browse/BEAM-4417
[3]
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L188

Re: [Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Posted by Reuven Lax <re...@google.com>.
It does not, which might've been a mistake. The user can pass in an
arbitrary BigDecimal object, and we will encode whatever scale that object
carries. This means that for DECIMAL, each record encodes its own scale.
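
To make the per-record scale concrete, here is a minimal Java sketch. The
class and method names are made up for illustration and are not part of Beam;
only java.math.BigDecimal is real. It shows that each BigDecimal value carries
its own scale, so two numerically equal values can still differ in
representation.

```java
import java.math.BigDecimal;

// Hypothetical demo class; not part of Beam.
class ScalePerValue {
    // Reports the precision (digits in the unscaled value) and the scale
    // (digits to the right of the decimal point) stored in this one value.
    static String describe(BigDecimal value) {
        return "precision=" + value.precision() + ", scale=" + value.scale();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.5");
        BigDecimal b = new BigDecimal("1.500");
        System.out.println(describe(a)); // precision=2, scale=1
        System.out.println(describe(b)); // precision=4, scale=3
        // Same numeric value, different scale: compareTo treats them as
        // equal, while equals does not.
        System.out.println(a.compareTo(b) == 0); // true
        System.out.println(a.equals(b));         // false
    }
}
```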

On Thu, Mar 18, 2021 at 12:33 PM Mingyu Zhong <my...@google.com> wrote:

> Just wanted to clarify: BigQuery BIGNUMERIC type costs more than NUMERIC
> type, so if NUMERIC is sufficient, the users likely won't want to switch to
> BIGNUMERIC.
>
> Does Beam DECIMAL datatype contain the precision/scale parameters in the
> metadata? If so, can we use those parameters to determine the mapped type?
>
> --
> Thanks,
>
> Mingyu
>

Re: [Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Posted by Kenneth Knowles <kl...@google.com>.
Would it make sense to separate BQ -> Beam and Beam -> BQ mappings? Looking
at the code I can't tell if this is already done. There's no particular
reason to expect every type to have an equivalent with a bijection.

If the user wants NUMERIC for cost, and is confident their data would fit,
we can let them request it and fail on individual bad values, maybe?
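
A rough Java sketch of what this could look like, with the two directions
kept separate. Everything here is hypothetical: none of these names exist in
BigQueryIO, and the digit limits are my reading of the BigQuery decimal type
documentation (NUMERIC: 29 integer and 9 fractional digits; BIGNUMERIC: 38
fractional digits and a little over 38 integer digits).

```java
import java.math.BigDecimal;

// Hypothetical sketch; none of these names exist in BigQueryIO.
class DirectionalMapping {
    enum BqType { NUMERIC, BIGNUMERIC }

    // BQ -> Beam: both BigQuery decimal types can be read into Beam DECIMAL,
    // so this direction cannot be inverted on its own.
    static String beamTypeFor(BqType bqType) {
        return "DECIMAL";
    }

    // Beam -> BQ: the user names the target type and each value is checked.
    // Assumed limits: NUMERIC allows 29 integer and 9 fractional digits;
    // BIGNUMERIC allows 38 fractional digits and at least 38 integer digits
    // (this check conservatively rejects its partial 39th integer digit).
    static BigDecimal checkFits(BigDecimal value, BqType requested) {
        int maxScale = requested == BqType.NUMERIC ? 9 : 38;
        int maxIntegerDigits = requested == BqType.NUMERIC ? 29 : 38;
        // Trailing zeros do not count against the fractional-digit limit.
        BigDecimal v = value.stripTrailingZeros();
        int integerDigits = v.precision() - v.scale();
        if (v.scale() > maxScale || integerDigits > maxIntegerDigits) {
            throw new IllegalArgumentException(value + " does not fit in " + requested);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(checkFits(new BigDecimal("123.456"), BqType.NUMERIC));
        try {
            // Ten fractional digits: one more than NUMERIC is assumed to allow.
            checkFits(new BigDecimal("0.0000000001"), BqType.NUMERIC);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape, a user who asks for NUMERIC only pays for failures on the
individual values that do not fit, and the same value can still be written if
they request BIGNUMERIC instead.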

Kenn

On Thu, Mar 18, 2021 at 12:33 PM Mingyu Zhong <my...@google.com> wrote:

> Just wanted to clarify: BigQuery BIGNUMERIC type costs more than NUMERIC
> type, so if NUMERIC is sufficient, the users likely won't want to switch to
> BIGNUMERIC.
>
> Does Beam DECIMAL datatype contain the precision/scale parameters in the
> metadata? If so, can we use those parameters to determine the mapped type?
>
> --
> Thanks,
>
> Mingyu
>

Re: [Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Posted by Mingyu Zhong <my...@google.com>.
Just wanted to clarify: BigQuery BIGNUMERIC type costs more than NUMERIC
type, so if NUMERIC is sufficient, the users likely won't want to switch to
BIGNUMERIC.

Does Beam DECIMAL datatype contain the precision/scale parameters in the
metadata? If so, can we use those parameters to determine the mapped type?
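
To illustrate the idea, here is a hedged Java sketch of how declared
precision/scale parameters could pick the target type, if they were
available. The class, enum, and method are hypothetical, and the bounds are
my understanding of BigQuery's parameterized decimal types (NUMERIC: scale up
to 9 and up to 29 integer digits; BIGNUMERIC: scale up to 38 and up to 38
integer digits).

```java
// Hypothetical sketch; Beam exposes no such chooser.
class DecimalTypeChooser {
    enum BqDecimalType { NUMERIC, BIGNUMERIC }

    // Picks the cheaper NUMERIC whenever the declared parameters fit it,
    // and falls back to BIGNUMERIC otherwise.
    static BqDecimalType choose(int precision, int scale) {
        if (precision < 1 || scale < 0 || scale > precision) {
            throw new IllegalArgumentException("invalid precision/scale");
        }
        int integerDigits = precision - scale;
        if (scale <= 9 && integerDigits <= 29) {
            return BqDecimalType.NUMERIC;
        }
        if (scale <= 38 && integerDigits <= 38) {
            return BqDecimalType.BIGNUMERIC;
        }
        throw new IllegalArgumentException("fits neither NUMERIC nor BIGNUMERIC");
    }

    public static void main(String[] args) {
        System.out.println(choose(38, 9));  // NUMERIC
        System.out.println(choose(20, 10)); // BIGNUMERIC
    }
}
```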

On Thu, Mar 18, 2021 at 12:08 PM Brian Hulette <bh...@google.com> wrote:

> Hi Vachan,
> I don't think Beam DECIMAL is really a great mapping for either BigQuery's
> NUMERIC or BIGNUMERIC type. Beam's DECIMAL represents arbitrary precision
> decimals [1] to map well to Java's BigDecimal class [2].
>
> Maybe we should add a fixed-precision decimal logical type [3], then have
> specific instances of it with the appropriate precision that map to NUMERIC
> and BIGNUMERIC. We could also shunt Beam DECIMAL to BIGNUMERIC for
> compatibility.
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L424
> [2] https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
> [3]
> https://github.com/apache/beam/tree/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes
>

-- 
Thanks,

Mingyu

Re: [Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Posted by Brian Hulette <bh...@google.com>.
Hi Vachan,
I don't think Beam DECIMAL is really a great mapping for either BigQuery's
NUMERIC or BIGNUMERIC type. Beam's DECIMAL represents arbitrary-precision
decimals [1], so that it maps well to Java's BigDecimal class [2].

Maybe we should add a fixed-precision decimal logical type [3], then have
specific instances of it with the appropriate precision that map to NUMERIC
and BIGNUMERIC. We could also shunt Beam DECIMAL to BIGNUMERIC for
compatibility.
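
As a very rough sketch of the idea (plain Java only, not an implementation of
Beam's logical type interface; the class name, constants, and method are all
hypothetical), a fixed-precision type would carry precision and scale on the
type itself and bring every value to that one scale:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch of a fixed-precision decimal type; not Beam API.
final class FixedDecimal {
    // Assumed parameters: NUMERIC as (38, 9); BIGNUMERIC rounded down to
    // (76, 38), ignoring its partial extra digit of precision.
    static final FixedDecimal NUMERIC_LIKE = new FixedDecimal(38, 9);
    static final FixedDecimal BIGNUMERIC_LIKE = new FixedDecimal(76, 38);

    final int precision;
    final int scale;

    private FixedDecimal(int precision, int scale) {
        this.precision = precision;
        this.scale = scale;
    }

    // Rescales the value to the type's scale so that no record needs to
    // carry its own. RoundingMode.UNNECESSARY makes setScale throw
    // ArithmeticException instead of silently dropping digits.
    BigDecimal normalize(BigDecimal value) {
        BigDecimal fixed = value.setScale(scale, RoundingMode.UNNECESSARY);
        if (fixed.precision() > precision) {
            throw new ArithmeticException(value + " exceeds precision " + precision);
        }
        return fixed;
    }

    public static void main(String[] args) {
        BigDecimal v = NUMERIC_LIKE.normalize(new BigDecimal("1.5"));
        System.out.println(v.toPlainString()); // 1.500000000
    }
}
```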

[1]
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L424
[2] https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
[3]
https://github.com/apache/beam/tree/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes

On Thu, Mar 18, 2021 at 12:00 PM Vachan Shetty <va...@google.com> wrote:

> Hello, I am currently trying to add support for BigQuery's new BIGNUMERIC
> datatype [1] in Beam's BigQueryIO. I am currently following the steps that
> were used for adding the NUMERIC datatype [2]. AFAICT Beam's DECIMAL is
> the most appropriate datatype to map to BIGNUMERIC in BQ. However, the
> Beam DECIMAL datatype is already mapped to NUMERIC in BQ [2, 3]. Given
> this, should I simply map all Beam DECIMAL to BQ BIGNUMERIC? Or should this
> conversion be done based on other information?
>
> [1]
> https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types
> [2] https://issues.apache.org/jira/browse/BEAM-4417
> [3]
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L188
>

Re: [Question] What is the best Beam datatype to map to BigQuery's BIGNUMERIC?

Posted by Reuven Lax <re...@google.com>.
Good question - Beam DECIMAL is arbitrary precision, so is probably a
better fit for BIGNUMERIC.

On Thu, Mar 18, 2021 at 12:00 PM Vachan Shetty <va...@google.com> wrote:

> Hello, I am currently trying to add support for BigQuery's new BIGNUMERIC
> datatype [1] in Beam's BigQueryIO. I am currently following the steps that
> were used for adding the NUMERIC datatype [2]. AFAICT Beam's DECIMAL is
> the most appropriate datatype to map to BIGNUMERIC in BQ. However, the
> Beam DECIMAL datatype is already mapped to NUMERIC in BQ [2, 3]. Given
> this, should I simply map all Beam DECIMAL to BQ BIGNUMERIC? Or should this
> conversion be done based on other information?
>
> [1]
> https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types
> [2] https://issues.apache.org/jira/browse/BEAM-4417
> [3]
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L188
>