You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/22 17:49:10 UTC

[GitHub] [spark] MaxGekk opened a new pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

MaxGekk opened a new pull request #31614:
URL: https://github.com/apache/spark/pull/31614


   ### What changes were proposed in this pull request?
   In the PR, I propose to extend Catalyst's type system by two new types that conform to the SQL standard (see SQL:2016, section 4.6.3):
   - `DayTimeIntervalType` represents the day-time interval type,
   - `YearMonthIntervalType` for SQL year-month interval type.
   
   ### Why are the changes needed?
   Spark as it is today supports an INTERVAL datatype. However this type is of very limited use. Existing interval values cannot be compared with any other interval values, or persisted to storage. Spark users request to either implement new or expand existing built-in functions which produce some sort of measures for elapsed time, such as `DATEDIFF()`. Rather than work around the edges to fill the potholes of the existing INTERVAL data type, I would like to propose to deliver a proper ANSI compliant INTERVAL type that can be introduced with minimal incompatibility, is comparable and thus sortable, and can be persisted in tables.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   1. By checking coding style via:
   ```
   $ ./dev/scalastyle
   $ ./dev/lint-java
   ```
   2. Run the test for the default sizes:
   ```
   $ build/sbt "test:testOnly *DataTypeSuite"
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-788920967


   Also cc @gatorsmile @rxin @marmbrus @srowen since you reviewed similar PR for TimeType: https://github.com/apache/spark/pull/25678


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582089245



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents day-time intervals of the SQL standard. A day-time interval is made up
+ * of a contiguous subset of the following fields:
+ *   - SECOND, seconds within minutes and possibly fractions of a second [0..59.999999],
+ *   - MINUTE, minutes within hours [0..59],
+ *   - HOUR, hours within days [0..23],
+ *   - DAY, days in the range [0..106751991].

Review comment:
       can this be negative?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783554169


   @cloud-fan @HyukjinKwon @srielau @bart-samwel @yaooqinn Could you take a look at this PR, please.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582087795



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents day-time intervals of the SQL standard. A day-time interval is made up
+ * of a contiguous subset of the following fields:
+ *   - SECOND, seconds within minutes and possibly fractions of a second [0..59.999999],
+ *   - MINUTE, minutes within hours [0..59],
+ *   - HOUR, hours within days [0..23],
+ *   - DAY, days in the range [0..106751991].

Review comment:
       how do you get the upper limit?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582088492



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents year-month intervals of the SQL standard. A year-month interval is made up
+ * of a contiguous subset of the following fields:
+ *   - MONTH, months within years [0..11],
+ *   - YEAR, years in the range [0..178956970].

Review comment:
       ditto




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-789170628


   Thank you for pinging me, @cloud-fan !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-789424291


   @dongjoon-hyun I think we can replace `Unstable` with `Experimental` after the first milestone is completed, which should be before 3.2.0. We can remove `Experimental` after the entire project is completed and being used by 1 or more releases.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk edited a comment on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
MaxGekk edited a comment on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-788908289


   @cloud-fan @HyukjinKwon I have created the umbrella JIRA: SPARK-27790. This PR implements the first sub-task SPARK-27793.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783646663


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39936/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783619639


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39936/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783646663


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39936/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-788908289


   @cloud-fan @HyukjinKwon I have created the umbrellas JIRA: SPARK-27790. This PR implements the first sub-task SPARK-27793.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #31614:
URL: https://github.com/apache/spark/pull/31614


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-788914895


   Also cc @viirya @maropu @dongjoon-hyun 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783573138


   **[Test build #135355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135355/testReport)** for PR 31614 at commit [`c72a455`](https://github.com/apache/spark/commit/c72a455b4a741bfa57b66a8cd15df6029ed34b21).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-789424645


   thanks, merging to master!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582099033



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents day-time intervals of the SQL standard. A day-time interval is made up
+ * of a contiguous subset of the following fields:
+ *   - SECOND, seconds within minutes and possibly fractions of a second [0..59.999999],
+ *   - MINUTE, minutes within hours [0..59],
+ *   - HOUR, hours within days [0..23],
+ *   - DAY, days in the range [0..106751991].

Review comment:
       > how do you get the upper limit?
   
   From the formula below:
   ```python
   >>> 2**63/((60*60*23 + 60 * 59 + 59.999999) * 1000000)
   106751991.1685362
   ```
   
   > can this be negative?
   
   Interval value, yes. The field, no, according to SQL:2016:
   <img width="989" alt="Screenshot 2021-02-24 at 19 17 31" src="https://user-images.githubusercontent.com/1580697/109030606-ffb30a00-76d4-11eb-9ee4-3c705c6f4a8a.png">
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783600688


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39936/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783747466


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135355/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783747466


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135355/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] yaooqinn commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-789361021


   LGTM.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #31614: [SPARK-27793][SQL] Add ANSI SQL day-time and year-month interval types

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-788916172


   @MaxGekk can you mention in the PR description that, this PR only adds the two new `DataType` implementations, and there will be more PRs to completely support the new ANSI interval types?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783573138


   **[Test build #135355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135355/testReport)** for PR 31614 at commit [`c72a455`](https://github.com/apache/spark/commit/c72a455b4a741bfa57b66a8cd15df6029ed34b21).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582099033



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents day-time intervals of the SQL standard. A day-time interval is made up
+ * of a contiguous subset of the following fields:
+ *   - SECOND, seconds within minutes and possibly fractions of a second [0..59.999999],
+ *   - MINUTE, minutes within hours [0..59],
+ *   - HOUR, hours within days [0..23],
+ *   - DAY, days in the range [0..106751991].

Review comment:
       > how do you get the upper limit?
   
   From the formula below:
   ```python
   >>> 2**63/((60*60*23 + 60 * 59 + 59.999999) * 1000000)
   106751991.1685362
   ```
   even more conservative:
   ```python
   >>> 2**63 / (24 * 60 * 60 * 1000000.0)
   106751991.16730064
   ```
   
   > can this be negative?
   
   Interval value, yes. The field, no, according to SQL:2016:
   <img width="989" alt="Screenshot 2021-02-24 at 19 17 31" src="https://user-images.githubusercontent.com/1580697/109030606-ffb30a00-76d4-11eb-9ee4-3c705c6f4a8a.png">
   
   So, individual interval fields are non-negative but interval itself can have a sign, and be negative.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582136847



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents year-month intervals of the SQL standard. A year-month interval is made up
+ * of a contiguous subset of the following fields:
+ *   - MONTH, months within years [0..11],
+ *   - YEAR, years in the range [0..178956970].

Review comment:
       The same
   ```python
   >>> 2**31/12.0
   178956970.66666666
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31614:
URL: https://github.com/apache/spark/pull/31614#discussion_r582099033



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.math.Ordering
+import scala.reflect.runtime.universe.typeTag
+
+import org.apache.spark.annotation.Unstable
+
+/**
+ * The type represents day-time intervals of the SQL standard. A day-time interval is made up
+ * of a contiguous subset of the following fields:
+ *   - SECOND, seconds within minutes and possibly fractions of a second [0..59.999999],
+ *   - MINUTE, minutes within hours [0..59],
+ *   - HOUR, hours within days [0..23],
+ *   - DAY, days in the range [0..106751991].

Review comment:
       > how do you get the upper limit?
   
   From the formula below:
   ```python
   >>> 2**63/((60*60*23 + 60 * 59 + 59.999999) * 1000000)
   106751991.1685362
   ```
   even more conservative:
   ```python
   >>> 2**63 / (24 * 60 * 60 * 1000000.0)
   106751991.16730064
   ```
   
   > can this be negative?
   
   Interval value, yes. The field, no, according to SQL:2016:
   <img width="989" alt="Screenshot 2021-02-24 at 19 17 31" src="https://user-images.githubusercontent.com/1580697/109030606-ffb30a00-76d4-11eb-9ee4-3c705c6f4a8a.png">
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #31614: [SPARK-27790][SQL] Support SQL INTERVAL types

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31614:
URL: https://github.com/apache/spark/pull/31614#issuecomment-783726521


   **[Test build #135355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135355/testReport)** for PR 31614 at commit [`c72a455`](https://github.com/apache/spark/commit/c72a455b4a741bfa57b66a8cd15df6029ed34b21).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org