You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@livy.apache.org by GitBox <gi...@apache.org> on 2020/05/20 20:53:24 UTC

[GitHub] [incubator-livy] wypoon opened a new pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

wypoon opened a new pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296


   ## What changes were proposed in this pull request?
   
   The decimal SQL type is mapped to `java.math.BigDecimal`. Livy removes trailing zeros from the `BigDecimal` before storing its string representation in a `ColumnBuffer`. This causes loss of information in queries of tables with columns of decimal(precision, scale) type.
   In this change, we simply do not remove trailing zeros from the `BigDecimal`.
   
   ## How was this patch tested?
   
   Added cases to integration tests, which fail without this change and pass with it.
   Fixed `ColumnBufferTest` to account for the change in behavior.
   Also tested manually by running beeline against a Livy Thrift server with this change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632359536


   @mgaido I explained the issue in the description of [https://issues.apache.org/jira/browse/LIVY-771](url). I also just called out in comments on the code what the problem is.
   To repeat myself, if you have a column in a table of decimal(5, 2) type, e.g., then you should have values that have 2 places after the decimal point. Livy TS is returning values that do not conform to that type. It returns values such as 123.4 instead of 123.40, or 123 instead of 123.00, or 1.2E+2 instead of 120.00.
   
   beeline to HS2:
   
   ```
   > select f_decimal from hive_types_test order by f_int limit 1;
   +------------+
   | f_decimal  |
   +------------+
   | 9.40       |
   +------------+
   1 row selected (14.31 seconds)
   ```
   
   beeline to Livy TS:
   ```
   > select f_decimal from hive_types_test order by f_int limit 1;
   +------------+
   | f_decimal  |
   +------------+
   | 9.4        |
   +------------+
   1 row selected (11.544 seconds)
   ```
   
   spark-shell:
   ```
   scala> val df = spark.sql("select f_decimal from hive_types_test order by f_int")
   df: org.apache.spark.sql.DataFrame = [f_decimal: decimal(5,2)]
   
   scala> val row = df.first()
   row: org.apache.spark.sql.Row = [9.40]                                          
   
   scala> val elem = row.get(0)
   elem: Any = 9.40
   
   scala> elem.isInstanceOf[java.math.BigDecimal]
   res2: Boolean = true
   
   scala> elem == new java.math.BigDecimal("9.40")
   res3: Boolean = true
   
   scala> elem == new java.math.BigDecimal("9.4")
   res4: Boolean = false
   
   scala> new java.math.BigDecimal("9.40").stripTrailingZeros() == new java.math.BigDecimal("9.4")
   res5: Boolean = true
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632244971


   CI completed and passed. I don't know why the status did not get updated here.
   @mgaido91 @jerryshao can you please review?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429357552



##########
File path: thriftserver/session/src/test/java/org/apache/livy/thriftserver/session/ColumnBufferTest.java
##########
@@ -76,6 +76,8 @@ public void testColumnBuffer() throws Exception {
 
       Encoder<TestBean> encoder = Encoders.bean(TestBean.class);
       Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
+      // In ds, the decimal field will be a org.apache.spark.sql.types.DecimalType,

Review comment:
       I will try to rephrase this comment. This is where the `Dataset` `ds` is created and further down, `ds.schema()` is used in the construction of a `org.apache.livy.thriftserver.session.ResultSet`. What I want to point out is that the inferred schema of `ds` maps the `decimal` field in `TestBean` to `org.apache.spark.sql.types.DecimalType.SYSTEM_DEFAULT`, i.e., `DecimalType(38, 18)`. This inference is due to the `Encoder` here. I put the comment here because this is where we define the `Encoder` and create `ds`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632726650


   thanks for the explanation @wypoon. From the PR description I was confused by the sentence " Livy removes trailing zeros ... before storing its string representation ...", so I was not sure whether this was only in the visualization phase.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r443858197



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       That is an extremely contrived scenario - creating a temp view in an interactive Livy session and connecting to it in a Livy Thrift session to query the temp view. This is not a real use case.
   Here is the SQL 99 standard on data types:
   http://web.cecs.pdx.edu/~len/sql1999.pdf
   In section `4.5 Numbers`, it says
   ```
   4.5.1 Characteristics of numbers
   An exact numeric value has a precision and a scale. The precision is a positive integer that
   determines the number of significant digits in a particular radix (binary or decimal). The
   scale is a non-negative integer. A scale of 0 (zero) indicates that the number is an integer.
   For a scale of S, the exact numeric value is the integer value of the significant digits
   multiplied by 10^-S.
   ```
   
   The scale is a non-negative integer.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429364576



##########
File path: thriftserver/session/src/test/java/org/apache/livy/thriftserver/session/ColumnBufferTest.java
##########
@@ -76,6 +76,8 @@ public void testColumnBuffer() throws Exception {
 
       Encoder<TestBean> encoder = Encoders.bean(TestBean.class);
       Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
+      // In ds, the decimal field will be a org.apache.spark.sql.types.DecimalType,

Review comment:
       Further down, in the for loop iterating over the fields of `ds`, in case of the `decimal` field, I have a comment that "As decimal maps to DecimalType(38, 18), we need to change the scale before comparing". I needed to explain why "decimal maps to DecimalType(38, 18)". I could also move this comment there, but I explained my thinking behind putting it here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r428912662



##########
File path: thriftserver/server/src/test/scala/org/apache/livy/thriftserver/ThriftServerSuites.scala
##########
@@ -113,6 +117,13 @@ trait CommonThriftTests {
     assert(rsMetaData.getPrecision(16) == 10)
     assert(rsMetaData.getScale(16) == 0)
 
+    assert(resultSet.getString(17) == "123.45")
+    assert(resultSet.getBigDecimal(17) == new BigDecimal("123.45"))
+    assert(resultSet.getString(18) == "123.40")
+    assert(resultSet.getBigDecimal(18) == new BigDecimal("123.40"))

Review comment:
       Here, `resultSet` is a `java.sql.ResultSet`. Without my change `resultSet.getBigDecimal` returns a `BigDecimal` that is not equal to the `BigDecimal` in the `Row` returned by Spark SQL. That is the problem.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r443259717



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       with Livy thriftserver, you can run SQL queries on an existing Spark session. You can, for instance, create a Livy scala session, run the first two commands of my example, then create a Livy thriftsession which re-uses the scala session (by specifying its `sessionId` when connecting) and run the SQL query from my example.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429289089



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       to be coherent with Hive, shall we use `toPlainString`? can we also add a test with negative scale for this case? can you also check Spark's behavior in this case?

##########
File path: thriftserver/session/src/test/java/org/apache/livy/thriftserver/session/ColumnBufferTest.java
##########
@@ -76,6 +76,8 @@ public void testColumnBuffer() throws Exception {
 
       Encoder<TestBean> encoder = Encoders.bean(TestBean.class);
       Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
+      // In ds, the decimal field will be a org.apache.spark.sql.types.DecimalType,

Review comment:
       a weird place for a comment... don't understand why to put it here...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] codecov-commenter edited a comment on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632966608


   # [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=h1) Report
   > Merging [#296](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-livy/commit/3b9bbefbaf8ddb3116952caf8c15b97d676386a7&el=desc) will **increase** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-livy/pull/296/graphs/tree.svg?width=650&height=150&src=pr&token=0MkVbiUFwE)](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #296      +/-   ##
   ============================================
   + Coverage     68.24%   68.26%   +0.01%     
   - Complexity      965      966       +1     
   ============================================
     Files           104      104              
     Lines          5952     5952              
     Branches        900      900              
   ============================================
   + Hits           4062     4063       +1     
     Misses         1311     1311              
   + Partials        579      578       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...c/main/scala/org/apache/livy/repl/ReplDriver.scala](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cmVwbC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2xpdnkvcmVwbC9SZXBsRHJpdmVyLnNjYWxh) | `30.76% <0.00%> (-2.57%)` | `7.00% <0.00%> (ø%)` | |
   | [...in/java/org/apache/livy/rsc/driver/JobWrapper.java](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cnNjL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9saXZ5L3JzYy9kcml2ZXIvSm9iV3JhcHBlci5qYXZh) | `88.57% <0.00%> (+5.71%)` | `9.00% <0.00%> (+1.00%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=footer). Last update [3b9bbef...623ef5e](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r452339853



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       Ok, I see that you need a global temp view:
   ```
   scala> df.createGlobalTempView("aa")
   Hive Session ID = fe4de24f-f7dc-496c-8dde-460c24f68548
   
   scala> val spark2 = spark.newSession()
   spark2: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@59ede173
   
   scala> spark2.sql("select * from global_temp.aa").show()
   +----+
   |  id|
   +----+
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   +----+
   
   
   ```
   In any case, I still stand by my argument.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r446464042



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       The message for the [commit](https://github.com/apache/spark/commit/ff39c9271ca04951b045c5d9fca2128a82d50b46) for [SPARK-30252](https://issues.apache.org/jira/browse/SPARK-30252) pretty much says everything I pointed out, and concludes "it would be better to disallow negative scale totally and make behaviors above be consistent."




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r451965507



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       My understanding of the implementation of the Livy Thrift server (from ThriftSessionState.java) is that each Thrift session creates a new SparkSession (with the same underlying SparkContext). So when you connect to the Thrift server using the session id of an existing interactive scala session, the Thrift session is using a new SparkSession, not the SparkSession in the existing scala session.
   ```
   scala> import org.apache.spark.sql.types.DecimalType
   import org.apache.spark.sql.types.DecimalType
   
   scala> val df = spark.range(10).select($"id".cast(DecimalType(10,-1)))
   df: org.apache.spark.sql.DataFrame = [id: decimal(10,-1)]
   
   scala> df.show()
   +----+                                                                          
   |  id|
   +----+
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   +----+
   
   
   scala> df.createTempView("aa")
   
   scala> val spark2 = spark.newSession()
   spark2: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@7abdfd36
   
   scala> spark2.sql("select * from aa").show()
   Hive Session ID = c73613e6-cb1e-4156-a6dc-31165d02dec0
   org.apache.spark.sql.AnalysisException: Table or view not found: aa; line 1 pos 14
     at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:47)
   ...
   Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'aa' not found in database 'default';
     at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:81)
   ...
   ```
   I don't see how you can query the temp view from another SparkSession.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r446423157



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       I agree it is a corner case, but I think tests exist also for corner cases, you don't want a software which behaves poorly even in corner cases. Moreover I think it is not a big issue to just add a test for this case, which if/when Spark won't support at all that case anymore can be removed. I see no harm in adding a test and a benefit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429492105



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       I have not found any definitive statement in Hive documentation about this (and I did not search Hive code), but generally in SQL, in the decimal type, scale is a non-negative integer (0 <= scale <= precision).
   E.g., 
   https://docs.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql?view=sql-server-ver15
   https://dev.mysql.com/doc/refman/5.7/en/precision-math-decimal-characteristics.html
   https://github.com/prestodb/presto/issues/9103#issuecomment-347268516
   
   Your suggestion to add a test with negative scale does not make sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-719009684


   @mgaido91 I think this is an important bug, and the fix is straightforward. I think I have provided an adequate test. I do not plan to spend any more time on this. I think it would be good for Livy to have the fix merged. However, you or anyone else can feel free to take this over if you don't agree.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429563925



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       Well, I stand corrected about Spark SQL. However, Hive, like other SQL databases (MySQL, MS SQL Server), does not allow negative scale:
   ```
   > create table dec_test1 (id int, foo decimal(10, 1));
   INFO  : Compiling command(queryId=hive_20200523170810_7a19e5b9-bb45-4b28-888c-1a4af5650d9e): create table dec_test1 (id int, foo decimal(10, 1))
   INFO  : Semantic Analysis Completed (retrial = false)
   INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
   INFO  : Completed compiling command(queryId=hive_20200523170810_7a19e5b9-bb45-4b28-888c-1a4af5650d9e); Time taken: 0.025 seconds
   INFO  : Executing command(queryId=hive_20200523170810_7a19e5b9-bb45-4b28-888c-1a4af5650d9e): create table dec_test1 (id int, foo decimal(10, 1))
   INFO  : Starting task [Stage-0:DDL] in serial mode
   INFO  : Completed executing command(queryId=hive_20200523170810_7a19e5b9-bb45-4b28-888c-1a4af5650d9e); Time taken: 0.08 seconds
   INFO  : OK
   No rows affected (0.137 seconds)
   ```
   while
   ```
   > create table dec_test2 (id int, foo decimal(10, -1));
   Error: Error while compiling statement: FAILED: ParseException line 1:48 extraneous input '-' expecting Number near '<EOF>' (state=42000,code=40000)
   ```
   
   My change here produces the correct behavior for non-negative scale, which I believe is what most users will care about, and is also consistent for negative scale (since as I pointed out above, `BigDecimal#toString` allows faithful roundtrip back to `BigDecimal` from the String).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429583086



##########
File path: thriftserver/session/src/test/java/org/apache/livy/thriftserver/session/ColumnBufferTest.java
##########
@@ -76,6 +76,8 @@ public void testColumnBuffer() throws Exception {
 
       Encoder<TestBean> encoder = Encoders.bean(TestBean.class);
       Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
+      // In ds, the decimal field will be a org.apache.spark.sql.types.DecimalType,

Review comment:
       I moved the comment as @squito suggested.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] squito commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
squito commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429467605



##########
File path: thriftserver/session/src/test/java/org/apache/livy/thriftserver/session/ColumnBufferTest.java
##########
@@ -76,6 +76,8 @@ public void testColumnBuffer() throws Exception {
 
       Encoder<TestBean> encoder = Encoders.bean(TestBean.class);
       Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
+      // In ds, the decimal field will be a org.apache.spark.sql.types.DecimalType,

Review comment:
       I think I agree with marco about the comment placement -- it would probably be more natural just *above* the `ds = ...`
   
   ```java
   // In this dataset, the "decimal" field will have type 
   // org.apache.spark.sql.types.DecimalType.SYSTEM_DEFAULT, i.e., DecimalType(38, 18)
   Dataset<TestBean> ds = spark.createDataset(Arrays.asList(tb), encoder);
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429360039



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       No, we should not use `BigDecimal#toPlainString`, since that can cause information loss. I believe we need to use `BigDecimal#toString` which uses the string representation that can be converted back to a `BigDecimal` using the string constructor.
   The javadoc for `BigDecimal#toPlainString` says:
   "Note that if the result of this method is passed to the string constructor, only the numerical value of this BigDecimal will necessarily be recovered; the representation of the new BigDecimal may have a different scale. "
   While the javadoc for `BigDecimal#toString` says:
   "There is a one-to-one mapping between the distinguishable BigDecimal values and the result of this conversion. That is, every distinguishable BigDecimal value (unscaled value and scale) has a unique string representation as a result of using toString. If that string representation is converted back to a BigDecimal using the BigDecimal(String) constructor, then the original value will be recovered."




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r514160935



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       > There is exactly zero chance that any actual users out there are doing this.
   
   That is not true, I am an actual user and I had that use case in mind.
   
   > In any case, I still stand by my argument.
   
   I'll stand by mine: an additional test case has no harm and enforces the behavior in this case. So I see really no reason to avoid adding such test.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-637030336


   Hi @mgaido91 , I have responded to your point about negative scale in the decimal type in Spark SQL, and addressed the nit about the comment in the ColumnBufferTest. I have also clarified the Jira description and the PR description. This is a simple but important fix. Can we move forward now?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r451832688



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       I do not actually see the benefit. The fact that org.apache.spark.sql.types.DecimalType allowed negative scale is actually a mistake that Spark itself has now corrected. The SQL standard clearly states that scale is a non-negative integer.
   @jerryshao has already added support for Spark 3.0 to Livy. You yourself said that if/when Spark no longer supports the case you're pushing for, the test can be removed. Why add a test that should be removed? @jerryshao can you comment?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429547231



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       It does make sense, as in Spark, which we use, negative scales are definitely allowed. So we must define a behavior, comparing us with Spark and Hive (if it supports it) for these cases too.
   
   Just to give you a reproducer, you can create a table with a negative scale very easily with Spark and query it through SQL like this:
   ```
   scala> val df = spark.range(10).select($"id".cast(DecimalType(10,-1)))
   df: org.apache.spark.sql.DataFrame = [id: decimal(10,-1)]
   
   scala> df.show
   +----+
   |  id|
   +----+
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   +----+
   
   
   scala> df.createTempView("aa")
   
   scala> sql("select * from aa")
   res3: org.apache.spark.sql.DataFrame = [id: decimal(10,-1)]
   
   scala> sql("select * from aa").show
   +----+
   |  id|
   +----+
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   +----+
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-633148970


   I updated the Jira as well as the PR description to point out that the BigDecimal values returned in JDBC can be incorrect due to this bug.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r446455553



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       The fact is that Spark SQL does not actually support a decimal type with negative scale, and for good reason, the SQL standard itself states that scale is a non-negative integer.
   https://issues.apache.org/jira/browse/SPARK-30252 fixes org.apache.spark.sql.types.DecimalType to disallow negative scale. The reason given is the SQL standard.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r428910336



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();

Review comment:
       `BigDecimal#stripTrailingZeros` returns a new, different `BigDecimal` which has a different scale. Thus there is information loss.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] codecov-commenter commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632966608


   # [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=h1) Report
   > Merging [#296](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-livy/commit/3b9bbefbaf8ddb3116952caf8c15b97d676386a7&el=desc) will **increase** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-livy/pull/296/graphs/tree.svg?width=650&height=150&src=pr&token=0MkVbiUFwE)](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #296      +/-   ##
   ============================================
   + Coverage     68.24%   68.26%   +0.01%     
   - Complexity      965      966       +1     
   ============================================
     Files           104      104              
     Lines          5952     5952              
     Branches        900      900              
   ============================================
   + Hits           4062     4063       +1     
   - Misses         1311     1312       +1     
   + Partials        579      577       -2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...c/main/scala/org/apache/livy/repl/ReplDriver.scala](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cmVwbC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2xpdnkvcmVwbC9SZXBsRHJpdmVyLnNjYWxh) | `30.76% <0.00%> (-2.57%)` | `7.00% <0.00%> (ø%)` | |
   | [.../scala/org/apache/livy/sessions/SessionState.scala](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-Y29yZS9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2xpdnkvc2Vzc2lvbnMvU2Vzc2lvblN0YXRlLnNjYWxh) | `61.11% <0.00%> (ø)` | `2.00% <0.00%> (ø%)` | |
   | [...in/java/org/apache/livy/rsc/driver/JobWrapper.java](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cnNjL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9saXZ5L3JzYy9kcml2ZXIvSm9iV3JhcHBlci5qYXZh) | `88.57% <0.00%> (+5.71%)` | `9.00% <0.00%> (+1.00%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=footer). Last update [3b9bbef...623ef5e](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429568032



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       It seems that while `DecimalType` may take a negative scale, you can't use negative scale in the decimal type in Spark SQL either:
   ```
   scala> val df1 = spark.sql("select cast(120 as decimal(5, 1))")
   df1: org.apache.spark.sql.DataFrame = [CAST(120 AS DECIMAL(5,1)): decimal(5,1)]
   
   scala> val bd1 = df1.first().get(0)
   bd1: Any = 120.0                                                                
   
   scala> val df2 = spark.sql("select cast(120 as decimal(5, -1))")
   org.apache.spark.sql.catalyst.parser.ParseException:
   extraneous input '-' expecting INTEGER_VALUE(line 1, pos 30)
   
   == SQL ==
   select cast(120 as decimal(5, -1))
   ------------------------------^^^
   
     at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
     at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
     at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
     at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
     at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
     ... 49 elided
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] codecov-commenter edited a comment on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632966608


   # [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=h1) Report
   > Merging [#296](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-livy/commit/3b9bbefbaf8ddb3116952caf8c15b97d676386a7&el=desc) will **increase** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-livy/pull/296/graphs/tree.svg?width=650&height=150&src=pr&token=0MkVbiUFwE)](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #296      +/-   ##
   ============================================
   + Coverage     68.24%   68.26%   +0.01%     
   - Complexity      965      966       +1     
   ============================================
     Files           104      104              
     Lines          5952     5952              
     Branches        900      900              
   ============================================
   + Hits           4062     4063       +1     
     Misses         1311     1311              
   + Partials        579      578       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...c/main/scala/org/apache/livy/repl/ReplDriver.scala](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cmVwbC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2xpdnkvcmVwbC9SZXBsRHJpdmVyLnNjYWxh) | `30.76% <0.00%> (-2.57%)` | `7.00% <0.00%> (ø%)` | |
   | [...in/java/org/apache/livy/rsc/driver/JobWrapper.java](https://codecov.io/gh/apache/incubator-livy/pull/296/diff?src=pr&el=tree#diff-cnNjL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9saXZ5L3JzYy9kcml2ZXIvSm9iV3JhcHBlci5qYXZh) | `88.57% <0.00%> (+5.71%)` | `9.00% <0.00%> (+1.00%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=footer). Last update [3b9bbef...623ef5e](https://codecov.io/gh/apache/incubator-livy/pull/296?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r443859934



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       I think it is simply an accident that the CaseClass `org.apache.spark.sql.types.DecimalType` allows a negative value for scale. Other parts of the system enforce a check to disallow it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r452339853



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       Ok, I see that you need a global temp view:
   ```
   scala> df.createGlobalTempView("aa")
   Hive Session ID = fe4de24f-f7dc-496c-8dde-460c24f68548
   
   scala> val spark2 = spark.newSession()
   spark2: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@59ede173
   
   scala> spark2.sql("select * from global_temp.aa").show()
   +----+
   |  id|
   +----+
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |0E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   |1E+1|
   +----+
   
   
   ```
   In any case, I still stand by my argument.
   There is exactly zero chance that any actual users out there are doing this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r443858197



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       That is an extremely contrived scenario - creating a temp view in an interactive Livy session and connecting to it in a Livy Thrift session to query the temp view. This is not a real use case.
   Here the SQL 99 standard on data types:
   http://web.cecs.pdx.edu/~len/sql1999.pdf
   In section `4.5 Numbers`, it says
   ```
   4.5.1 Characteristics of numbers
   An exact numeric value has a precision and a scale. The precision is a positive integer that
   determines the number of significant digits in a particular radix (binary or decimal). The
   scale is a non-negative integer. A scale of 0 (zero) indicates that the number is an integer.
   For a scale of S, the exact numeric value is the integer value of the significant digits
   multiplied by 10^-S.
   ```
   
   The scale is a non-negative integer.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r451698726



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       yes, the problem is Livy support Spark 2.4 currently... I can just repeat what I said earlier: adding such a test can just bring a benefit and no harm. If you think that adding this test can be harmful for Livy, please just state why, otherwise I think we can add the test and go ahead with this PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r443087452



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       What you have shown is that in Spark, you can create a `DataFrame` with a negative scale `org.apache.spark.sql.types.DecimalType` column (because `DecimalType` allows it), but I don't see how you can actually create a persistent table with a column of decimal type with a negative scale.
   Using your example, in spark-shell:
   ```
   scala> val df = spark.range(10).select($"id".cast(DecimalType(10,-1)))
   df: org.apache.spark.sql.DataFrame = [id: decimal(10,-1)]
   
   scala> df.write.saveAsTable("neg_scale_dec_test")
   Hive Session ID = e164a210-0268-4ab7-aca2-3c9a36892d54
   [Stage 2:>                                                          (0 + 1) / 2]20/06/19 17:31:46 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, quasar-tnqtgl-2.vpc.cloudera.com, executor 2): java.lang.IllegalArgumentException: Invalid DECIMAL scale: -1
   	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:53)
   	at org.apache.parquet.schema.Types$BasePrimitiveBuilder.decimalMetadata(Types.java:611)
   	at org.apache.parquet.schema.Types$BasePrimitiveBuilder.build(Types.java:441)
   	at org.apache.parquet.schema.Types$BasePrimitiveBuilder.build(Types.java:336)
   	at org.apache.parquet.schema.Types$Builder.named(Types.java:314)
   	at org.apache.spark.sql.execution.datasources.parquet.SparkToParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:414)
   ...
   
   scala> df.write.format("text").saveAsTable("neg_scale_dec_test")
   20/06/19 17:35:49 ERROR command.CreateDataSourceTableAsSelectCommand: Failed to write to table neg_scale_dec_test
   org.apache.spark.sql.AnalysisException: Text data source does not support decimal(10,-1) data type.;
   	at org.apache.spark.sql.execution.datasources.DataSourceUtils$$anonfun$verifySchema$1.apply(DataSourceUtils.scala:69)
   	at org.apache.spark.sql.execution.datasources.DataSourceUtils$$anonfun$verifySchema$1.apply(DataSourceUtils.scala:67)
   ...
   
   scala> df.write.format("orc").saveAsTable("neg_scale_dec_test")
   [Stage 3:>                                                          (0 + 2) / 2]20/06/19 17:36:25 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 3.0 (TID 10, quasar-tnqtgl-4.vpc.cloudera.com, executor 4): java.lang.IllegalArgumentException: Missing integer at 'struct<`id`:decimal(10,^-1)>'
   	at org.apache.orc.TypeDescription.parseInt(TypeDescription.java:244)
   	at org.apache.orc.TypeDescription.parseType(TypeDescription.java:362)
   	at org.apache.orc.TypeDescription.parseStruct(TypeDescription.java:333)
   	at org.apache.orc.TypeDescription.parseType(TypeDescription.java:391)
   	at org.apache.orc.TypeDescription.fromString(TypeDescription.java:412)
   	at org.apache.spark.sql.execution.datasources.orc.OrcSerializer.org$apache$spark$sql$execution$datasources$orc$OrcSerializer$$createOrcValue(OrcSerializer.scala:216)
   ...
   ```
   You can create a temp view from the DataFrame in Spark, but that won't be of interest to users of Livy Thrift server. Users of the Thrift server want to run SQL queries on tables created in Hive or Spark SQL.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#issuecomment-632297561


   I am not sure about this change, may you please elaborate more on the previous behavior and why it is problematic? Maybe an example or some screenshots would help clarifying. Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] mgaido91 commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
mgaido91 commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r437230996



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       The SQL parser does not allow it, indeed, but this doesn't mean you cannot create a table with negative scales and then query it with Spark SQL, as I have shown above. I think it is important to enforce the behavior we have in that case too and the correct behavior should be what SparkSQL shows IMHO. So please do add a test case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-livy] wypoon commented on a change in pull request #296: [LIVY-771][THRIFT] Do not remove trailing zeros from decimal values.

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #296:
URL: https://github.com/apache/incubator-livy/pull/296#discussion_r429492105



##########
File path: thriftserver/session/src/main/java/org/apache/livy/thriftserver/session/ResultSet.java
##########
@@ -88,7 +88,7 @@ private String toHiveString(Object value, boolean quoteStrings) {
     } else if (quoteStrings && value instanceof String) {
       return "\"" + value + "\"";
     } else if (value instanceof BigDecimal) {
-      return ((BigDecimal) value).stripTrailingZeros().toString();
+      return ((BigDecimal) value).toString();

Review comment:
       I have not found any definitive statement in Hive documentation about this (and I did not search Hive code), but generally in SQL, in the decimal type, scale is a non-negative integer (0 <= scale <= precision).
   E.g., 
   https://docs.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql?view=sql-server-ver15
   https://dev.mysql.com/doc/refman/5.7/en/precision-math-decimal-characteristics.html
   
   Your suggestion to add a test with negative scale does not make sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org