Posted to issues@spark.apache.org by "Jun Zheng (JIRA)" <ji...@apache.org> on 2018/10/05 13:57:00 UTC

[jira] [Updated] (SPARK-25648) Spark 2.3.1 reads orc format files with native and hive, and return different results

     [ https://issues.apache.org/jira/browse/SPARK-25648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Zheng updated SPARK-25648:
------------------------------
    Description: 
Hi all,

I am testing TPCx-BB (www.tpc.org/tpcx-bb/default.asp) with the code from https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench.
 # The test data is loaded by spark-sql with the parameter spark.sql.orc.impl set to native.
 # During the engine validation power test, q22 returns different results depending on the read engine, that is, on whether spark.sql.orc.impl = hive or spark.sql.orc.impl = native. With hive the result is correct, but with native fewer results are returned. Can someone help find out why this happens?

Thanks in advance
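
One way to narrow down a discrepancy like the one described above is to run the same query once per ORC reader and compare the returned rows as multisets. The following is only a minimal sketch under stated assumptions: the helper names spark_sql_command and diff_rows are hypothetical, the query file name q22.sql is a placeholder, spark-sql is assumed to be on the PATH, and the result rows are assumed to have been collected as lines of text. It builds the command lines and diffs the rows; it does not run Spark itself.

```python
from collections import Counter

# The two values of spark.sql.orc.impl mentioned in this report.
ORC_IMPLS = ("native", "hive")


def spark_sql_command(query_file, orc_impl):
    """Build a spark-sql command line that runs query_file with one ORC reader.

    Hypothetical helper: it only assembles the argument list.
    """
    if orc_impl not in ORC_IMPLS:
        raise ValueError(f"unknown ORC implementation: {orc_impl!r}")
    return [
        "spark-sql",
        "--conf", f"spark.sql.orc.impl={orc_impl}",
        "-f", query_file,
    ]


def diff_rows(hive_rows, native_rows):
    """Compare two result sets as multisets of text rows.

    Returns (rows missing from the native result, rows only in the native
    result). Duplicate rows are counted, so a row that appears twice with
    hive and once with native is reported as missing once.
    """
    hive_counts = Counter(hive_rows)
    native_counts = Counter(native_rows)
    missing = sorted((hive_counts - native_counts).elements())
    extra = sorted((native_counts - hive_counts).elements())
    return missing, extra


if __name__ == "__main__":
    # Print the two invocations to compare (q22.sql is a placeholder name).
    for impl in ORC_IMPLS:
        print(" ".join(spark_sql_command("q22.sql", impl)))
```

Each command list can be passed to subprocess.run to capture the output, and the rows reported as missing show which values to look for in the underlying ORC data.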

  was:
Hi all,

I am testing TPCx-BB (www.tpc.org/tpcx-bb/default.asp) with the code from https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench.
 # The test data is loaded by spark-sql with the parameter spark.sql.orc.impl set to native.
 # During the engine validation power test, q02 returns different results depending on the read engine, that is, on whether spark.sql.orc.impl = hive or spark.sql.orc.impl = native. With hive the result is correct, but with native fewer results are returned. Can someone help find out why this happens?

Thanks in advance


> Spark 2.3.1 reads orc format  files with native and hive, and return different results
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-25648
>                 URL: https://issues.apache.org/jira/browse/SPARK-25648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Jun Zheng
>            Priority: Major


