You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by "Vandana Yadav (JIRA)" <ji...@apache.org> on 2018/07/02 08:32:00 UTC

[jira] [Created] (CARBONDATA-2680) Incorrect result displays after applying the histogram_numeric function on carbon and hive table

Vandana Yadav created CARBONDATA-2680:
-----------------------------------------

             Summary: Incorrect result displays after applying  the histogram_numeric function on carbon and hive table
                 Key: CARBONDATA-2680
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2680
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
    Affects Versions: 1.5.0
         Environment: spark 2.2
            Reporter: Vandana Yadav
         Attachments: 100_hive_test.csv

Incorrect result displays after applying the histogram_numeric function on carbon and hive table:

Steps to reproduce:

1) Create carbon table and load data in it:

a) create table Carbon_automation (imei string,deviceInformationId int,MAC string,deviceColor string,device_backColor string,modelId string,marketName string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series string,productionDate timestamp,bomCode string,internalModels string, deliveryTime string, channelsId string, channelsName string , deliveryAreaId string, deliveryCountry string, deliveryProvince string, deliveryCity string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion string, Active_BacVerNumber string, Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, Latest_DAY int, Latest_HOUR string, Latest_areaId string, Latest_country string, Latest_province string, Latest_city string, Latest_district string, Latest_street string, Latest_releaseId string, Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer string, Latest_webUIVersion string, Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, Latest_operatorId string, gamePointDescription string,gamePointId double,contractNumber double,imei_count int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('DICTIONARY_INCLUDE'='deviceInformationId,Latest_YEAR,Latest_MONTH,Latest_DAY')

b) LOAD DATA INPATH 'hdfs://hadoop-master:54311/BabuStore/Data/HiveData/100_hive_test.csv' INTO TABLE Carbon_automation OPTIONS('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription,imei_count');

2) Create Hive table and load data in it:

a) create table Carbon_automation_h (imei string,deviceInformationId int,MAC string,deviceColor string,device_backColor string,modelId string,marketName string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series string,productionDate timestamp,bomCode string,internalModels string, deliveryTime string, channelsId string, channelsName string , deliveryAreaId string, deliveryCountry string, deliveryProvince string, deliveryCity string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion string, Active_BacVerNumber string, Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, Latest_DAY int, Latest_HOUR string, Latest_areaId string, Latest_country string, Latest_province string, Latest_city string, Latest_district string, Latest_street string, Latest_releaseId string, Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer string, Latest_webUIVersion string, Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, Latest_operatorId string, gamePointDescription string,gamePointId double,contractNumber double,imei_count int)ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

b) load data local inpath '/opt/Carbon/CarbonData/TestData/Data/HiveData/100_hive_test.csv' OVERWRITE INTO TABLE Carbon_automation_h;

3) Execute Query:

on carbon table:

select histogram_numeric(1, 5000)from carbon_automation;

on Hive table:

select histogram_numeric(1, 5000)from Carbon_automation_h;

4) Expected Result:

As both tables have similar Data content the the output of executed query should be same.

 

5) Actual Result:

Carbon Table:

Output:

+------------------------------+--+
| histogram_numeric( 1, 5000) |
+------------------------------+--+
| [\{"x":1.0,"y":99.0}] |
+------------------------------+--+
1 row selected (0.204 seconds)

 

Hive Table:

Output:

+------------------------------------------+--+
| histogram_numeric( 1, 5000) |
+------------------------------------------+--+
| [\{"x":1.0,"y":50.0},\{"x":1.0,"y":49.0}] |
+------------------------------------------+--+
1 row selected (2.171 seconds)

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)