You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by "Ravindra Pesala (JIRA)" <ji...@apache.org> on 2017/09/06 04:29:00 UTC

[jira] [Commented] (CARBONDATA-1424) Delete Operation working incorrectly when subquery returns bad-record

    [ https://issues.apache.org/jira/browse/CARBONDATA-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154790#comment-16154790 ] 

Ravindra Pesala commented on CARBONDATA-1424:
---------------------------------------------

[~sangeeta04] It is not an issue as nulls cannot be filtered in that way. Please take the below scenario and compare with hive.

{code}
select * from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit 10)
{code}
The above query only returns non null data only even for hive as well. so delete also behaves in the same way. Please verify the above query with hive and close it if not an issue

> Delete Operation working incorrectly when subquery returns bad-record
> ---------------------------------------------------------------------
>
>                 Key: CARBONDATA-1424
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1424
>             Project: CarbonData
>          Issue Type: Bug
>          Components: sql, test
>    Affects Versions: 1.2.0
>            Reporter: Sangeeta Gulia
>            Assignee: Ravindra Pesala
>            Priority: Minor
>         Attachments: 3000_UniqData.csv
>
>
> Delete Operation is working incorrectly when subquery returns bad-record for a particular table. 
> For the given query, 
> delete from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit 10);
> As an Example, if "select cust_id from uniqdata_delete limit 10" returns  
> +----------+--+
> | cust_id  |
> +----------+--+
> | NULL     |
> | NULL     |
> | NULL     |
> | NULL     |
> | 11000    |
> | 11001     |
> | 11002     |
> | 11003     |
> | 11004     |
> | 11005    |
> +----------+--+
> then the query should delete all rows where cust_id is Null or matches any values from the returned values(11000-11005) whereas it deletes only those records where customer id is from (11000-11005).
> I have attached the sample csv file which i have used for reference.
> To Regenerate the issue, you can use below commands : 
> CREATE TABLE uniqdata_delete (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'hdfs://localhost:54310/user/hduser/input-files/3000_UniqData.csv' into table uniqdata_delete OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> NOTE : Load should be such that starting rows of data should have null stored for cust_id field. 
> delete from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit 10);



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)