Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2019/07/13 05:59:57 UTC

[GitHub] [carbondata] ajantha-bhat opened a new pull request #3324: [HOTFIX] Fix task id in FileFormat write

ajantha-bhat opened a new pull request #3324: [HOTFIX] Fix task id in FileFormat write
URL: https://github.com/apache/carbondata/pull/3324
 
 
   Problem: In FileFormat write, carbon uses System.nanoTime() as the task id.
   Cause: When multiple tasks are launched concurrently, there is a small chance that two tasks get the same id. When that happens, two Spark tasks launched for one insert produce the same carbondata file name,
   and when both tasks write to that one file, the file is likely to be corrupted, which leads to query failure.
   Solution: Use a unique task id of our own instead of the nanosecond timestamp.
   Here, the Spark task id + a global counter is used to generate a task id that is unique across jobs.
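
The idea can be sketched roughly as follows. This is a minimal illustration only, not the code in this pull request: the class name `UniqueTaskId`, the method `next`, and the `_` separator are hypothetical, and the counter is assumed to be a per-JVM `AtomicLong`. In a Spark task the first argument would typically be a value such as `TaskContext.get().taskAttemptId()`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: combines the Spark task id with a process-wide counter
// so that two concurrently started tasks never derive the same id from a clock.
final class UniqueTaskId {
    // Assumption: one counter per JVM, shared by all write tasks in that JVM.
    private static final AtomicLong COUNTER = new AtomicLong();

    private UniqueTaskId() {
    }

    // sparkTaskId is supplied by the caller; the counter value is taken
    // atomically, so concurrent callers always receive different suffixes.
    static String next(long sparkTaskId) {
        return sparkTaskId + "_" + COUNTER.getAndIncrement();
    }
}
```

Unlike `System.nanoTime()`, which can return the same value to two threads that call it at nearly the same moment, `getAndIncrement()` hands each caller in the JVM a distinct value, so even two calls with the same Spark task id yield different ids.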
   
   Be sure to do all of the following checklist to help us incorporate 
   your contribution quickly and easily:
   
    - [ ] Any interfaces changed? NA
    
    - [ ] Any backward compatibility impacted? NA
    
    - [ ] Document update required? NA
   
    - [ ] Testing done
   done. Attached the report
   [testReport.txt](https://github.com/apache/carbondata/files/3388501/testReport.txt)
   
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.  [NA]
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services