Posted to issues@carbondata.apache.org by manishnalla1994 <gi...@git.apache.org> on 2019/01/08 10:53:02 UTC

[GitHub] carbondata pull request #3056: [CARBONDATA-3236] Fix for JVM Crash for inser...

GitHub user manishnalla1994 opened a pull request:

    https://github.com/apache/carbondata/pull/3056

    [CARBONDATA-3236] Fix for JVM Crash for insert into new table from old table

    Problem: Inserting into a new table from an old table fails with a JVM crash. This happened because both the query flow and the load flow were assigned the same taskId; once the query finished, it freed the unsafe memory while the insert was still in progress.
    
    Solution: The file format flow is a direct flow and uses on-heap (safe) memory, so there is no need to free the unsafe memory when the query finishes.
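
    As a minimal illustration of the idea, here is a Scala sketch with hypothetical names (UnsafeMemorySketch and closeQueryReader are illustrative, not CarbonData's actual API): the query's close path frees task-scoped unsafe memory only when the batch-wise (off-heap) fill was used, so a concurrent load sharing the same taskId keeps its memory.

        import scala.collection.mutable

        // Hypothetical stand-in for a task-scoped unsafe memory manager.
        object UnsafeMemorySketch {
          private val taskMemory = mutable.Map[String, Long]() // taskId -> bytes held
          def freeAll(taskId: String): Unit = taskMemory.remove(taskId)
        }

        object QueryCloseSketch {
          // directOnHeapFill is true for the file-format (datasource) read path.
          def closeQueryReader(taskId: String, directOnHeapFill: Boolean): Unit = {
            if (!directOnHeapFill) {
              // Only the batch-wise fill keeps page data off-heap, so only
              // then must the query free it explicitly.
              UnsafeMemorySketch.freeAll(taskId)
            }
            // The direct fill wrote straight into on-heap vectors; the JVM GC
            // reclaims them, and a concurrent load that shares the same taskId
            // keeps its unsafe memory untouched.
          }
        }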
    
    Be sure to complete all of the following checklist items to help us incorporate
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [x] Testing done
            Please provide details on 
            - Whether new unit test cases have been added or why no new tests are required?
            - How it is tested? Please attach test report.
            - Is it a performance related change? Please attach the performance test report.
            - Any additional information to help reviewers in testing this change.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishnalla1994/carbondata JVMCrashForLoadAndQuery

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/3056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3056
    
----
commit 150c710218ff3c09ccfccc9b2df970006964ef6d
Author: manishnalla1994 <ma...@...>
Date:   2019-01-08T10:42:55Z

    Fix for JVM Crash for file format

----


---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2217/



---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    Reviewed, LGTM


---

[GitHub] carbondata pull request #3056: [CARBONDATA-3236] Fix for JVM Crash for inser...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/3056


---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    @manishnalla1994 
    
    > Solution:
    Check if any other RDD is sharing the same task context. If so, don't clear the resources at that time; the other RDD that shares the context should clear the memory once its task is finished.
    
    It seems that in #2591, for the data source table scenario, if the query and insert procedures also share the same context, they could also benefit from that implementation without any changes. Right?
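
    A rough Scala sketch of that reference-counting idea (the names are illustrative; this is not the actual code from #2591): each RDD sharing a task context registers itself, and only the last one to finish frees the task's unsafe memory.

        import java.util.concurrent.ConcurrentHashMap
        import java.util.concurrent.atomic.AtomicInteger

        object SharedContextSketch {
          private val refCounts = new ConcurrentHashMap[String, AtomicInteger]()

          // Each RDD sharing the task context registers before it starts.
          def register(taskId: String): Unit =
            refCounts.computeIfAbsent(taskId, _ => new AtomicInteger(0)).incrementAndGet()

          // Decrement on finish; only the last finisher frees the shared memory.
          def onTaskFinished(taskId: String, freeUnsafe: String => Unit): Unit = {
            val counter = refCounts.get(taskId)
            if (counter != null && counter.decrementAndGet() == 0) {
              refCounts.remove(taskId)
              freeUnsafe(taskId)
            }
          }
        }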


---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    > because both the query flow and the load flow were assigned the same taskId; once the query finished, it freed the unsafe memory while the insert was still in progress.
    
    How do you handle the scenario of a stored-by-carbondata table? In that scenario, both the query flow and the load flow use off-heap memory and encounter the same problem you described above.
    But I remember we handled that differently from the current PR, and I think the modifications could be similar. Please check this again.



---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10475/



---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by kumarvishal09 <gi...@git.apache.org>.
Github user kumarvishal09 commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    LGTM


---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2437/



---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by manishnalla1994 <gi...@git.apache.org>.
Github user manishnalla1994 commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    @xuchuanyin The datasource table uses the direct filling flow. In the direct flow there is no intermediate buffer, so we do not use off-heap memory to store the page data in memory (all records of a page are filled into the vector at once instead of batch-wise). In this case we can remove the freeing of unsafe memory for the query, as it is not required.
    
    For a stored-by table the handling is different: we support both batch-wise filling and direct filling, and batch-wise filling uses unsafe memory, so we have to clear the unsafe memory in that case.
    The same handling is not required for a datasource table.
    Please refer to https://github.com/apache/carbondata/pull/2591 for the stored-by handling of this issue.
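
    To make the two fill modes concrete, here is a simplified Scala sketch (the types are hypothetical, not CarbonData's actual classes): direct filling writes every decoded record of a page straight into the on-heap vector, while batch-wise filling stages records in an intermediate buffer, which in the real flow lives off-heap and must be freed explicitly.

        object FillModeSketch {
          // Simplified on-heap column vector.
          final class OnHeapVector(capacity: Int) {
            private val data = new Array[AnyRef](capacity)
            private var size = 0
            def put(row: AnyRef): Unit = { data(size) = row; size += 1 }
          }

          // Direct fill: no intermediate buffer, nothing unsafe to free later.
          def directFill(pageRows: Seq[AnyRef], vector: OnHeapVector): Unit =
            pageRows.foreach(vector.put)

          // Batch-wise fill: records are staged batch by batch; in the real
          // flow this staging buffer is off-heap (unsafe) and must be freed
          // once the task finishes.
          def batchFill(pageRows: Seq[AnyRef], vector: OnHeapVector, batchSize: Int): Unit =
            pageRows.grouped(batchSize).foreach { batch =>
              val staging = batch.toArray // stand-in for the off-heap buffer
              staging.foreach(vector.put)
            }
        }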
    



---

[GitHub] carbondata issue #3056: [CARBONDATA-3236] Fix for JVM Crash for insert into ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/3056
  
    LGTM


---