Posted to issues@trafodion.apache.org by "Sandhya Sundaresan (JIRA)" <ji...@apache.org> on 2019/01/24 06:31:00 UTC

[jira] [Created] (TRAFODION-3263) Disable LOB locking and refactor order of LOB iid expression evaluation

Sandhya Sundaresan created TRAFODION-3263:
---------------------------------------------

             Summary: Disable LOB locking and refactor order of LOB iid expression evaluation
                 Key: TRAFODION-3263
                 URL: https://issues.apache.org/jira/browse/TRAFODION-3263
             Project: Apache Trafodion
          Issue Type: Improvement
          Components: sql-general
    Affects Versions: 2.2.0
            Reporter: Sandhya Sundaresan
            Assignee: Sandhya Sundaresan


 The change to use JNI for HDFS writes improved the interface by returning more useful information to the caller. In TRAFODION-2946, we described the need for LOB locking because of a condition where multiple threads writing to the same LOB column could interleave and cause problems. With the new JNI interface, an HDFS write now returns the offset where the data was written, so we can store this returned offset in the descriptor tables. Previously, while using the libhdfs API, we would not get back the "written offset".

 

So the order of operations before this change used to be:
 # Get the EOD for the LOB data file in HDFS.
 # Store this offset in the LOB descriptor tables so we know where to retrieve the data from during a read.
 # Call hdfsWrite to write to the LOB data file, and hope that the offset where hdfsWrite writes is the same as the EOD calculated in 1. HDFS being an "append only" file system, this is usually how it works. But if another process comes in and does an insert into the LOB column between 2 and 3, then we have an incorrect offset stored in the descriptor tables. Hence we added a LOB lock to make steps 1, 2 and 3 atomic as part of TRAFODION-2946.
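
To make the race concrete, here is a minimal sketch in Python. It is not Trafodion code: AppendOnlyFile is a hypothetical in-memory stand-in for the HDFS LOB data file, the descriptors dict stands in for the LOB descriptor tables, and the interleaving of the two writers is scripted by hand rather than produced by real threads.

```python
class AppendOnlyFile:
    """Hypothetical stand-in for an append-only HDFS LOB data file."""

    def __init__(self):
        self._data = bytearray()

    def eod(self):
        # Current end of data, i.e. where the next append would land.
        return len(self._data)

    def append(self, chunk):
        # Like the old libhdfs path in this sketch: nothing useful is returned.
        self._data.extend(chunk)

    def read(self, offset, size):
        return bytes(self._data[offset:offset + size])


def old_order_interleaved():
    """Replay the old three-step order with writer B cutting in on writer A."""
    lob_file = AppendOnlyFile()
    descriptors = {}  # key -> (offset, size), standing in for descriptor tables

    data_a = b"AAAA"
    data_b = b"BB"

    # Writer A: step 1 (get EOD) and step 2 (store it in the descriptor).
    descriptors["A"] = (lob_file.eod(), len(data_a))

    # Writer B runs all three steps between A's step 2 and step 3.
    descriptors["B"] = (lob_file.eod(), len(data_b))
    lob_file.append(data_b)

    # Writer A: step 3 (write). The data lands after B's bytes, not at offset 0.
    lob_file.append(data_a)

    return lob_file, descriptors


if __name__ == "__main__":
    lob_file, descriptors = old_order_interleaved()
    print(lob_file.read(*descriptors["A"]))  # b'BBAA', not b'AAAA'
    print(lob_file.read(*descriptors["B"]))  # b'BB'
```

Without a lock around steps 1 to 3, writer A's descriptor points at offset 0 while its data actually starts at offset 2, so a later read of A returns the wrong bytes.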

The order of operations with this change is as follows:
 # Call the JNI HDFS write API to write the LOB data to HDFS.
 # Use the data offset returned by the JNI HDFS write API in 1 as the offset to store in the LOB descriptor tables.
 # If there are multiple chunks to write, do it in a loop and append to the first chunk. This way each chunk can be anywhere in HDFS and not necessarily contiguous. But we are guaranteed that whatever we wrote will be stored in our internal LOB descriptor files.
 # If any failure or TM error occurs while writing to the LOB descriptor tables, the transaction gets rolled back and the chunk of HDFS data written becomes "dead data". It doesn't harm the next operation.
 # The GC check is now done before an update or insert. Earlier it was done as part of the ::allocateDesc operation to get the EOD of the file.
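
The write-first order in steps 1 to 4 can be sketched the same way. This is an illustration under assumptions, not the actual implementation: AppendOnlyFile, write_lob, read_lob and the fail_on_commit flag are hypothetical names, the append method models a write call that returns the written offset, and the rollback is reduced to simply not recording the descriptor entries. The GC check in step 5 is not modelled.

```python
class AppendOnlyFile:
    """Hypothetical stand-in for an append-only HDFS LOB data file."""

    def __init__(self):
        self._data = bytearray()

    def eod(self):
        return len(self._data)

    def append(self, chunk):
        # Models a write call that reports where the data actually landed.
        offset = len(self._data)
        self._data.extend(chunk)
        return offset

    def read(self, offset, size):
        return bytes(self._data[offset:offset + size])


def write_lob(lob_file, descriptors, key, chunks, fail_on_commit=False):
    """Write every chunk first, then record the offsets the writes returned."""
    pending = []
    for chunk in chunks:
        offset = lob_file.append(chunk)       # step 1: write, get offset back
        pending.append((offset, len(chunk)))  # steps 2-3: one entry per chunk
    if fail_on_commit:
        # Step 4: the descriptor update is rolled back. The bytes already
        # written stay in the file as dead data that nothing points to.
        return False
    descriptors[key] = pending
    return True


def read_lob(lob_file, descriptors, key):
    """Reassemble a LOB from its (offset, size) descriptor entries."""
    return b"".join(lob_file.read(off, size) for off, size in descriptors[key])


if __name__ == "__main__":
    lob_file = AppendOnlyFile()
    descriptors = {}
    write_lob(lob_file, descriptors, "A", [b"AA", b"aa"])
    write_lob(lob_file, descriptors, "X", [b"dead"], fail_on_commit=True)
    write_lob(lob_file, descriptors, "B", [b"BB"])
    print(read_lob(lob_file, descriptors, "A"))  # b'AAaa'
    print(read_lob(lob_file, descriptors, "B"))  # b'BB'
    print("X" in descriptors)                    # False
```

Because each descriptor entry comes from the offset returned by the write itself, no lock is needed to keep the stored offset and the data in agreement, and the failed write for "X" only leaves unreferenced bytes behind.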

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)