You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by DaveBirdsall <gi...@git.apache.org> on 2017/10/03 22:19:25 UTC

[GitHub] incubator-trafodion pull request #1247: [TRAFODION-2755] Create *Trafodion S...

Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/1247#discussion_r142538517
  
    --- Diff: docs/lob_guide/src/asciidoc/_chapters/working_with_lob.adoc ---
    @@ -751,4 +751,380 @@ CQD `LOB_MAX_SIZE` (default 10G expressed in M [10000M]).
     
     * Extract Target Locations
     +
    -The file to extract to can be a local linux file or a local HDFS file.
    \ No newline at end of file
    +The file to extract to can be a local linux file or a local HDFS file.
    +
    +[#deleting column from a sql table containing lob columns]
    +== Deleting Column from a SQL Table Containing LOB columns
    +
    +[#syntax]
    +=== Syntax
    +
    +```
    +DELETE lob-column-name FROM table-name [WHERE CLAUSE]
    +```
    +
    +[#considerations]
    +=== Considerations
    +
    +When one or more rows containing LOB columns are deleted from LOB table, only the metadata information is dropped and the hdfs data remains as it is. The references to the lob data are removed from the lob descriptor file. 
    +
    +This mechanism has not been implemented yet as a separate utility but it is triggered as a part of insert, update and append operations. For more information, see <<garbage collection,Garbage Collection>>.
    +
    +[#dropping a sql table containing lob columns ]
    +== Dropping a SQL Table Containing LOB Columns 
    +
    +Drop works like any other drop table. All dependent tables are deleted. All files in hdfs (data and descriptor) files are also deleted.
    +
    +For more information, see <<drop_table_statement,DROP TABLE Statement>> in http://trafodion.incubator.apache.org/docs/sql_reference/index.html[Trafodion SQL Reference Manual].
    +
    +[#garbage collection]
    +== Garbage Collection
    +
    +When a lob datafile for a column has reached a certain limit, defined by a CQD `LOB_GC_LIMIT_SIZE`, then a compaction is triggered automatically. +
    +The default GC Limit is 10GB and can be changed if needed. 
    +
    +The need for GC arises because when a delete operation or an update operation is performed, the old data black in the hdfs file will be left as unused. +
    +In the case of update, the old data will be left as unused and the new data will be written into a new section, so all these “holes” in the LOB data file are needlessly occupying space. 
    +
    +The LOB descriptor chunks file is looked at to see which ranges and offsets are actually used. The LOB datafile is temporarily saved. The compaction is done into a new tempfile. When the sections have all been copied into the tempfile, Trafodion will delete the existing lob data file and rename the tempfile. 
    +
    +Finally, the saved copy of the LOB datafile is dropped. The saved copy is there just in case you need to fall back to it in case of an error. Since this operation is triggered as part of an IUD operation, a definite slowdown will occur for that insert/update operation compared to subsequent inserts/updates. 
    +
    +Also, each lob column of a table can be compacted separately as needed. GC does not have to be done to all columns of the LOB table all at once. 
    +
    +NOTE: Currently the GC is done in the same transaction as the transaction being used for the insert or update operation. If any part of the GC fails, then the entire transaction is aborted. 
    +
    +When Trafodion has support for local transactions, Trafodion will do the GC in a separate transaction or in a separate process, so you can fail the GC with a warning and allow the insert to go through. 
    --- End diff --
    
    Consider: "When Trafodion has support for local transactions, Trafodion will do the GC in a separate transaction or in a separate process, so GC can fail with a warning yet the insert can continue." (I'm assuming it is the system that causes the GC to fail and not the end-user. If I'm wrong, ignore this comment.)


---