Posted to codereview@trafodion.apache.org by liuyu000 <gi...@git.apache.org> on 2017/12/19 07:45:48 UTC

[GitHub] incubator-trafodion pull request #1356: [TRAFODION-2855] Correct the syntax ...

GitHub user liuyu000 opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/1356

    [TRAFODION-2855] Correct the syntax descriptions of LOAD Statement for *Trafodion SQL Reference Manual* 2

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liuyu000/incubator-trafodion LoadStatement2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/1356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1356
    
----
commit 30f3ea9a2b3ed6bd52e83227911b46b17dd9c99b
Author: liu.yu <yu...@esgyn.cn>
Date:   2017-12-19T07:42:53Z

    Correct the syntax descriptions of LOAD Statement for *Trafodion Reference Manual* 2

----


---

Posted by DaveBirdsall <gi...@git.apache.org>.
GitHub user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/1356#discussion_r158144904
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc ---
    @@ -443,24 +443,39 @@ specify one or more of these options:
     
     ** `CONTINUE ON ERROR`
     +
    -LOAD statement will continue after errors encountered while scanning rows from source table. 
    +LOAD statement will continue after ignorable errors while scanning rows from source table or loading into the target table. The ignorable errors are usually data conversion errors.
     +
     Errors during the load or sort phase will cause the LOAD statement to abort. 
     +
    -Error rows will be logged by default in HDFS files in the directory `/user/trafodion/bulkload/logs`. The default name of the error files will be of the form `ERR_<three-part-target-table-name>_<date>_<id>`, where `<id>` is a numeric identifier unique to the process where the error was seen.
    -+
    -This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified and it is not enabled by default.
    +This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified.
     
     ** `LOG ERROR ROWS [TO _error-location-name_]`
    +*** Error rows
     +
     If error rows must be written to a specified location, then specify TO _error-location-name_, otherwise they will be written to the default location.
    +`_error-location-name_` must be a HDFS directory name to which trafodion has write access.
     +
    -Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
    -The actual log file location is displayed in the load command output.
    +Error rows will be logged in HDFS files in the *directory* `/user/trafodion/bulkload/logs` if the error log location is not specified. 
    ++
    +The default name of the *subdirectory* is `_ERR_catalog.schema.target_table_date_id_`, where `_id_` is a numeric identifier timestamp (YYYYMMDD_HHMMSS) unique to the process where the error was seen.
    ++
    +The default name of the *error file* is `_loggingFileNamePrefix_catalog.schema.target_table_instanceID_`, where `_loggingFileNamePrefix_` is hive_scan_err or traf_upsert_err depending on the data source table, and `_instanceID_` is the ID of instance starting from 0, generally there is only one instance.
    --- End diff --
    
    Suggest "...is the instance ID starting from 0, ..."


---

Posted by DaveBirdsall <gi...@git.apache.org>.
GitHub user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/1356#discussion_r158144917
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc ---
    @@ -443,24 +443,39 @@ specify one or more of these options:
     
     ** `CONTINUE ON ERROR`
     +
    -LOAD statement will continue after errors encountered while scanning rows from source table. 
    +LOAD statement will continue after ignorable errors while scanning rows from source table or loading into the target table. The ignorable errors are usually data conversion errors.
     +
     Errors during the load or sort phase will cause the LOAD statement to abort. 
     +
    -Error rows will be logged by default in HDFS files in the directory `/user/trafodion/bulkload/logs`. The default name of the error files will be of the form `ERR_<three-part-target-table-name>_<date>_<id>`, where `<id>` is a numeric identifier unique to the process where the error was seen.
    -+
    -This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified and it is not enabled by default.
    +This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified.
     
     ** `LOG ERROR ROWS [TO _error-location-name_]`
    +*** Error rows
     +
     If error rows must be written to a specified location, then specify TO _error-location-name_, otherwise they will be written to the default location.
    +`_error-location-name_` must be a HDFS directory name to which trafodion has write access.
     +
    -Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
    -The actual log file location is displayed in the load command output.
    +Error rows will be logged in HDFS files in the *directory* `/user/trafodion/bulkload/logs` if the error log location is not specified. 
    ++
    +The default name of the *subdirectory* is `_ERR_catalog.schema.target_table_date_id_`, where `_id_` is a numeric identifier timestamp (YYYYMMDD_HHMMSS) unique to the process where the error was seen.
    ++
    +The default name of the *error file* is `_loggingFileNamePrefix_catalog.schema.target_table_instanceID_`, where `_loggingFileNamePrefix_` is hive_scan_err or traf_upsert_err depending on the data source table, and `_instanceID_` is the ID of instance starting from 0, generally there is only one instance.
    ++
    +For example, the full path of the table test_load_log is `/user/trafodion/bulkload/logs/test/ERR_TRAFODION.SEABASE.TEST_LOAD_LOG_20171218_035918/traf_upsert_err_TRAFODION.SEABASE.TEST_LOAD_LOG_0`,
    ++
    +where:
    ++
    +1. `/user/trafodion/bulkload/logs/test` is the default name of *directory*.
    ++
    +2. `ERR_TRAFODION.SEABASE.TEST_LOAD_LOG_20171218_035918` is the default name of *subdirectory*.
    ++
    +3. `traf_upsert_err_TRAFODION.SEABASE.TEST_LOAD_LOG_0` is the default name of *error file*.
     
    -*** `_error-location-name_`
    +*** Error logs
    ++
    +Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
     +
    -must be a HDFS directory name to which trafodion has write access.
    +The actual log file location is displayed in the load command output. It is recommended that use the same location for load as it’s easier to find the error logs.
    --- End diff --
    
    Suggest "It is recommended that you use..." (add the word "you")
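
The naming scheme described in the diff above (directory, then subdirectory, then error file) can be illustrated with a minimal Python sketch. It is not Trafodion code: `error_log_path`, its parameters, and `DEFAULT_LOG_DIR` are hypothetical names chosen for this illustration, and the layout is assumed only from the documentation text under review.

```python
import posixpath
from datetime import datetime

# Default HDFS directory named in the documentation under review.
DEFAULT_LOG_DIR = "/user/trafodion/bulkload/logs"


def error_log_path(table, when, prefix="traf_upsert_err",
                   instance_id=0, log_dir=DEFAULT_LOG_DIR):
    """Compose an error-file path following the documented naming scheme.

    Hypothetical helper for illustration only.
    table:       three-part name, e.g. CATALOG.SCHEMA.TABLE
    when:        datetime used for the YYYYMMDD_HHMMSS part of the subdirectory
    prefix:      hive_scan_err or traf_upsert_err, depending on the source table
    instance_id: instance ID, starting from 0
    """
    stamp = when.strftime("%Y%m%d_%H%M%S")
    subdirectory = f"ERR_{table}_{stamp}"
    error_file = f"{prefix}_{table}_{instance_id}"
    # HDFS paths use forward slashes, so join with posixpath on any platform.
    return posixpath.join(log_dir, subdirectory, error_file)


# Reproduces the example path quoted in the diff.
print(error_log_path(
    "TRAFODION.SEABASE.TEST_LOAD_LOG",
    datetime(2017, 12, 18, 3, 59, 18),
    log_dir="/user/trafodion/bulkload/logs/test",
))
```

With these inputs the sketch yields the same path as the `test_load_log` example in the diff, which makes the three components easy to tell apart.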


---

Posted by liuyu000 <gi...@git.apache.org>.
GitHub user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/1356#discussion_r158206030
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc ---
    @@ -443,24 +443,39 @@ specify one or more of these options:
     
     ** `CONTINUE ON ERROR`
     +
    -LOAD statement will continue after errors encountered while scanning rows from source table. 
    +LOAD statement will continue after ignorable errors while scanning rows from source table or loading into the target table. The ignorable errors are usually data conversion errors.
     +
     Errors during the load or sort phase will cause the LOAD statement to abort. 
     +
    -Error rows will be logged by default in HDFS files in the directory `/user/trafodion/bulkload/logs`. The default name of the error files will be of the form `ERR_<three-part-target-table-name>_<date>_<id>`, where `<id>` is a numeric identifier unique to the process where the error was seen.
    -+
    -This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified and it is not enabled by default.
    +This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified.
     
     ** `LOG ERROR ROWS [TO _error-location-name_]`
    +*** Error rows
     +
     If error rows must be written to a specified location, then specify TO _error-location-name_, otherwise they will be written to the default location.
    +`_error-location-name_` must be a HDFS directory name to which trafodion has write access.
     +
    -Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
    -The actual log file location is displayed in the load command output.
    +Error rows will be logged in HDFS files in the *directory* `/user/trafodion/bulkload/logs` if the error log location is not specified. 
    ++
    +The default name of the *subdirectory* is `_ERR_catalog.schema.target_table_date_id_`, where `_id_` is a numeric identifier timestamp (YYYYMMDD_HHMMSS) unique to the process where the error was seen.
    ++
    +The default name of the *error file* is `_loggingFileNamePrefix_catalog.schema.target_table_instanceID_`, where `_loggingFileNamePrefix_` is hive_scan_err or traf_upsert_err depending on the data source table, and `_instanceID_` is the ID of instance starting from 0, generally there is only one instance.
    --- End diff --
    
    OK, thanks Dave :)


---

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/1356


---

Posted by liuyu000 <gi...@git.apache.org>.
GitHub user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/1356#discussion_r158206657
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc ---
    @@ -443,24 +443,39 @@ specify one or more of these options:
     
     ** `CONTINUE ON ERROR`
     +
    -LOAD statement will continue after errors encountered while scanning rows from source table. 
    +LOAD statement will continue after ignorable errors while scanning rows from source table or loading into the target table. The ignorable errors are usually data conversion errors.
     +
     Errors during the load or sort phase will cause the LOAD statement to abort. 
     +
    -Error rows will be logged by default in HDFS files in the directory `/user/trafodion/bulkload/logs`. The default name of the error files will be of the form `ERR_<three-part-target-table-name>_<date>_<id>`, where `<id>` is a numeric identifier unique to the process where the error was seen.
    -+
    -This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified and it is not enabled by default.
    +This option is implied if `LOG ERROR ROWS [TO _error-location-name_]` or `STOP AFTER _num_ ERROR ROWS` is specified.
     
     ** `LOG ERROR ROWS [TO _error-location-name_]`
    +*** Error rows
     +
     If error rows must be written to a specified location, then specify TO _error-location-name_, otherwise they will be written to the default location.
    +`_error-location-name_` must be a HDFS directory name to which trafodion has write access.
     +
    -Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
    -The actual log file location is displayed in the load command output.
    +Error rows will be logged in HDFS files in the *directory* `/user/trafodion/bulkload/logs` if the error log location is not specified. 
    ++
    +The default name of the *subdirectory* is `_ERR_catalog.schema.target_table_date_id_`, where `_id_` is a numeric identifier timestamp (YYYYMMDD_HHMMSS) unique to the process where the error was seen.
    ++
    +The default name of the *error file* is `_loggingFileNamePrefix_catalog.schema.target_table_instanceID_`, where `_loggingFileNamePrefix_` is hive_scan_err or traf_upsert_err depending on the data source table, and `_instanceID_` is the ID of instance starting from 0, generally there is only one instance.
    ++
    +For example, the full path of the table test_load_log is `/user/trafodion/bulkload/logs/test/ERR_TRAFODION.SEABASE.TEST_LOAD_LOG_20171218_035918/traf_upsert_err_TRAFODION.SEABASE.TEST_LOAD_LOG_0`,
    ++
    +where:
    ++
    +1. `/user/trafodion/bulkload/logs/test` is the default name of *directory*.
    ++
    +2. `ERR_TRAFODION.SEABASE.TEST_LOAD_LOG_20171218_035918` is the default name of *subdirectory*.
    ++
    +3. `traf_upsert_err_TRAFODION.SEABASE.TEST_LOAD_LOG_0` is the default name of *error file*.
     
    -*** `_error-location-name_`
    +*** Error logs
    ++
    +Error logs are written in separate files by the processes involved in the load command under sub-directory representing the load command in the given location.
     +
    -must be a HDFS directory name to which trafodion has write access.
    +The actual log file location is displayed in the load command output. It is recommended that use the same location for load as it’s easier to find the error logs.
    --- End diff --
    
    OK, thanks for your eagle eye, Dave :)


---