Posted to codereview@trafodion.apache.org by mashengchen <gi...@git.apache.org> on 2018/08/14 03:37:18 UTC

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

GitHub user mashengchen opened a pull request:

    https://github.com/apache/trafodion/pull/1694

    [TRAFODION-3183]  fetch huge data give rise to core

    On each fetch, check whether the fetch size is bigger than 1 GB; if it is, divide the fetch into multiple round trips based on fetchSize / 1 GB.
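A minimal standalone sketch of the clamping arithmetic this patch adds to TrafT4ResultSet.next() (the class and method names below are illustrative, not driver API):

```java
// Sketch of the patch's logic: if rowLength * maxRowCnt would reach 1 GB,
// shrink maxRowCnt so each round trip stays under 1 GB.
public class FetchClampSketch {
    static final long ONE_GB = 1024L * 1024L * 1024L;

    static int clampFetchRows(long rowLength, int maxRowCnt) {
        if (rowLength <= 0 || maxRowCnt <= 0) {
            return maxRowCnt; // nothing to clamp
        }
        if (rowLength * maxRowCnt >= ONE_GB) {
            // number of round trips needed to keep each fetch under 1 GB
            double multi = Math.ceil((rowLength * maxRowCnt) / (double) ONE_GB);
            maxRowCnt = (int) (maxRowCnt / multi);
        }
        return maxRowCnt;
    }

    public static void main(String[] args) {
        // 10 KB rows, 200,000 rows requested: ~2 GB, so split into 2 fetches
        System.out.println(clampFetchRows(10 * 1024, 200_000)); // 100000
        // 100-byte rows, 1,000 rows: well under 1 GB, so unchanged
        System.out.println(clampFetchRows(100, 1_000)); // 1000
    }
}
```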

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mashengchen/trafodion TRAFODION-3183

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1694
    
----
commit bb68005e05988e471aa629a5759fdb5686a4f6bc
Author: aven <sh...@...>
Date:   2018-08-14T03:11:20Z

    [TRAFODION-3183]  fetch huge data give rise to core

----


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r219021318
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    MEMORY_LIMIT_ROWSET_IN_MB seems to be used for insert, not for fetch.


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r243471406
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Just spent some time looking at this change. As per the JDBC specification, Statement.setMaxRows sets the upper limit on the number of rows in the result set, and Statement.setFetchSize sets a hint for how many rows the driver should fetch in next(). maxRows_ and fetchSize_ in this method correspond to these values respectively. It is possible that the application doesn't set these values at all. In any case, the JDBC driver can decide the fetch size based on the hint or its own limitations. I would suggest the following:
    
    1. Move this logic for calculating the number of rows to getFetchSize.
    2. Use a data source property to get the fetch size in terms of MB rather than assuming 1 GB in the calculation. Assume a default value far less than 1 GB, say 50 or 100 MB, if the property is not set.
    3. Calculate the number of rows fetchSize_ based on setFetchSize and this property value, whichever doesn't exceed the size in terms of MB.
    4. Use getFetchSize() in this method to get the fetchSize_.
    
    This would ensure that Trafodion conforms to the JDBC specification in a better way, because it would let the application know how the setFetchSize hint works.
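A hypothetical sketch of the suggestion above: derive the effective fetch size from the application's setFetchSize() hint and a data-source property expressed in MB. The names fetchSizeHint, fetchBufferSizeMB, and rowLength are assumptions for illustration, not the actual driver fields:

```java
// Sketch of points 2 and 3: honor the setFetchSize() hint only when it
// stays within a configurable MB budget; default the budget to 100 MB.
public class FetchSizeSketch {
    private final int fetchSizeHint;     // from Statement.setFetchSize(); 0 means no hint
    private final int fetchBufferSizeMB; // proposed data-source property; 0 means unset
    private final long rowLength;        // row width from the output descriptor

    FetchSizeSketch(int fetchSizeHint, int fetchBufferSizeMB, long rowLength) {
        this.fetchSizeHint = fetchSizeHint;
        this.fetchBufferSizeMB = fetchBufferSizeMB;
        this.rowLength = rowLength;
    }

    int getFetchSize() {
        // default far below 1 GB when the property is unset, per point 2
        int sizeMB = (fetchBufferSizeMB > 0) ? fetchBufferSizeMB : 100;
        long limitBytes = sizeMB * 1024L * 1024L;
        // the largest row count that stays within the MB budget
        int rowsByLimit = (int) Math.max(1, limitBytes / Math.max(1, rowLength));
        // honor the hint only when it does not exceed the budget, per point 3
        return (fetchSizeHint > 0) ? Math.min(fetchSizeHint, rowsByLimit) : rowsByLimit;
    }

    public static void main(String[] args) {
        // 1 MB rows under the default 100 MB budget: at most 100 rows per fetch
        System.out.println(new FetchSizeSketch(500, 0, 1024 * 1024).getFetchSize()); // 100
        // a hint of 50 rows is already within the budget, so it is honored
        System.out.println(new FetchSizeSketch(50, 0, 1024 * 1024).getFetchSize()); // 50
    }
}
```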


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r209965872
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    The server side is supposed to limit the size both at compile time and at run time. Do you mean that limitation didn't kick in?


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r243514685
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Added a note. There can be situations where a single fetch is larger than 1 GB, such as when selecting Hive data or fetching LOB data; in those cases mxosrvr will go down.


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r243471435
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    BTW, sorry for not responding sooner


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r243512059
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Thank you for the suggestion; I agree with points 1, 3, and 4. As for the remaining point, the 1 GB limit currently is the maximum fetch size; if a larger value were set, it could cause an mxosrvr core. Also, the purpose of this PR is to limit the fetch size, so I don't think we need to give users a way to set the maximum fetch size (what should happen if a user sets it to 10 GB?). So I think we should only apply the limitation when the fetch size is larger than 1 GB, reducing it to 1 GB.
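The compromise argued for here could be sketched as follows; the class and method names are hypothetical, purely to illustrate the design choice:

```java
// Even if a configurable per-fetch byte limit were exposed, cap the
// effective value at 1 GB so a user setting of, say, 10 GB cannot push
// mxosrvr past the known-safe bound.
public class FetchLimitCap {
    static final long ONE_GB = 1024L * 1024L * 1024L;

    static long effectiveLimitBytes(long requestedLimitBytes) {
        if (requestedLimitBytes <= 0) {
            return ONE_GB; // unset or invalid: fall back to the hard cap
        }
        return Math.min(requestedLimitBytes, ONE_GB); // never exceed 1 GB
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimitBytes(10L * ONE_GB));       // clamped to 1 GB
        System.out.println(effectiveLimitBytes(512L * 1024 * 1024)); // kept as-is
    }
}
```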


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r210374167
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    In that case, a similar issue will be seen with ODBC applications and Type 2 driver applications.
    
    It looks like the CQD MEMORY_LIMIT_ROWSET_IN_MB limits the rowset size at compile time, but doesn't limit it at run time. It is possible to change the rowset size at run time without specifying it at compile time. In that case, it would be good if we used the same CQD value to limit it at run time too.


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r233000726
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Hi Selva, this just changes maxRowCnt to a reasonable value. No need to loop; JDBC will then fetch rows according to the changed maxRowCnt.


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r210172301
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Yes, the server side doesn't work well here.
    The fetch row count is sent from the client side, so I think a server-side limitation is not suitable.


---

[GitHub] trafodion pull request #1694: [TRAFODION-3183] fetch huge data give rise to ...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1694#discussion_r232720071
  
    --- Diff: core/conn/jdbcT4/src/main/java/org/trafodion/jdbc/t4/TrafT4ResultSet.java ---
    @@ -2779,6 +2779,19 @@ public boolean next() throws SQLException {
     					maxRowCnt = maxRows - totalRowsFetched_;
     				}
     
    +        // if (row width) * (fetch rows) too large, there will have core in server side.
    +        // once fetch bytes bigger than 1GB, devide several times to fetch,
    +        // each time fetch bytes less than 1GB.
    +        if (outputDesc_ != null && outputDesc_[0] != null) {
    +          long rowLength = outputDesc_[0].rowLength_;
    +          long oneGB = 1024 * 1024 * 1024;
    +          if (rowLength * maxRowCnt >= oneGB) {
    +            double multi = (rowLength * maxRowCnt) / (double)oneGB;
    +            multi = Math.ceil(multi); // devide several times to fetch
    +            maxRowCnt = (int) (maxRowCnt / multi);
    +          }
    +        }
    +
    --- End diff --
    
    Yes @mashengchen, you are correct. I would think it is better to move this logic to the server side if possible. Also, is there a need to loop around until maxRowCnt or EOF is reached? If not, I am OK with the PR.


---