Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/04/16 10:59:19 UTC

[GitHub] [incubator-doris] liutang123 opened a new issue #5668: Stream load may exec long time when rpc slow

liutang123 opened a new issue #5668:
URL: https://github.com/apache/incubator-doris/issues/5668


   **Describe the bug**
   When RPC calls to a BE are slow, a stream load can run far longer than its configured timeout.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Make the RPC calls slow. For example, sleep 60s in `PInternalServiceImpl::tablet_writer_add_batch`.
   2. Execute a stream load with the timeout set to 70s.
   3. Each batch then takes 60s, so the total time may be very large.
   
   **Expected behavior**
   The total time should not exceed the stream load timeout.
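   
   One way to read this expectation is as a deadline budget: each RPC gets at most the time remaining before the load's deadline, not a full timeout of its own. The sketch below is only an illustration of that idea. `next_rpc_timeout_ms` and its parameters are hypothetical names and are not part of Doris.
   
   ```cpp
   #include <algorithm>
   #include <chrono>
   #include <cstdint>
   
   // Hypothetical helper, not Doris code: returns the timeout (in ms) to use
   // for the next RPC so that the load as a whole cannot outlive its deadline.
   // A return value of 0 means the budget is exhausted and the load should fail.
   inline int64_t next_rpc_timeout_ms(std::chrono::steady_clock::time_point load_start,
                                      std::chrono::steady_clock::time_point now,
                                      int64_t load_timeout_ms,
                                      int64_t max_rpc_timeout_ms) {
       using std::chrono::duration_cast;
       using std::chrono::milliseconds;
       // Time already spent on this load.
       const int64_t elapsed_ms = duration_cast<milliseconds>(now - load_start).count();
       // Time left before the load-level deadline.
       const int64_t remaining_ms = load_timeout_ms - elapsed_ms;
       if (remaining_ms <= 0) return 0;
       // Never give a single RPC more than what is left for the whole load.
       return std::min(remaining_ms, max_rpc_timeout_ms);
   }
   ```
   
   With the reproduction above (70s load timeout, 60s per batch), the second batch would be given only 10s under this scheme, and a third would be given none.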
   
   **Screenshots**
   In our cluster:
   BE stack:
   ```
   Thread 355 (Thread 0x7f98f057e700 (LWP 108203)):
   #0  0x00007f9932ecae3d in nanosleep () from /lib64/libpthread.so.0
   #1  0x000000000162ac56 in base::SleepForNanoseconds (nanoseconds=<optimized out>) at /opt/meituan/jenkins/workspace/big_data_component/doris-build-package/palo/be/src/gutil/sysinfo.cc:95
   #2  0x0000000001534046 in doris::stream_load::NodeChannel::close_wait (this=this@entry=0x1807c96b40, state=0x2b4796300) at /opt/meituan/jenkins/workspace/big_data_component/doris-build-package/palo/be/src/exec/tablet_sink.cpp:255
   ```
   BE log:
   ```
   W0406 23:39:59.249436 108409 tablet_sink.h:123] failed to send brpc batch, error=RPC call is timed out, error_text=[E1008]Reached timeout=1800,000ms @10.16.97.120:8060
   W0406 23:39:59.249523 108409 tablet_sink.cpp:144] NodeChannel[225744932-222983563] add batch req rpc failed, load_id=c2407b0396faa65b-b9db45a92f6b6ead, txn_id=2585192528, node=10.16.97.120:8060
   I0406 23:39:59.255098 108203 plan_fragment_executor.cpp:579] Fragment c2407b0396faa65b-b9db45a92f6b6eae:(Active: 11h22m, non-child: 0.00%)
   
   ...
    OlapTableSink:(Active: 11h22m, non-child: 99.99%)
        - CloseWaitTime: 11h21m
        - ConvertBatchTime: 0.000ns
        - NonBlockingSendTime: 0.000ns
        - OpenTime: 97.053ms
        - RowsFiltered: 0
        - RowsRead: 3.34M
        - RowsReturned: 3.34M
        - SendDataTime: 4s714ms
        - SerializeBatchTime: 12s223ms
        - ValidateDataTime: 2s197ms
   ```
   The stream load timeout is 1800s (the log reports `timeout=1800,000ms`), but the OlapTableSink spent more than 11 hours in `close_wait`.
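   
   The stack above shows `close_wait` sleeping, which suggests a polling wait. As a rough sketch of how such a wait could be bounded by a deadline, consider the helper below. `wait_with_deadline` is a hypothetical name and this is not the actual Doris implementation; it only illustrates giving up once the timeout has passed.
   
   ```cpp
   #include <chrono>
   #include <functional>
   #include <thread>
   
   // Hypothetical helper, not Doris code: polls `done` until it returns true
   // or `timeout` has elapsed. Returns true only if `done` became true in time.
   inline bool wait_with_deadline(const std::function<bool()>& done,
                                  std::chrono::milliseconds timeout,
                                  std::chrono::milliseconds poll_interval) {
       const auto deadline = std::chrono::steady_clock::now() + timeout;
       while (!done()) {
           // Stop waiting once the deadline has passed so the caller can
           // cancel the load instead of sleeping indefinitely.
           if (std::chrono::steady_clock::now() >= deadline) return false;
           std::this_thread::sleep_for(poll_interval);
       }
       return true;
   }
   ```
   
   A caller would treat a `false` result as a timeout and fail the load, so the wait can never exceed the remaining load budget by more than one poll interval.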
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman closed issue #5668: Stream load may exec long time when rpc slow

Posted by GitBox <gi...@apache.org>.
morningman closed issue #5668:
URL: https://github.com/apache/incubator-doris/issues/5668


   

