Posted to dev@oozie.apache.org by "Andras Salamon (JIRA)" <ji...@apache.org> on 2018/11/12 11:17:00 UTC

[jira] [Updated] (OOZIE-3382) Optimize SshActionExecutor's drainBuffers method

     [ https://issues.apache.org/jira/browse/OOZIE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Salamon updated OOZIE-3382:
----------------------------------
    Description: 
OOZIE-3354 improved {{SshActionExecutor}} to avoid {{Process#waitFor()}} blocks and modified the {{drainBuffers}} method to keep draining the standard output (and standard error) continuously.

Right now the speed of the drain is hardwired. As long as the process is running, the method reads only 1024 bytes in each cycle (half a second), which can take a very long time if we want to drain several megabytes (for instance, when {{oozie.servlet.CallbackServlet.max.data.len}} is increased). At roughly 2 KB per second, draining 2 MB would take about 17 minutes.

Let's optimize the draining.

We can either read 1024 bytes multiple times in each cycle (as long as there is data in the buffer), or we can increase the buffer size (currently 1024).
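
A minimal sketch of the first option, to make the idea concrete. This is not the existing {{drainBuffers}} code: the class name {{BufferDrainer}}, the method {{drainAvailable}} and its parameters are hypothetical. It assumes the output is read through a {{java.io.Reader}} and that anything beyond the maximum length should still be consumed (so the process does not block on a full pipe) but not stored.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical helper for illustration only; not part of Oozie. */
final class BufferDrainer {
    private BufferDrainer() {
    }

    /**
     * Keeps reading for as long as the reader reports that data is ready,
     * instead of stopping after one buffer-sized read per cycle.
     * At most maxLength characters are kept in sink; the rest is read and dropped.
     *
     * @return the number of characters appended to sink by this call
     */
    static int drainAvailable(Reader reader, StringBuilder sink, int bufferSize, int maxLength)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        char[] buf = new char[bufferSize];
        int appended = 0;
        while (reader.ready()) {
            int read = reader.read(buf);
            if (read < 0) {
                break; // end of stream
            }
            // Store only what still fits under maxLength.
            int room = Math.max(0, maxLength - sink.length());
            int toStore = Math.min(read, room);
            sink.append(buf, 0, toStore);
            appended += toStore;
        }
        return appended;
    }
}
```

With a 1024-character buffer this still drains everything that is currently available in one cycle. A process that writes continuously could keep such a loop busy, so a real implementation would probably also want a per-cycle limit.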

In the latter case, the default buffer size could be half of the {{oozie.servlet.CallbackServlet.max.data.len}} value, but we also need an additional property to specify the buffer size (to avoid memory problems caused by a very big buffer). We can keep 1024 as the minimum buffer size.
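
The sizing rule could look roughly like the sketch below. It reads the additional property as an upper bound on the buffer size, which is an assumption, and the names {{DrainBufferSize}}, {{compute}} and {{maxBufferSize}} are hypothetical. The callers would pass in the configured values.

```java
/** Hypothetical helper for illustration only; not part of Oozie. */
final class DrainBufferSize {
    /** Minimum buffer size kept from the current implementation. */
    static final int MIN_BUFFER_SIZE = 1024;

    private DrainBufferSize() {
    }

    /**
     * @param maxDataLen    value of oozie.servlet.CallbackServlet.max.data.len
     * @param maxBufferSize value of the additional (assumed) upper-bound property
     * @return half of maxDataLen, limited by maxBufferSize, but never below 1024
     */
    static int compute(int maxDataLen, int maxBufferSize) {
        int half = maxDataLen / 2;
        int capped = Math.min(half, maxBufferSize);
        // The minimum wins even if the configured upper bound is smaller.
        return Math.max(MIN_BUFFER_SIZE, capped);
    }
}
```

For example, with a 2 MB maximum data length and a 64 KB upper bound the buffer would be 64 KB, while a small maximum data length would still give 1024.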

It would also be useful to refactor the code, move the buffer draining into a separate class, and create unit tests for that class. Using this class in {{ShellMain}} to avoid code duplication would also be very useful, but we have to fix OOZIE-3359 first.

  was:
OOZIE-3354 improved {{SshActionExecutor}} to avoid {{Process#waitFor()}} blocks and modified the {{drainBuffers}} method to keep draining the standard output (and standard error) continuously.

Right now the speed of the drain is hardwired. As long as the process is running, the method reads only 1024 bytes in each cycle (half a second), which can take a very long time if we want to drain several megabytes (for instance, when {{oozie.servlet.CallbackServlet.max.data.len}} is increased).

Let's optimize the draining.

We can either read 1024 bytes multiple times in each cycle (as long as there is data in the buffer), or we can increase the buffer size (currently 1024).

In the latter case, the default buffer size could be half of the {{oozie.servlet.CallbackServlet.max.data.len}} value, but we also need an additional property to specify the buffer size (to avoid memory problems caused by a very big buffer). We can keep 1024 as the minimum buffer size.


> Optimize SshActionExecutor's drainBuffers method
> ------------------------------------------------
>
>                 Key: OOZIE-3382
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3382
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Andras Salamon
>            Assignee: Andras Salamon
>            Priority: Major


