You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by "Shu Zhang (Created) (JIRA)" <ji...@apache.org> on 2011/12/17 00:52:30 UTC

[jira] [Created] (FLUME-889) All events in memory channel are lost on reconfiguration

All events in memory channel are lost on reconfiguration
--------------------------------------------------------

                 Key: FLUME-889
                 URL: https://issues.apache.org/jira/browse/FLUME-889
             Project: Flume
          Issue Type: Bug
          Components: Channel
    Affects Versions: NG alpha 2, NG alpha 1
            Reporter: Shu Zhang
            Assignee: Shu Zhang
             Fix For: NG alpha 3


this line is at the end MemoryChannel.configure(Context)

    queue = new LinkedBlockingDeque<StampedEvent>(capacity);

memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188995#comment-13188995 ] 

Juhani Connolly commented on FLUME-889:
---------------------------------------

The more I look over the rest of this, the less comfortable I am with it. In isolation it definitely isn't thread safe, and I don't really think it's a good idea to make assumptions about the usage patterns. I'll go over it again, and may add extra unit tests.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Shu Zhang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188942#comment-13188942 ] 

Shu Zhang commented on FLUME-889:
---------------------------------

I don't seem capable of re-assigning this ticket to Juhani, if anyone can, please do so. Thanks.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Shu Zhang (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188928#comment-13188928 ] 

Shu Zhang edited comment on FLUME-889 at 1/19/12 5:37 AM:
----------------------------------------------------------

Thanks for taking care of this Juhani, I have not had time to work on this.

I do have few comments though. First, it looks like you might have a thread safety issue (and I think likely, configure() will likely be called from different threads than those calling puts and takes). Here's the problematic block in the resizeQueue method. 

      LinkedBlockingDeque<StampedEvent> lbQueue = (LinkedBlockingDeque<StampedEvent>)queue;
      ...
      queue = new LinkedBlockingDeque<StampedEvent>(capacity);
      queue.addAll(lbQueue);

The queue is not locked through those last two statements, so other threads/methods can potentially manipulate 'queue' between the execution of those last two statements. The first problem is, any events added in between the execution of those 2 statements will end up being higher up in the queue than events added in the past. Ordering in the queue appears to matter here, see for example undoPut(). 
Potentially even more problematic is the scenario where two configure() calls are triggered close to each other from 2 threads; in that case, we could potentially have complete data loss in the queue.

My second comment is on disallowing shrinking the capacity. It's not a big deal, but it seems like an arbitrary limitation, and I'm against that sort of thing in principle. If we have more events than a reconfigured capacity, it seems right to keep all old events but in the future disallow more events than the current allowed capacity.

I have a suggestion for an approach that can solve the thread safety issue, and remove the arbitrary capacity resizing limitation without adding performance overhead. Please consider: maintain an ordered list or a queue of queues where the last queue is 'current'. We only add to the last (or current queue), but we keep the old ones around until everything's been dequeued from them. This potentially requires no synchronization to ensure correctness regardless of concurrency and it also makes it simple to implement the capacity shrinking scheme that I think makes sense.

Again, sorry I haven't had to time to work on this, and thanks for taking care of it.
                
      was (Author: shuzhang):
    Thanks for taking care of this Juhani, I have not had time to work on this.

I do have few comments though. First, it looks like you might have a thread safety issue (and I think likely, configure() will likely be called from different threads than those calling puts and takes). Here's the problematic block in the resizeQueue method. 

      LinkedBlockingDeque<StampedEvent> lbQueue = (LinkedBlockingDeque<StampedEvent>)queue;
      ...
      queue = new LinkedBlockingDeque<StampedEvent>(capacity);
      queue.addAll(lbQueue);

The queue is not locked through those last two statements, so other threads/methods can potentially manipulate 'queue' between the execution of those last two statements. The first problem is, any events added in between the execution of those 2 statements will end up being higher up in the queue than events added in the past. Ordering in the queue appears to matter here, see for example undoPut() among other examples. 
Potentially even more problematic is the scenario where two configure() calls are triggered close to each other from 2 threads; in that case, we could potentially have complete data loss in the queue.

My second comment is on disallowing shrinking the capacity. It's not a big deal, but it seems like an arbitrary limitation, and I'm against that sort of thing in principle. If we have too more events than a reconfigured capacity, it seems right to keep all old events but in the future disallow more events than the current allowed capacity.

I have a suggestion for an approach that can solve the thread safety issue, and remove the arbitrary capacity resizing limitation without adding performance overhead. Please consider: maintain an ordered list or a queue of queues where the last queue is 'current'. We only add to the last (or current queue), but we keep the old ones around until everything's been dequeued from them. This requires no synchronization to ensure correctness regardless of concurrency and it also makes it simple to implement the capacity shrinking scheme that I think makes sense.

Again, sorry I haven't had to time to work on this, and thanks for taking care of it.
                  
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juhani Connolly updated FLUME-889:
----------------------------------

    Attachment: FLUME-889-2.patch

Here's the patch.

I'm somewhat concerned however about the threading issue mentioned.
Following the code as it is, my understanding was that configurations are going to get run through doLoad in AbstractConfigurationProvider, which is thread safe. Please correct me if I'm wrong on that, and I'll fix the issue
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189032#comment-13189032 ] 

Juhani Connolly commented on FLUME-889:
---------------------------------------

While going through the thread safety issues, it turned out that the channel itself has some problems which I made another JIRA for along with a test-case to demonstrate it https://issues.apache.org/jira/browse/FLUME-936
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188877#comment-13188877 ] 

jiraposter@reviews.apache.org commented on FLUME-889:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3524/
-----------------------------------------------------------

(Updated 2012-01-19 02:03:00.640334)


Review request for Flume.


Changes
-------

Changed to abort capacity change if it would result in dataloss, updating the test to reflect this.
Also lost all the trailing spaces


Summary
-------

Modified configure to check for an exisitng deque, and copy across data if the size has changed. If the new capacity is smaller than the number of items remaining in the old one, data is still lost, with a logger warning.


This addresses bug FLUME-889.
    https://issues.apache.org/jira/browse/FLUME-889


Diffs (updated)
-----

  /branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java 1228002 
  /branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java 1228002 

Diff: https://reviews.apache.org/r/3524/diff


Testing
-------

Created new unit test to check assumptions made along with capacity limits.
New unit test passes
All existing unit tests are fine(except for ExecSource which fails because of development environment)


Thanks,

Juhani


                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "E. Sammer (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

E. Sammer reassigned FLUME-889:
-------------------------------

    Assignee: Juhani Connolly  (was: Shu Zhang)

Assigning to Juhani per Shu's request.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Juhani Connolly
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197423#comment-13197423 ] 

Juhani Connolly commented on FLUME-889:
---------------------------------------

I'm fixing this issue in https://issues.apache.org/jira/browse/FLUME-936 where I'm rewriting memorychannel around flume-935 and fixing the threading issues, so this issue can be cancelled
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Juhani Connolly
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch, FLUME-889-4.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Prasad Mujumdar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188914#comment-13188914 ] 

Prasad Mujumdar commented on FLUME-889:
---------------------------------------

Hello Juhani,

     Could you please attach the updated patch to the Jira (and make sure to select grant license to Apache). I shall go ahead and commit it to NG branch. 
Thanks for taking care of the issue !

thanks
Prasad

                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Shu Zhang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188928#comment-13188928 ] 

Shu Zhang commented on FLUME-889:
---------------------------------

Thanks for taking care of this Juhani, I have not had time to work on this.

A few comments though. First, it looks like you might have some thread safety issues (and configure() will likely be called from a different thread than the thread executing puts and takes).

      queue = new LinkedBlockingDeque<StampedEvent>(capacity);
      queue.addAll(lbQueue);

Those 2 statement are not atomic, which means any events added in between the execution of those 2 statements will end up being higher up in the queue than events added in the past.
Ordering in the queue appears to matter in the current implementation, see for example undoPut() among other examples. More potentially problematic is another configure call in between those statements' execution, events I think will simply be lost.

Also I don't think there's a good reason to disallow shrinking the capacity. It's not a big deal, but seems like sort of an arbitrary limitation. If we have too many events already, we can keep those around and in the future not allowing more events to come in than the capacity.

One approach that might solve both the the thread safety issue, and remove the arbitrary resizing limitation is to maintain an ordered list of queues where the last queue is the 'current'. We keep the old ones around until everything's been dequeued from them. This requires no synchronization to ensure correctness and it's not any more difficult to shrink capacity.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Arvind Prabhakar (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar resolved FLUME-889.
------------------------------------

    Resolution: Fixed

Resolving this issue as it will be addressed by work done in FLUME-936.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Juhani Connolly
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch, FLUME-889-4.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Prasad Mujumdar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190203#comment-13190203 ] 

Prasad Mujumdar commented on FLUME-889:
---------------------------------------

@Juhani,  the patch is not applying cleanly. If you are using git, create the patch with 'git diff --no-prefix' . Could you please give another try to generate the patch file ?
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Shu Zhang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189357#comment-13189357 ] 

Shu Zhang commented on FLUME-889:
---------------------------------

"In isolation it definitely isn't thread safe, and I don't really think it's a good idea to make assumptions about the usage patterns. I'll go over it again, and may add extra unit tests."

I think that's a good idea since configure() is public and some of the javadoc comments suggests the channel is thread safe. Thanks.

Also it looks like AbstractFileConfigurationProvider would not prevent put()s and remove()s from being called concurrently with configure() calls, which as things stand, can corrupt internal state.
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188910#comment-13188910 ] 

jiraposter@reviews.apache.org commented on FLUME-889:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3524/#review4463
-----------------------------------------------------------

Ship it!


lgtm

- Prasad


On 2012-01-19 02:03:00, Juhani Connolly wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3524/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-19 02:03:00)
bq.  
bq.  
bq.  Review request for Flume.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Modified configure to check for an exisitng deque, and copy across data if the size has changed. If the new capacity is smaller than the number of items remaining in the old one, data is still lost, with a logger warning.
bq.  
bq.  
bq.  This addresses bug FLUME-889.
bq.      https://issues.apache.org/jira/browse/FLUME-889
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java 1228002 
bq.    /branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java 1228002 
bq.  
bq.  Diff: https://reviews.apache.org/r/3524/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Created new unit test to check assumptions made along with capacity limits.
bq.  New unit test passes
bq.  All existing unit tests are fine(except for ExecSource which fails because of development environment)
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Juhani
bq.  
bq.


                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188340#comment-13188340 ] 

Juhani Connolly commented on FLUME-889:
---------------------------------------

Added a fix to this to review board: https://reviews.apache.org/r/3524/
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188810#comment-13188810 ] 

jiraposter@reviews.apache.org commented on FLUME-889:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3524/#review4456
-----------------------------------------------------------


In my opinion its better not re-size when the new size is smaller that number of events in the queue. The warning in log will can indicate that the queue was not re-sized. 

The changes overall look fine to me (few spacing nits).

- Prasad


On 2012-01-18 08:47:35, Juhani Connolly wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3524/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-18 08:47:35)
bq.  
bq.  
bq.  Review request for Flume.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Modified configure to check for an exisitng deque, and copy across data if the size has changed. If the new capacity is smaller than the number of items remaining in the old one, data is still lost, with a logger warning.
bq.  
bq.  
bq.  This addresses bug FLUME-889.
bq.      https://issues.apache.org/jira/browse/FLUME-889
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java 1228002 
bq.    /branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java 1228002 
bq.  
bq.  Diff: https://reviews.apache.org/r/3524/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Created new unit test to check assumptions made along with capacity limits.
bq.  New unit test passes
bq.  All existing unit tests are fine(except for ExecSource which fails because of development environment)
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Juhani
bq.  
bq.


                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188340#comment-13188340 ] 

Juhani Connolly edited comment on FLUME-889 at 1/18/12 9:06 AM:
----------------------------------------------------------------

I presume that Mr. Zhang isn't actually working on this, and should be able to do it.

Added a fix to this to review board: https://reviews.apache.org/r/3524/
                
      was (Author: juhanic):
    Added a fix to this to review board: https://reviews.apache.org/r/3524/
                  
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Shu Zhang (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188928#comment-13188928 ] 

Shu Zhang edited comment on FLUME-889 at 1/19/12 5:35 AM:
----------------------------------------------------------

Thanks for taking care of this Juhani, I have not had time to work on this.

I do have few comments though. First, it looks like you might have a thread safety issue (and I think likely, configure() will likely be called from different threads than those calling puts and takes). Here's the problematic block in the resizeQueue method. 

      LinkedBlockingDeque<StampedEvent> lbQueue = (LinkedBlockingDeque<StampedEvent>)queue;
      ...
      queue = new LinkedBlockingDeque<StampedEvent>(capacity);
      queue.addAll(lbQueue);

The queue is not locked through those last two statements, so other threads/methods can potentially manipulate 'queue' between the execution of those last two statements. The first problem is, any events added in between the execution of those 2 statements will end up being higher up in the queue than events added in the past. Ordering in the queue appears to matter here, see for example undoPut() among other examples. 
Potentially even more problematic is the scenario where two configure() calls are triggered close to each other from 2 threads; in that case, we could potentially have complete data loss in the queue.

My second comment is on disallowing shrinking the capacity. It's not a big deal, but it seems like an arbitrary limitation, and I'm against that sort of thing in principle. If we have too more events than a reconfigured capacity, it seems right to keep all old events but in the future disallow more events than the current allowed capacity.

I have a suggestion for an approach that can solve the thread safety issue, and remove the arbitrary capacity resizing limitation without adding performance overhead. Please consider: maintain an ordered list or a queue of queues where the last queue is 'current'. We only add to the last (or current queue), but we keep the old ones around until everything's been dequeued from them. This requires no synchronization to ensure correctness regardless of concurrency and it also makes it simple to implement the capacity shrinking scheme that I think makes sense.

Again, sorry I haven't had to time to work on this, and thanks for taking care of it.
                
      was (Author: shuzhang):
    Thanks for taking care of this Juhani, I have not had time to work on this.

A few comments though. First, it looks like you might have some thread safety issues (and configure() will likely be called from a different thread than the thread executing puts and takes).

      queue = new LinkedBlockingDeque<StampedEvent>(capacity);
      queue.addAll(lbQueue);

Those 2 statement are not atomic, which means any events added in between the execution of those 2 statements will end up being higher up in the queue than events added in the past.
Ordering in the queue appears to matter in the current implementation, see for example undoPut() among other examples. More potentially problematic is another configure call in between those statements' execution, events I think will simply be lost.

Also I don't think there's a good reason to disallow shrinking the capacity. It's not a big deal, but seems like sort of an arbitrary limitation. If we have too many events already, we can keep those around and in the future not allowing more events to come in than the capacity.

One approach that might solve both the the thread safety issue, and remove the arbitrary resizing limitation is to maintain an ordered list of queues where the last queue is the 'current'. We keep the old ones around until everything's been dequeued from them. This requires no synchronization to ensure correctness and it's not any more difficult to shrink capacity.
                  
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juhani Connolly updated FLUME-889:
----------------------------------

    Attachment: FLUME-889-4.patch

Didn:t notice the problem with the patch.
Resubmitting... I'm using svn but the setup is kind of borked right now. I manually editted the paths on the index so it should work properly against the intended branch
                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Juhani Connolly
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch, FLUME-889-4.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188341#comment-13188341 ] 

jiraposter@reviews.apache.org commented on FLUME-889:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3524/
-----------------------------------------------------------

Review request for Flume.


Summary
-------

Modified configure to check for an exisitng deque, and copy across data if the size has changed. If the new capacity is smaller than the number of items remaining in the old one, data is still lost, with a logger warning.


This addresses bug FLUME-889.
    https://issues.apache.org/jira/browse/FLUME-889


Diffs
-----

  /branches/flume-728/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java 1228002 
  /branches/flume-728/flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java 1228002 

Diff: https://reviews.apache.org/r/3524/diff


Testing
-------

Created new unit test to check assumptions made along with capacity limits.
New unit test passes
All existing unit tests are fine(except for ExecSource which fails because of development environment)


Thanks,

Juhani


                
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (FLUME-889) All events in memory channel are lost on reconfiguration

Posted by "Juhani Connolly (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FLUME-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188976#comment-13188976 ] 

Juhani Connolly edited comment on FLUME-889 at 1/19/12 7:50 AM:
----------------------------------------------------------------

Here's the patch.

I'm somewhat concerned however about the threading issue mentioned.
Following the code as it is, my understanding was that configurations are going to get run through doLoad in AbstractConfigurationProvider, which is thread safe. Please correct me if I'm wrong on that, and I'll fix the issue. 

As far as the disallowing of the capacity resizing, that could be a valid concern though it would make the simple code more finicky.  I guess it depends on how important others feel it would be to others, I can revisit this if people feel it's a big deal.
                
      was (Author: juhanic):
    Here's the patch.

I'm somewhat concerned however about the threading issue mentioned.
Following the code as it is, my understanding was that configurations are going to get run through doLoad in AbstractConfigurationProvider, which is thread safe. Please correct me if I'm wrong on that, and I'll fix the issue
                  
> All events in memory channel are lost on reconfiguration
> --------------------------------------------------------
>
>                 Key: FLUME-889
>                 URL: https://issues.apache.org/jira/browse/FLUME-889
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: NG alpha 1, NG alpha 2
>            Reporter: Shu Zhang
>            Assignee: Shu Zhang
>             Fix For: v1.1.0
>
>         Attachments: FLUME-889-2.patch
>
>
> this line is at the end MemoryChannel.configure(Context)
>     queue = new LinkedBlockingDeque<StampedEvent>(capacity);
> memory channel is meant to be dynamically configurable, however every time it's reconfigured, all existing events are dropped.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira