You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2015/01/16 20:34:35 UTC

[jira] [Commented] (FLUME-2581) File Channel replay speed improvement

    [ https://issues.apache.org/jira/browse/FLUME-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280711#comment-14280711 ] 

Otis Gospodnetic commented on FLUME-2581:
-----------------------------------------

[~roshan_naik] - so is the point that performance issue has already been fixed and this should be resolved as Won't Fix?

> File Channel replay speed improvement
> -------------------------------------
>
>                 Key: FLUME-2581
>                 URL: https://issues.apache.org/jira/browse/FLUME-2581
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.5.1
>            Reporter: Roshan Naik
>
> The following two jiras were meant to address file channel replay performance in v1.4:
> FLUME-2155 committed in flume-1.5 (released)
> FLUME-2450 committed in flume-1.6 (currently trunk)
> My measurements are showing that file channel replay speed was actually fastest in flume 1.4 (prior to these jiras being committed).  It was significantly slower in flume 1.5 (due to flume FLUME-2155). 
> FLUME-2450 has brought it back to near 1.4 speed. So net effect is almost 0. 
> For measuring i wrote a unit test that pumped 5 million events and drained about half of it. Every time a batch of events was inserted, about half of them were drained.  At the end the FC was left with about 2.5 mill events. At this point i restart the FC and measure how long it took to come up.
> Here are a couple readings for each flume version:
> 1.4 :  34 sec,   34 sec
> 1.5 : 236 sec, 295 sec
> 1.6 :  34 sec,    36 sec
> below is test code that i added in TestCheckpointRebuilder
> {code}
>   // Starts with empty FC, pumps and drains some data, does a chkPt,
>   // again pumps and drains some data, stops fc, measures replay speed
>   @Test
>   public void testReplaySpeed_WithDrain() throws Exception {
>     //1 start FC
>     Map<String, String> overrides = Maps.newHashMap();
>     overrides.put(FileChannelConfiguration.CAPACITY,
>             String.valueOf(6000000));
>     overrides.put(FileChannelConfiguration.TRANSACTION_CAPACITY,
>             String.valueOf(5000));
>     channel = createFileChannel(overrides);
>     channel.start();
>     //2 pump and drain data
>     Assert.assertTrue(channel.isOpen());
>     fillAndDrainChannel(channel, "test");
>     //3 stop
>     channel.stop();
>     channel.close();
>     // 4 instantiate new fc
>     channel = createFileChannel(overrides);
>     //5 start FC .. measure replay
>     System.out.println("=========== > CLOCK STARTS NOW ");
>     long start = System.currentTimeMillis();
>     channel.start();
>     long end = System.currentTimeMillis();
>     System.out.println("=========== > Time Taken : " + (end-start) + " sec");
>     Assert.assertTrue(channel.isOpen());
>     // 6 wrap up
>     channel.stop();
>     channel.close();
>   }
>   public static void fillAndDrainChannel(final Channel channel, final String prefix)
>           throws Exception {
>     int[] batchSizes = new int[] {
>             1000, 100, 10, 1
>     };
>     for (int i = 0; i < batchSizes.length; i++) {
>       try {
>         while(true) {
>           Set<String> batch = putEvents(channel, prefix, batchSizes[i],
>                   Integer.MAX_VALUE, true);
>           if(batch.isEmpty()) {
>             break;
>           }
>           TestUtils.takeEvents(channel, batch.size() / 2, batch.size() / 2);
>         }
>       } catch (ChannelException e) {
>         Assert.assertTrue(("The channel has reached it's capacity. This might "
>                 + "be the result of a sink on the channel having too low of batch "
>                 + "size, a downstream system running slower than normal, or that "
>                 + "the channel capacity is just too low. [channel="
>                 + channel.getName() + "]").equals(e.getMessage())
>                 || e.getMessage().startsWith("Put queue for FileBackedTransaction " +
>                 "of capacity "));
>       }
>     }
>   }
> {code}
> FYI: TestUtils.createFileChannel seems to set the capacity.. so i  commented it out for this test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)