Posted to user@flume.apache.org by Shara Shi <sh...@dhgate.com> on 2012/08/28 07:08:57 UTC

Re: Re: Re: HDFS SINK Performance

Hi Anchlia,

 

If I use hadoop fs -put xxx xxx, the performance is OK, much faster than
Flume's.
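
Roughly, that comparison looks like this (the file name and target path here are just placeholders):

    # time a direct write of a sample file to HDFS
    time hadoop fs -put sample.log /user/root/flume/test/sample.log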

 

Regards

Shara

 

From: Mohit Anchlia [mailto:mohitanchlia@gmail.com]
Sent: August 28, 2012 12:49
To: user@flume.apache.org
Subject: Re: Re: Re: HDFS SINK Performance

 

Do you get better performance when you write directly to the cluster? Can
you run some tests writing to the cluster directly and compare?

On Mon, Aug 27, 2012 at 8:19 PM, Shara Shi <sh...@dhgate.com> wrote:

Hi Denny

 

It is 20MB/min, I confirmed.

I sent data with avro-client from a local machine to the Flume agent and
really got only 20MB/min, so I am trying to find out why.
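
The test was run with something like this (the host name and input file are placeholders; the port matches the avro source in the config below):

    bin/flume-ng avro-client -H collector2-host -p 41415 -F /tmp/test.log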

 

Regards 

Shara

From: Denny Ye [mailto:dennyy99@gmail.com]
Sent: August 28, 2012 11:02
To: user@flume.apache.org
Subject: Re: Re: HDFS SINK Performance

 

20MB/min or 20MB/sec?

I suspect that may be a typo. Can you confirm it?

 

-Regards

Denny Ye

2012/8/28 Shara Shi <sh...@dhgate.com>

Hi Denny

 

A throughput of 45MB/sec would be OK for me.

But I only got 20MB per minute.

What's wrong with my configuration?

 

Regards

Shara

 

 

From: Denny Ye [mailto:dennyy99@gmail.com]
Sent: August 27, 2012 20:05
To: user@flume.apache.org
Subject: Re: HDFS SINK Performance

 

Hi Shara,

    You are using MemoryChannel as the repository. In my local test with
updated code, the outcome was 45MB/sec without full GC. Is this your goal,
or do you need even higher throughput?

 

-Regards

Denny Ye

2012/8/27 Shara Shi <sh...@dhgate.com>

Hi All, 

 

However I tune the HDFS sink parameters, I can't get more than 20MB per
minute.

Is that normal? It seems weird to me.

How can I improve it?

 

Regards

Ruihong Shi

==========================================

 

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

 

# Define a memory channel called ch2 on collector2
collector2.channels.ch2.type = memory
collector2.channels.ch2.capacity = 500000
collector2.channels.ch2.keep-alive = 1

 

 

# Define an Avro source called avro-source1 on collector2 and tell it
# to bind to 0.0.0.0:41415. Connect it to channel ch2.
collector2.sources.avro-source1.channels = ch2
collector2.sources.avro-source1.type = avro
collector2.sources.avro-source1.bind = 0.0.0.0
collector2.sources.avro-source1.port = 41415
collector2.sources.avro-source1.threads = 10

 

 

# Define an HDFS sink
collector2.sinks.hdfs.channel = ch2
collector2.sinks.hdfs.type = hdfs
collector2.sinks.hdfs.hdfs.path = hdfs://namenode:8020/user/root/flume/webdata/exec/%Y/%m/%d/%H
# the HDFS sink reads its batch size from the hdfs.batchSize key
collector2.sinks.hdfs.hdfs.batchSize = 50000
collector2.sinks.hdfs.runner.type = polling
collector2.sinks.hdfs.runner.polling.interval = 1
collector2.sinks.hdfs.hdfs.rollInterval = 120
collector2.sinks.hdfs.hdfs.rollSize = 0
collector2.sinks.hdfs.hdfs.rollCount = 300000
collector2.sinks.hdfs.hdfs.fileType = DataStream
collector2.sinks.hdfs.hdfs.round = true
collector2.sinks.hdfs.hdfs.roundValue = 10
collector2.sinks.hdfs.hdfs.roundUnit = minute
collector2.sinks.hdfs.hdfs.threadsPoolSize = 10
collector2.sinks.hdfs.hdfs.rollTimerPoolSize = 10

 

# Finally, now that we've defined all of our components, tell
# collector2 which ones we want to activate.
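
The quoted config ends here; based on the component names above, the activation lines the comment refers to would look something like:

collector2.channels = ch2
collector2.sources = avro-source1
collector2.sinks = hdfs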