Posted to user@flume.apache.org by 周梦想 <ab...@gmail.com> on 2013/02/07 09:28:30 UTC

how can I know which file source from on collector sink?

Hi,
I want to gather log files with different formats and names from each agent, and
write each one to HDFS under a different file name prefix or a different directory, so
that I can tell which source file the data came from.

The agent sources look like this:
config [MAgent-44, text("D:\\TKServer\\_BakLog\\20130207 655game.log"),
batch(1000) agentDFOSink("192.168.10.48", 35853)]
config [MAgent-44, text("D:\\TKServer\\_BakLog\\20130207 655user.log"),
batch(1000) agentDFOSink("192.168.10.48", 35853)]

config [MAgent-45, text("D:\\TKServer\\_BakLog\\20130207 655game.log"),
batch(1000) agentDFOSink("192.168.10.48", 35853)]
config [MAgent-45, text("D:\\TKServer\\_BakLog\\20130207 655user.log"),
batch(1000) agentDFOSink("192.168.10.48", 35853)]


The collector looks like this:
config [co1, collectorSource( 35853 ), gunzip unbatch collectorSink(
"hdfs://hadoop48:54310/user/flume/%y%m/%d","%{host}-%{sourcefile}-")]

Note: %{sourcefile} does not exist.

The results should look like this:
-rw-r--r--   2 zhouhh supergroup    7309058 2013-02-07 16:14
/user/flume/1302/07/MAgent-44-game.log-20130207-161423054+0800.885170506522053.00035553
-rw-r--r--   2 zhouhh supergroup   17922102 2013-02-07 16:14
/user/flume/1302/07/MAgent-44-user.log-20130207-161453158+0800.885200610609053.00035551
-rw-r--r--   2 zhouhh supergroup   17854942 2013-02-07 16:15
/user/flume/1302/07/MAgent-45-game.log-20130207-161523249+0800.885230701184053.00035551
-rw-r--r--   2 zhouhh supergroup   17827087 2013-02-07 16:15
/user/flume/1302/07/MAgent-45-user.log-20130207-161553269+0800.885260721933053.00035551
-rw-r--r--   2 zhouhh supergroup   17820650 2013-02-07 16:16
/user/flume/1302/07/MAgent-44-game.log-20130207-161623290+0800.885290742023053.00035551

How can I do this?
Can I use different collector ports to map to different source files?

Thanks.
Andy

Re: how can I know which file source from on collector sink?

Posted by Hari Shreedharan <hs...@cloudera.com>.
We recently committed Thrift RPC support. If you are willing to try out
some new code, you can check out trunk and try it.

Hari

On Sunday, February 17, 2013, 周梦想 wrote:

> Thank you, smth.
> But there is no Windows version of flume-ng, and we have to gather logs from
> Windows servers.
>
> :)
> Andy

Re: how can I know which file source from on collector sink?

Posted by 周梦想 <ab...@gmail.com>.
Thank you, smth.
But there is no Windows version of flume-ng, and we have to gather logs from
Windows servers.

:)
Andy

2013/2/16 hoo.smth <ho...@gmail.com>

> You could use flume-ng; then you can set this information in an event header.

Re: how can I know which file source from on collector sink?

Posted by "hoo.smth" <ho...@gmail.com>.
You could use flume-ng; then you can set this information in an event header.
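
For readers following this suggestion, here is a minimal sketch of how it might look in a flume-ng properties file. It assumes a Flume 1.x release that ships the static, host and timestamp interceptors. The agent and component names (a1, r1, c1, k1), the exec source command, and the header name "sourcefile" are placeholders chosen for illustration and do not come from the thread. Only the HDFS URL is reused from the original collector config.

```properties
# Hypothetical agent "a1" that reads one log file; all names are placeholders.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Placeholder source: swap in whatever source fits the platform.
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /path/to/game.log
a1.sources.r1.channels = c1

# Interceptors add headers to every event from this source.
a1.sources.r1.interceptors = i1 i2 i3

# static: tag each event with a fixed key/value naming the source file.
a1.sources.r1.interceptors.i1.type = static
a1.sources.r1.interceptors.i1.key = sourcefile
a1.sources.r1.interceptors.i1.value = game.log

# host: record the agent's hostname in the "host" header.
a1.sources.r1.interceptors.i2.type = host
a1.sources.r1.interceptors.i2.useIP = false

# timestamp: needed so the sink can resolve %y, %m and %d.
a1.sources.r1.interceptors.i3.type = timestamp

a1.channels.c1.type = memory

# The HDFS sink expands %{header} escapes in hdfs.path, so each
# host/source file pair ends up in its own directory.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://hadoop48:54310/user/flume/%y%m/%d/%{host}/%{sourcefile}
```

A second source for user.log would presumably repeat the same block with a different static value, which gives the per-file separation asked for in the original question.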
