Posted to user@flume.apache.org by 鹰 <98...@qq.com> on 2015/05/14 05:06:39 UTC

Reply: set flume send logs to hdfs error

I send data with a Python script that uses a raw socket; the code looks like this:

from socket import *

HOST = '192.168.1.117'
PORT = 44444
ADDR = (HOST, PORT)

# NOTE: this opens a plain TCP connection and writes raw text to the
# Avro source port; an Avro source expects Avro RPC framing, not raw bytes.
tcpCliSock = socket(AF_INET, SOCK_STREAM)
tcpCliSock.connect(ADDR)
for x in range(3):
    print x, "xx"
    n = tcpCliSock.send("test datas from flume")
tcpCliSock.close()





------------------ Original Message ------------------
From: "Hari Shreedharan" <hs...@cloudera.com>
Sent: Thursday, May 14, 2015, 10:53 AM
To: "user@flume.apache.org" <us...@flume.apache.org>
Subject: Re: set flume send logs to hdfs error



How are you sending data to the Avro Source?


Thanks,
Hari



On Wed, May 13, 2015 at 7:38 PM, 鹰 <98...@qq.com> wrote:
hi all,
I want to set up Flume to send data to HDFS. My configuration file is like this:
tier1.sources=source1  
tier1.channels=channel1  
tier1.sinks=sink1  

tier1.sources.source1.type=avro  
tier1.sources.source1.bind=0.0.0.0  
tier1.sources.source1.port=44444  
tier1.sources.source1.channels=channel1  

tier1.channels.channel1.type=memory  
tier1.channels.channel1.capacity=10000  
tier1.channels.channel1.transactionCapacity=1000  
tier1.channels.channel1.keep-alive=30  

tier1.sinks.sink1.type=hdfs  
tier1.sinks.sink1.channel=channel1  
tier1.sinks.sink1.hdfs.path=hdfs://hadoop-home.com:9000/user/hadoop/ 
tier1.sinks.sink1.hdfs.fileType=DataStream  
tier1.sinks.sink1.hdfs.writeFormat=Text  
tier1.sinks.sink1.hdfs.rollInterval=0  
tier1.sinks.sink1.hdfs.rollSize=10240  
tier1.sinks.sink1.hdfs.rollCount=0  
tier1.sinks.sink1.hdfs.idleTimeout=60  

When I start Flume with this configuration file and send data to port 44444, I get this error:
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 154218761 items! Connection closed;
Can anybody help me? Thanks.

Re: set flume send logs to hdfs error

Posted by Johny Rufus <jr...@cloudera.com>.
Are you running your Hadoop cluster in Kerberos mode? If so, is your
Kerberos principal/keytab combination correct? You can try to log in to
the KDC of your Hadoop cluster independently using the specified
principal/keytab, to make sure the combination works, and then use the
same combination in your Flume config.
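
For reference, a minimal sketch of the two HDFS sink properties involved; the principal and keytab path below are placeholders, not values from this thread:

tier1.sinks.sink1.hdfs.kerberosPrincipal=flume/hadoop-home.com@EXAMPLE.COM
tier1.sinks.sink1.hdfs.kerberosKeytab=/etc/flume/conf/flume.keytab

To test that same combination outside of Flume, something like:

kinit -kt /etc/flume/conf/flume.keytab flume/hadoop-home.com@EXAMPLE.COM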

Thanks,
Rufus

Reply: set flume send logs to hdfs error

Posted by 鹰 <98...@qq.com>.
thanks Hari,
When I use the Avro client to send data to the Flume port, I get a Kerberos error that looks like this:
[ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:561)] Hadoop running in secure mode, but Flume config doesn't specify a principal to use for Kerberos auth.
[ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:273)] Failed to authenticate!
I then added hdfs.kerberosPrincipal to the configuration file, but it doesn't work. What can I do now?
Thanks for any help.

Re: set flume send logs to hdfs error

Posted by Hari Shreedharan <hs...@cloudera.com>.
You need to use Flume’s client API to send data to the Avro Source. Since you are sending from Python, use the Thrift source instead, and generate a Thrift client for Python from this IDL file: https://github.com/apache/flume/blob/trunk/flume-ng-sdk/src/main/thrift/flume.thrift

You can then use that client to send data to the Thrift source.
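
For illustration, a rough sketch of such a Python client. It assumes the bindings were generated with "thrift --gen py flume.thrift" into a package named flume (the actual module name depends on the IDL namespace), and that the Thrift source speaks the compact protocol over a framed transport; both assumptions may need adjusting for your Flume version. The host and port are the ones from this thread.

# Hypothetical sketch, not from this thread. Module names, transport,
# and protocol are assumptions; adjust them to your generated bindings.
from thrift.transport import TSocket, TTransport
from thrift.protocol import TCompactProtocol
from flume import ThriftSourceProtocol
from flume.ttypes import ThriftFlumeEvent

transport = TTransport.TFramedTransport(TSocket.TSocket('192.168.1.117', 44444))
protocol = TCompactProtocol.TCompactProtocol(transport)
client = ThriftSourceProtocol.Client(protocol)

transport.open()
for x in range(3):
    # Each Flume event is a map of string headers plus a binary body.
    event = ThriftFlumeEvent(headers={}, body='test datas from flume')
    status = client.append(event)  # returns Status.OK on success
    print x, status
transport.close()

On the agent side this also means switching the source type, i.e. tier1.sources.source1.type=thrift instead of avro.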

Thanks,
Hari Shreedharan