Posted to user@flume.apache.org by 鹰 <98...@qq.com> on 2015/05/14 05:06:39 UTC
Re: set flume send logs to hdfs error
I send data from a Python script over a plain socket; the code looks like this:

import sys
from socket import *

HOST = '192.168.1.117'
PORT = 44444
BUFSIZ = 1024
ADDR = (HOST, PORT)

tcpCliSock = socket(AF_INET, SOCK_STREAM)
tcpCliSock.connect(ADDR)
for x in range(3):
    print x, "xx"
    n = tcpCliSock.send("test datas from flume")
tcpCliSock.close()
------------------ Original Message ------------------
From: "Hari Shreedharan" <hs...@cloudera.com>
Sent: Thursday, May 14, 2015, 10:53 AM
To: "user@flume.apache.org" <us...@flume.apache.org>
Subject: Re: set flume send logs to hdfs error
How are you sending data to the Avro Source?
Thanks,
Hari
On Wed, May 13, 2015 at 7:38 PM, 鹰 <98...@qq.com> wrote:
Hi all,
I want to set up Flume to send data to HDFS. My config file looks like this:
tier1.sources=source1
tier1.channels=channel1
tier1.sinks=sink1
tier1.sources.source1.type=avro
tier1.sources.source1.bind=0.0.0.0
tier1.sources.source1.port=44444
tier1.sources.source1.channels=channel1
tier1.channels.channel1.type=memory
tier1.channels.channel1.capacity=10000
tier1.channels.channel1.transactionCapacity=1000
tier1.channels.channel1.keep-alive=30
tier1.sinks.sink1.type=hdfs
tier1.sinks.sink1.channel=channel1
tier1.sinks.sink1.hdfs.path=hdfs://hadoop-home.com:9000/user/hadoop/
tier1.sinks.sink1.hdfs.fileType=DataStream
tier1.sinks.sink1.hdfs.writeFormat=Text
tier1.sinks.sink1.hdfs.rollInterval=0
tier1.sinks.sink1.hdfs.rollSize=10240
tier1.sinks.sink1.hdfs.rollCount=0
tier1.sinks.sink1.hdfs.idleTimeout=60
When I start Flume with this config file and send data to port 44444, I get this error:
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 154218761 items! Connection closed;
Can anybody help me? Thanks.
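For what it's worth, this error is what you would expect when raw, unframed bytes hit an Avro RPC port: the server decodes arbitrary text as protocol fields such as a collection length. The toy snippet below is illustrative only (it is not the actual Avro wire format) and shows how ordinary text, misread as a 4-byte big-endian length, turns into an absurdly large "item count":

```python
import struct

# Pretend the first 4 bytes of the raw payload are a big-endian
# 32-bit length field, the kind of value an RPC decoder expects.
payload = b"test datas from flume"
(bogus_length,) = struct.unpack(">i", payload[:4])
print(bogus_length)  # -> 1952805748, a huge nonsensical allocation request
```

That is the same failure mode as the "Excessively large list allocation request" above: the Avro source is reading framing fields out of bytes that were never Avro-encoded.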
Re: set flume send logs to hdfs error
Posted by Johny Rufus <jr...@cloudera.com>.
Are you running your Hadoop cluster in Kerberos mode? If so, is your
Kerberos principal/keytab combination correct? You can try logging in to
the KDC of your Hadoop cluster independently with the specified
principal/keytab to confirm the combination can authenticate against the
KDC/cluster, and then use that same combination in your Flume config.
Thanks,
Rufus
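Concretely, the secure-mode setup adds two properties to the HDFS sink config; the principal and keytab path below are placeholders, not values from this thread. You can sanity-check the pair first with `kinit -kt <keytab> <principal>` followed by `klist`:

```
tier1.sinks.sink1.hdfs.kerberosPrincipal = flume/hadoop-home.com@EXAMPLE.COM
tier1.sinks.sink1.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume.keytab
```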
Re: set flume send logs to hdfs error
Posted by 鹰 <98...@qq.com>.
Thanks Hari,
When I use an Avro client to send data to the Flume port, I get a Kerberos error that looks like this:
[ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:561)] Hadoop running in secure mode, but Flume config doesn't specify a principal to use for Kerberos auth.
[ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:273)] Failed to authenticate!
I then added hdfs.kerberosPrincipal to the config file, but it doesn't work. What can I do now?
Thanks for any help.
Re: set flume send logs to hdfs error
Posted by Hari Shreedharan <hs...@cloudera.com>.
You need to use Flume’s client API to send data to the Avro Source. Alternatively, use the Thrift source and generate a Thrift client for Python from this IDL file: https://github.com/apache/flume/blob/trunk/flume-ng-sdk/src/main/thrift/flume.thrift
You can then use that client to send data to the Thrift source.
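As a rough sketch of what such a Python client could look like, assuming bindings generated with `thrift --gen py flume.thrift` (the `flume` module, `ThriftSourceProtocol`, and `ThriftFlumeEvent` names, and the framed transport with the compact protocol, are assumptions about the generated code and the source's defaults, not verified here):

```python
# Sketch only: the flume.* modules come from `thrift --gen py flume.thrift`;
# the names and the transport/protocol pairing are assumptions, not verified.
from thrift.transport import TSocket, TTransport
from thrift.protocol import TCompactProtocol
from flume import ThriftSourceProtocol
from flume.ttypes import ThriftFlumeEvent

transport = TTransport.TFramedTransport(TSocket.TSocket('192.168.1.117', 44444))
protocol = TCompactProtocol.TCompactProtocol(transport)
client = ThriftSourceProtocol.Client(protocol)

transport.open()
client.append(ThriftFlumeEvent(headers={}, body='test datas from flume'))
transport.close()
```

The corresponding Flume source config would set `tier1.sources.source1.type=thrift` instead of `avro`.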
Thanks,
Hari Shreedharan