Posted to dev@flume.apache.org by "lizhenmxcz@163.com" <li...@163.com> on 2015/10/22 04:27:12 UTC

flume hive sink not work

Hi all:
I use Flume to import data from syslog to Hive, but I encounter the following errors.

2015-10-22 10:05:05,115 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:324)] k2 : Failed connecting to EndPoint {metaStoreUri='thrift://bigdata1:9083', database='dnsdb', table='dns_request', partitionVals=[] }
org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://bigdata1:9083', database='dnsdb', table='dns_request', partitionVals=[] }
        at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
        at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:344)
        at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:296)
        at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:254)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://bigdata1:9083', database='dnsdb', table='dns_request', partitionVals=[] }
        at org.apache.flume.sink.hive.HiveWriter.newConnection(HiveWriter.java:380)
        at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:86)
        ... 6 more
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask.get(FutureTask.java:201)
        at org.apache.flume.sink.hive.HiveWriter.timedCall(HiveWriter.java:431)
        at org.apache.flume.sink.hive.HiveWriter.newConnection(HiveWriter.java:373)
        ... 7 more


my configuration is:


a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = syslogudp
a1.sources.r1.port = 514
a1.sources.r1.host = 192.168.55.246

a1.sources.r1.channels = c1 c2
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
a1.sources.r1.interceptors.i1.regex = Dns(.*)\\[
a1.sources.r1.interceptors.i1.serializers = t1
a1.sources.r1.interceptors.i1.serializers.t1.name = type

a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.Request = c1
a1.sources.r1.selector.mapping.Answer = c2
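For illustration, the interceptor and selector stanzas above route each event by the text captured between "Dns" and "[" in the message body; a minimal Python sketch of that extraction (the sample message bodies below are hypothetical):

```python
import re

# The regex in the properties file is Dns(.*)\\[ , i.e. Dns(.*)\[ after
# properties-file unescaping; this raw string mirrors it.
pattern = re.compile(r"Dns(.*)\[")

def extract_type(body):
    # Mimics regex_extractor serializer t1: capture group 1 is stored in
    # the event's "type" header, which the multiplexing selector then
    # maps to a channel (Request -> c1, Answer -> c2).
    m = pattern.search(body)
    return m.group(1) if m else None

print(extract_type("DnsRequest[1445480705 ...]"))  # -> Request, routed to c1
print(extract_type("DnsAnswer[1445480705 ...]"))   # -> Answer, routed to c2
```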

a1.sinks.k2.type = hive
a1.sinks.k2.channel = c1
a1.sinks.k2.hive.metastore = thrift://bigdata1:9083
a1.sinks.k2.hive.database = dnsdb
a1.sinks.k2.hive.table = dns_request
a1.sinks.k2.hive.partiton = %Y,%m,%d,%H
a1.sinks.k2.hive.txnsPerBatchAsk = 2
a1.sinks.k2.batchSize = 10
a1.sinks.k2.serializer = delimited
a1.sinks.k2.serializer.delimiter = ,
a1.sinks.k2.serializer.fieldnames = timepoint,random,sip,dip,spt,type,name

a1.sinks.k1.type = hive
a1.sinks.k1.channel = c2
a1.sinks.k1.hive.metastore = thrift://bigdata1:9083
a1.sinks.k1.hive.database = Dnsdb
a1.sinks.k1.hive.table = dns_answer
a1.sinks.k1.hive.partiton = %Y,%m,%d,%H
a1.sinks.k1.hive.txnsPerBatchAsk = 2
a1.sinks.k1.batchSize = 10
a1.sinks.k1.serializer = delimited
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = timepoint,random,sip,dip,dpt,name,nosuchname,typemax,typecname,typeaddr,authservername,additionalrecords

Help me please, thanks.



lizhenmxcz@163.com

Re: Re: flume hive sink not work

Posted by iain wright <ia...@gmail.com>.
What is the result of running ping bigdata1?

It seems the hostname is not resolving, based on your telnet test.

Try telnetting to the IP of bigdata1 on the same port.

If that works, you can try changing bigdata1 to the IP of the machine in
your flume config (is it the same host Flume is running on? If so, use
127.0.0.1),

or set an A record for bigdata1 in your DNS server, or add an
entry in /etc/hosts, etc.
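The steps above can be sketched as a single probe; a minimal Python check, run on the Flume host, that separates "name does not resolve" from "port unreachable" (the host and port are taken from the posted config):

```python
import socket

def check_endpoint(host, port, timeout=3.0):
    """Return (resolves, reachable) for a host:port pair."""
    try:
        ip = socket.gethostbyname(host)        # DNS / /etc/hosts lookup
    except socket.gaierror:
        return (False, False)                  # the failure telnet reported
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return (True, True)                # metastore port is reachable
    except OSError:
        return (True, False)                   # name resolves, port does not answer

# Values from the failing sink config:
print(check_endpoint("bigdata1", 9083))
```

If this prints (False, False), fix name resolution first (DNS A record or /etc/hosts entry); if it prints (True, False), check the metastore process and any firewall between the hosts.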

Re: Re: flume hive sink not work

Posted by "lizhenmxcz@163.com" <li...@163.com>.
Telnet says "Name or service not known", but the port is listening:
tcp        0      0 0.0.0.0:9083                0.0.0.0:*                   LISTEN      32119/java


lizhenmxcz@163.com
 
From: iain wright
Date: 2015-10-22 14:13
To: user
Subject: Re: flume hive sink not work
Are you able to telnet to bigdata1 on port 9083?

Re: flume hive sink not work

Posted by iain wright <ia...@gmail.com>.
Are you able to telnet to bigdata1 on port 9083?