Posted to dev@flume.apache.org by Suhas Satish <su...@gmail.com> on 2013/09/12 04:55:20 UTC

flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Hi,
I have set up the following Flume topology, but nothing gets written to HDFS.
There are no errors either. Do you know what's going wrong?

I have a standalone configuration for just *tail exec source -> file
channel -> hdfs sink* working, but when I add Avro, it stops working.

*exec tail -f source -> file channel -> avro sink -> avro source -> file
channel -> hdfs sink*
flume-avro.conf is as follows:

agent.sources = reader
  agent.channels = fileChannel
  agent.sinks = avro-forward-sink


  # For each one of the sources, the type is defined
  agent.sources.reader.type = exec

  agent.sources.reader.command = tail -f /opt/mapr/logs/configure.log
  # stderr is simply discarded, unless logStdErr=true

  # If the process exits for any reason, the source also exits and
  # will produce no further data.
  agent.sources.reader.logStdErr = true

  agent.sources.reader.restart = true

  # The channel can be defined as follows.

  agent.sources.reader.channels = fileChannel

  # Each sink's type must be defined

  agent.sinks.avro-forward-sink.type = avro
  agent.sinks.avro-forward-sink.hostname = localhost
  agent.sinks.avro-forward-sink.port = 41414

  #Specify the channel the sink should use
  agent.sinks.avro-forward-sink.channel = fileChannel

  # Each channel's type is defined.
  agent.channels.fileChannel.type = FILE

  # Other config values specific to each type of channel(sink or source)

  # can be defined as well
  agent.channels.fileChannel.type = FILE
  agent.channels.fileChannel.transactionCapacity = 1000000
  agent.channels.fileChannel.checkpointInterval = 30000
  agent.channels.fileChannel.maxFileSize = 2146435071

  agent.channels.fileChannel.capacity = 10000000

############################################################
  agent.sources = avro-collection-source
  agent.channels = channel1

  agent.sinks = hdfs-sink

  # For each one of the sources, the type is defined

  agent.sources.avro-collection-source.type = avro
  agent.sources.avro-collection-source.bind = 0.0.0.0
  agent.sources.avro-collection-source.port = 41414

  # The channel can be defined as follows.
  agent.sources.avro-collection-source.channels = channel1
  agent.sinks.hdfs-sink.channel = channel1

  # Each sink's type must be defined
  agent.sinks.hdfs-sink.type = hdfs
  agent.sinks.hdfs-sink.kerberosPrincipal = flume/qa-node133.qa.lab@QA.LAB
  agent.sinks.hdfs-sink.kerberosKeytab = /opt/mapr/conf/flume.keytab
  agent.sinks.hdfs-sink.path = /user/root/flume/log_test7/
  agent.sinks.hdfs-sink.filePrefix = LogCreateTest
  agent.sinks.hdfs-sink.rollInterval = 6

  agent.sinks.hdfs-sink.rollSize = 0
  agent.sinks.hdfs-sink.rollCount = 10000
  agent.sinks.hdfs-sink.batchSize = 10000
  agent.sinks.hdfs-sink.txnEventMax = 40000
  agent.sinks.hdfs-sink.fileType = DataStream

  agent.sinks.hdfs-sink.maxOpenFiles=50
  agent.sinks.hdfs-sink.appendTimeout = 10000
  agent.sinks.hdfs-sink.callTimeout = 10000
  agent.sinks.hdfs-sink.threadsPoolSize=100
  agent.sinks.hdfs-sink.rollTimerPoolSize = 1

  #Specify the channel the source and sink should use
  agent.sources.avro-collection-source.channels = channel1
  agent.sinks.hdfs-sink.channel = channel1

  agent.channels.channel1.type = FILE
  agent.channels.channel1.transactionCapacity = 1000000
  agent.channels.channel1.checkpointInterval = 30000

  agent.channels.channel1.maxFileSize = 2146435071
  agent.channels.channel1.capacity = 10000000



Thanks,
Suhas.

RE: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by "Mahadevappa, Shobha" <Sh...@nttdata.com>.
I did not notice this yesterday.
I now realize that you are also subscribed to the Flume mailing list.


Regards,
Shobha M


______________________________________________________________________
Disclaimer:This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding


RE: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by "Babu, Prashanth" <Pr...@nttdata.com>.
Yes, it would be better to create two configurations, one each for the agent and the collector.

Run the following commands to start the two Flume instances in two different command shells:
../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n agent &

../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n collector &

Regards,
Prashanth.



RE: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Paul Chavez <pc...@verticalsearchworks.com>.
I think you can do this on one node, but you will need to run two instances of flume, each with a different agent name.

Paul


Re: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Suhas Satish <su...@gmail.com>.
Thanks. Yes I was trying to set it up on a single node. If it cannot be
done, I can go to 2 different nodes, but that would add additional
complexities which I'd like to avoid if possible.

My original intent was to test if the Avro Source/Avro Sink interface can
work with SSL enabled (hence the extra hop) and if it can, can it use the
ssl_keystore and ssl_truststore already available from a secure hadoop
cluster.


Cheers,
Suhas.

RE: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Paul Chavez <pc...@verticalsearchworks.com>.
Are you trying to set this all up on a single node? If so, why are you adding in an extra Avro hop?

In practice this setup should be on two nodes, one acting as the 'agent' with the exec source and avro sink, and the other with an avro source and hdfs sink.

You can use one configuration file, but make sure each node configuration has a different name. Then on each node invoke flume with the proper name.

Hope that helps,
Paul Chavez
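
A minimal sketch of the single-file layout described above, assuming the two agents are named agent and collector as in the launch commands from this thread. The component names and the HDFS path are taken from the original post. The checkpointDir and dataDirs paths are hypothetical; they are included only because two file channels on one host would otherwise share the same default directories under ~/.flume/file-channel. Note also that the Flume User Guide documents the HDFS sink's settings with an hdfs. prefix (for example hdfs.path), which differs from the spelling in the original post and may be worth checking against the configuration error reported in the log.

```properties
# Agent "agent": exec source -> file channel -> avro sink
agent.sources = reader
agent.channels = fileChannel
agent.sinks = avro-forward-sink

agent.sources.reader.type = exec
agent.sources.reader.command = tail -f /opt/mapr/logs/configure.log
agent.sources.reader.channels = fileChannel

agent.channels.fileChannel.type = file
# Hypothetical paths: give each file channel its own directories
agent.channels.fileChannel.checkpointDir = /tmp/flume/agent/checkpoint
agent.channels.fileChannel.dataDirs = /tmp/flume/agent/data

agent.sinks.avro-forward-sink.type = avro
agent.sinks.avro-forward-sink.hostname = localhost
agent.sinks.avro-forward-sink.port = 41414
agent.sinks.avro-forward-sink.channel = fileChannel

# Agent "collector": avro source -> file channel -> hdfs sink
# Every key uses the "collector." prefix so it cannot overwrite "agent." keys
collector.sources = avro-collection-source
collector.channels = channel1
collector.sinks = hdfs-sink

collector.sources.avro-collection-source.type = avro
collector.sources.avro-collection-source.bind = 0.0.0.0
collector.sources.avro-collection-source.port = 41414
collector.sources.avro-collection-source.channels = channel1

collector.channels.channel1.type = file
# Hypothetical paths, distinct from the ones used by "agent"
collector.channels.channel1.checkpointDir = /tmp/flume/collector/checkpoint
collector.channels.channel1.dataDirs = /tmp/flume/collector/data

collector.sinks.hdfs-sink.type = hdfs
# "hdfs." prefix as documented in the Flume User Guide
collector.sinks.hdfs-sink.hdfs.path = /user/root/flume/log_test7/
collector.sinks.hdfs-sink.channel = channel1
```

Each instance would then be started with the same file and its own name: -n agent in one shell and -n collector in the other.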


Re: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Suhas Satish <su...@gmail.com>.
Yes, I tried splitting it into agent and collector (with different names), but
both were in a single configuration file, flume-avro.conf.

Does that mean I need two separate configuration files, one for the
agent and one for the collector?

If so, is my Flume launch command still the same? Is it

../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n agent &
or is it
../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n collector &
or is it
../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n agent, collector &

?







Cheers,
Suhas.

RE: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by "Babu, Prashanth" <Pr...@nttdata.com>.
Did you try splitting this configuration into agent and collector?
Also, change the agent name from "agent" to something else in the collector configuration.

Regards,
Prashanth.




Re: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Suhas Satish <su...@gmail.com>.
This is my launch command:
../bin/flume-ng agent --conf ./  -f flume-avro.conf -Dflume.root.logger=DEBUG,LOGFILE -n agent &

Cheers,
Suhas.



Re: flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created

Posted by Suhas Satish <su...@gmail.com>.

I found this in the log file:
11 Sep 2013 20:06:15,723 ERROR [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadSinks:432)  - Sink hdfs-sink has been removed due to an error during configuration

Can you point out what the error is?

Thanks,
Suhas.
