Posted to user@flume.apache.org by Sandeep Baldawa <sa...@gmail.com> on 2013/04/07 20:45:25 UTC

Best way for a newbie to learn flume

Hello,

I am a new Flume user and was wondering what the best way is to learn
about Flume from a user's perspective.

I am mostly interested in answers to the following questions:
- What problem is Flume trying to solve?
- What is the simplest way to install Flume in order to understand the concepts?
- I am trying to follow http://flume.apache.org/FlumeUserGuide.html# and am
not sure whether it has a quick start guide. Can someone point me to
the correct link, if possible?

Thanks,
Sandeep

Re: Best way for a newbie to learn flume

Posted by Sandeep Baldawa <sa...@gmail.com>.
Thanks again for all the details. I will follow the steps you described.



Re: Best way for a newbie to learn flume

Posted by Israel Ekpo <is...@aicer.org>.
Also, the link below is for Flume OG, not Flume NG:

http://archive.cloudera.com/cdh/3/flume/UserGuide/

The architecture and features have changed significantly since that version.



Re: Best way for a newbie to learn flume

Posted by Israel Ekpo <is...@aicer.org>.
Sandeep,

So Flume currently has two tracks:

Flume OG (not actively supported)
https://cwiki.apache.org/confluence/display/FLUME/Flume+OG+%28pre+1.0%29
Flume NG (currently active)
https://cwiki.apache.org/confluence/display/FLUME/Flume+NG

The latest stable version of Flume NG is 1.3.1.

NG stands for Next Generation; it is the current active development
track.

OG refers to the Original Generation of Flume, which includes all releases
before 1.0.0.

Newcomers and existing users of the OG track are encouraged to migrate
over to the NG track.

You can download Flume NG 1.3.1 here:

http://flume.apache.org/download.html

Regarding "Getting Started": in the next couple of weeks, additional
information will be added to the wiki to make the onboarding process easier
for newcomers.

In the meantime, please bear with us.

I would recommend that you download and install the latest version of Java 1.6.

Then download Flume and extract it to a folder in your directory.

Then you can use the following sources, channels and sinks to get started.

This is the best way for you to learn and understand the various pieces.

SOURCE: Spooling Directory Source
CHANNEL: File Channel (more reliable) or Memory Channel (faster)
SINK: File Roll Sink

You can create a directory that you will be spooling and drop a couple of
log files in there. Make sure the files are newline-delimited.

Each line in the log files will represent one event.

Then configure the file channel and the file roll sink using the guidelines and
examples available in the user guide.

http://flume.apache.org/FlumeUserGuide.html

That will give you a feel for how Flume works.

Once you have that set up, you can run the agent and see what happens.

Once you start getting the hang of things you can try other sources and
sinks or maybe even create a few of your own custom sources, channels or
sinks.
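
As a rough sketch of the setup described above, an agent configuration could
look something like the following. The agent and component names (agent1,
src1, ch1, sink1), the file name spool-example.conf, and all of the /tmp/flume
paths are placeholders chosen for illustration, so adjust them to your
machine. The property names follow the Spooling Directory Source, File
Channel and File Roll Sink sections of the user guide, which remains the
authoritative reference if anything here differs from your version.

```properties
# spool-example.conf (hypothetical file name)
# One agent with one source, one channel and one sink.
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# SOURCE: Spooling Directory Source.
# Reads newline-delimited files dropped into spoolDir; each line becomes one event.
agent1.sources.src1.type = spooldir
agent1.sources.src1.spoolDir = /tmp/flume/spool
agent1.sources.src1.channels = ch1

# CHANNEL: File Channel (events are persisted to disk).
# To try the faster Memory Channel instead, replace the three lines below
# with: agent1.channels.ch1.type = memory
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /tmp/flume/checkpoint
agent1.channels.ch1.dataDirs = /tmp/flume/data

# SINK: File Roll Sink.
# Writes events to files in sink.directory, starting a new file every
# sink.rollInterval seconds.
agent1.sinks.sink1.type = file_roll
agent1.sinks.sink1.sink.directory = /tmp/flume/out
agent1.sinks.sink1.sink.rollInterval = 30
agent1.sinks.sink1.channel = ch1

# To start the agent from the extracted Flume folder (command form taken
# from the user guide; the --name value must match the prefix used above):
#   bin/flume-ng agent --conf conf --conf-file spool-example.conf \
#       --name agent1 -Dflume.root.logger=INFO,console
```

It is probably safest to create the spool and output directories before
starting the agent, and to copy only finished log files into the spool
directory, since the source expects files not to change once they are there.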




Re: Best way for a newbie to learn flume

Posted by Sandeep Baldawa <sa...@gmail.com>.
Thanks for the detailed reply.

Awesome questions. I should have included these details in my question: I am
learning Flume as a hobby and a learning experience, as a tech
enthusiast (I have heard pretty good things about Flume).

Thanks again for the instructions. Just one question about setting things
up: are the instructions at
http://archive.cloudera.com/cdh/3/flume/UserGuide/ relevant to the
latest build? I liked the documentation at this link,
which has a quick start guide too.



Re: Best way for a newbie to learn flume

Posted by Israel Ekpo <is...@aicer.org>.
Sandeep,

Excellent questions.

You asked, "What problem is Flume trying to solve?"

I think the more appropriate question is: what problem are you trying to
solve?

This will go a long way toward helping us understand which components of Flume
you may need and how you need to set them up.

Are you using Flume as part of your job or as a personal hobby? Are you using
Flume for a course at school or as part of an academic project?

Going back to your original question: in the simplest terms, and for most
use cases, Flume is a system designed for collecting and transporting large
amounts of data and events from one or more sources, and then either
aggregating the collected data in a centralized data store or propagating it
onward to subsequent sources.

You can use it for aggregating data from log files, network traffic, click
streams, Twitter and any other source that can generate events.

Spend more time reviewing the user guide and you will find a lot of
information and answers to prospective questions.

http://flume.apache.org/FlumeUserGuide.html

To install Flume, you will need to set up Java 1.6 and make sure that
it is available in your PATH. Then download the latest version of Flume
and decompress the tarball or zip file.

You will need to set up the configuration file(s) for the agents based on
the sources, channels and sinks you choose to use.

I would recommend that you go ahead and get started with setting it up and
let us know if you run into any issues.

If you can share your use case and the problem you are trying to solve,
someone can point you in the right direction.


