Posted to user@chukwa.apache.org by Mohammad Tariq <do...@gmail.com> on 2011/11/17 11:47:28 UTC

How to extract only the desired information using Chukwa

Is it possible to extract only the actual content of a file, without
any other information, using Chukwa?

Regards,
    Mohammad Tariq

Re: How to extract only the desired information using Chukwa

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Bill,
  Thank you so much for your valuable reply. I'll proceed as per your
directions and let you know how it goes.

Regards,
    Mohammad Tariq




Re: How to extract only the desired information using Chukwa

Posted by Bill Graham <bi...@gmail.com>.
After the demux process, the data is stored in Hadoop as a sequence
file. One easy way to read it is to use Pig via the ChukwaLoader:

http://svn.apache.org/viewvc/incubator/chukwa/trunk/contrib/chukwa-pig/src/java/org/apache/hadoop/chukwa/pig/ChukwaLoader.java?view=markup

Note that ChukwaLoader reads the data with a SequenceFileRecordReader
typed as shown here, so if you don't want to use Pig, you could do
something similar:

    SequenceFileRecordReader<ChukwaRecordKey, ChukwaRecord>

The ChukwaRecord contains a handful of fields created by the Processor that
you've configured to collect your data. If you're using the TSProcessor, the
payload is in a field called 'body', IIRC.

There's also a command-line Java tool that dumps the contents of a sequence
file to stdout, which can be handy. I forget what it's called, but it
should be in the docs.
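
To make the 'body' idea above concrete, here is a minimal Java sketch. It
deliberately does not use the real Chukwa or Hadoop classes: each record is
modelled as a plain map of field names to values, and the class name
BodyExtractor, the method extractBodies, and the sample field values are all
hypothetical. The 'body' field name itself is only the recollection above,
so treat it as an assumption and check it against your Processor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: records are plain maps, not real ChukwaRecord objects.
class BodyExtractor {
    // Assumed name of the payload field (recalled for TSProcessor, not verified).
    static final String BODY_FIELD = "body";

    // Returns only the payload of each record, skipping records that have none.
    static List<String> extractBodies(List<Map<String, String>> records) {
        List<String> bodies = new ArrayList<>();
        for (Map<String, String> record : records) {
            String body = record.get(BODY_FIELD);
            if (body != null) {
                bodies.add(body);
            }
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Illustrative records: one metadata field plus the payload.
        Map<String, String> first = new LinkedHashMap<>();
        first.put("capp", "myApp");
        first.put(BODY_FIELD, "first log line");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("capp", "myApp");
        second.put(BODY_FIELD, "second log line");

        // Print the payloads only, dropping every other field.
        for (String body : extractBodies(Arrays.asList(first, second))) {
            System.out.println(body);
        }
    }
}
```

With the real data, the loop would instead run over the key/value pairs read
from the sequence file, and the map lookup would become whatever accessor
ChukwaRecord provides for reading a field by name.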


Re: How to extract only the desired information using Chukwa

Posted by Mohammad Tariq <do...@gmail.com>.
Oh, in that case I'll have to wait for their reply and keep trying
until then. Thanks for the reply, Ahmed.

Regards,
    Mohammad Tariq




Re: How to extract only the desired information using Chukwa

Posted by Ahmed Fathalla <af...@gmail.com>.
Hmm... maybe in the demux part of the system (I think it uses Pig
scripts somewhere). I'm not an expert in this; maybe Ari, Bill, or Eric can
help.




-- 
Ahmed Fathalla