You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flume.apache.org by Mohammad Tariq <do...@gmail.com> on 2012/06/15 01:47:02 UTC

Hbase-sink behavior

Hello list,

       I am trying to use hbase-sink to collect data from a local file
and dump it into an Hbase table..But there are a few things I am not
able to understand and need some guidance.

This is the content of my conf file :

hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1
hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
hbase-agent.sources.tail.channels = ch1
hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = test3
hbase-agent.sinks.sink1.columnFamily = testing
hbase-agent.sinks.sink1.column = foo
hbase-agent.sinks.sink1.serializer =
org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1
hbase-agent.sinks.sink1.serializer.incrementColumn = col1
hbase-agent.sinks.sink1.serializer.keyType = timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix = 1
hbase-agent.sinks.sink1.serializer.suffix = timestamp
hbase-agent.channels.ch1.type=memory

Right now I am taking just some simple text from a file which has
following content -

value1
value2
value3
value4
value5
value6

And my Hbase table looks like -

hbase(main):217:0> scan 'test3'
ROW                                    COLUMN+CELL
 11339716704561                        column=testing:col1,
timestamp=1339716707569, value=value1
 11339716704562                        column=testing:col1,
timestamp=1339716707571, value=value4
 11339716846594                        column=testing:col1,
timestamp=1339716849608, value=value2
 11339716846595                        column=testing:col1,
timestamp=1339716849610, value=value1
 11339716846596                        column=testing:col1,
timestamp=1339716849611, value=value6
 11339716846597                        column=testing:col1,
timestamp=1339716849614, value=value6
 11339716846598                        column=testing:col1,
timestamp=1339716849615, value=value5
 11339716846599                        column=testing:col1,
timestamp=1339716849615, value=value6
 incRow                                column=testing:col1,
timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
9 row(s) in 0.0580 seconds

Now I have following questions -

1- Why the timestamp value is different from the row key?(I was trying
to make "1+timestamp" as the rowkey)
2- Although I am not using "incRow", it stills appear in the table
with some value. Why so and what is this value??
3- How can avoid the last row??

I am still in the learning phase so please pardon my ignorance..Many thanks.

Regards,
    Mohammad Tariq

Re: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Will,

Absolutely, you're right. I'll try the details provided by Mr. Tariq in another thread tomorrow.

Thank you so much for your time and guidance.

Regards,
Ashutosh Sharma

On Jun 21, 2012, at 7:45 PM, "Will McQueen" <wi...@cloudera.com>> wrote:

Hi Sharma,

So I assume that your command looks something like this:
     flume-ng agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf -c /etc/flume-ng/conf

...?

Hari, I saw your comment:
>>I am not sure if HBase changed their wire protocol between these versions.
Do you have any other advice about troubleshooting a possible hbase protocol mismatch issue?

Cheers,
Will


On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀) <sh...@kt.com>> wrote:
Hi Will,

I installed flume as part of CDH3u4 version 1.1 using yum install flume-ng. One more point, I am using flume-ng hbase sink downloaded from: https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar

Now, I ran the agent with –conf parameter with updated log4j.properties. I don’t see any error in the log. Please see the below from the log file:

2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting - hbase-agent
2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 9
2012-06-21 18:25:08,146 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager: Node manager started
2012-06-21 18:25:08,148 DEBUG properties.PropertiesFileConfigurationProvider: Configuration provider started
2012-06-21 18:25:08,149 DEBUG properties.PropertiesFileConfigurationProvider: Checking file:/home/hadoop/flumeng/hbaseagent.conf for changes
2012-06-21 18:25:08,149 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context for sink1: serializer.rowPrefix
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting validation of configuration for agent: hbase-agent, initial-configuration: AgentConfiguration[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer.keyType=timestamp, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1, batchSize=1, columnFamily=cf1, table=test, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}

2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel ch1
2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink: sink1 using OTHER
2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation configuration for hbase-agent
AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer.keyType=timestamp, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1, batchSize=1, columnFamily=cf1, table=test, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}
2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1

2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1

2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail

2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]
2012-06-21 18:25:08,171 INFO properties.PropertiesFileConfigurationProvider: Creating channels
2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
2012-06-21 18:25:08,175 INFO properties.PropertiesFileConfigurationProvider: created channel ch1
2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating instance of source tail, type exec
2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type org.apache.flume.sink.hbase.HBaseSink is a custom type
2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1fd0fafc }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt
2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started

Output of the which Flume-ng is:
/usr/bin/flume-ng


----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Cell: 010-7300-0150
Email: sharma.ashutosh@kt.com<ht...@kt.com>
----------------------------------------

From: Will McQueen [mailto:will@cloudera.com<ma...@cloudera.com>]
Sent: Thursday, June 21, 2012 6:07 PM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior

Hi Sharma,


Could you please describe how you installed flume? Also, I see you're getting this warning:

>> Warning: No configuration directory set! Use --conf <dir> to override.



The log4j.properties that flume provides is stored in the conf dir. If you specify the flume conf dir, flume can pick it up. So for troubleshooting you can try:

1) modifying the log4j.properties within flume's conf dir so that the top reads:
#flume.root.logger=DEBUG,console
flume.root.logger=DEBUG,LOGFILE
flume.log.dir=.
flume.log.file=flume.log

2) Run the flume agent while specifying the flume conf dir (--conf <dir>)

3) What's the output of 'which flume-ng'?

Cheers,
Will
On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀) <sh...@kt.com>> wrote:
Hi Hari,

I checked, agent is successfully tailing the file which I mentioned. Yes, you are right, agent has started properly without any error. Because there is no further movement, so it’s hard for me to identify the issue. I also used tail –F also, but no success.
Can you suggest me some technique to troubleshoot it, so I could identify the issue and resolve the same. Does flume record some log anywhere?

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Cell: 010-7300-0150
Email: sharma.ashutosh@kt.com<ht...@kt.com>
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<ma...@cloudera.com>]
Sent: Thursday, June 21, 2012 5:25 PM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior

I am not sure if HBase changed their wire protocol between these versions. Looks like your agent has started properly. Are you sure data is being written into the file being tailed? I suggest using tail -F. The log being stuck here is ok, that is probably because nothing specific is required(or your log file rotated).

Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi Hari,



Thanks for your prompt reply. I already created the table in Hbase with a column family and hadoop/hbase library is available to hadoop. I noticed that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?

Please see the below lines captured while running the flume agent:



>>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

Warning: No configuration directory set! Use --conf <dir> to override.

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath

+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1

12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 10

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Creating channels

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: created channel ch1

12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1ed0af9b }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }

12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt



Screen stuck here….no movement.



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:01 PM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi Ashutosh,



The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).





Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi,



I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.



Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members....



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



-----Original Message-----

From: Mohammad Tariq [mailto:dontariq@gmail.com]

Sent: Friday, June 15, 2012 9:02 AM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior



Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.



Regards,

Mohammad Tariq





On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com>> wrote:

Hi Mohammad,



My answers are inline.



--

Hari Shreedharan



On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:



Hello list,



I am trying to use hbase-sink to collect data from a local file and

dump it into an Hbase table..But there are a few things I am not able

to understand and need some guidance.



This is the content of my conf file :



hbase-agent.sources = tail

hbase-agent..sinks = sink1

hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =

org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = test3

hbase-agent.sinks.sink1.columnFamily = testing

hbase-agent.sinks.sink1.column = foo

hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.incrementColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory



Right now I am taking just some simple text from a file which has

following content -



value1

value2

value3

value4

value5

value6



And my Hbase table looks like -



hbase(main):217:0> scan 'test3'

ROW COLUMN+CELL

11339716704561 column=testing:col1,

timestamp=1339716707569, value=value1

11339716704562 column=testing:col1,

timestamp=1339716707571, value=value4

11339716846594 column=testing:col1,

timestamp=1339716849608, value=value2

11339716846595 column=testing:col1,

timestamp=1339716849610, value=value1

11339716846596 column=testing:col1,

timestamp=1339716849611, value=value6

11339716846597 column=testing:col1,

timestamp=1339716849614, value=value6

11339716846598 column=testing:col1,

timestamp=1339716849615, value=value5

11339716846599 column=testing:col1,

timestamp=1339716849615, value=value6

incRow column=testing:col1,

timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C

9 row(s) in 0.0580 seconds



Now I have following questions -



1- Why the timestamp value is different from the row key?(I was trying

to make "1+timestamp" as the rowkey)



The value shown by hbase shell as timestamp is the time at which the

value was inserted into Hbase, while the value inserted by Flume is

the timestamp at which the sink read the event from the channel.

Depending on how long the network and HBase takes, these timestamps

can vary. If you want 1+timestamp as row key then you should configure it:



hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is

appended as-is to the suffix you choose.



2- Although I am not using "incRow", it stills appear in the table

with some value. Why so and what is this value??



The SimpleHBaseEventSerializer is only an example class. For custom

use cases you can write your own serializer by implementing

HbaseEventSerializer. In this case, you have specified

incrementColumn, which causes an increment on the column specified.

Simply don't specify that config and that row will not appear.



3- How can avoid the last row??



See above.





I am still in the learning phase so please pardon my ignorance..Many thanks.



No problem. Much of this is documented

here:

https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>







Regards,

Mohammad Tariq





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.




이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Rahul Patodi <pa...@gmail.com>.
It might be helpful for newbies:
http://www.technology-mania.com/2012/06/deploy-apache-flume-ng-1xx.html


On Fri, Jun 22, 2012 at 11:08 AM, ashutosh(오픈플랫폼개발팀) <sharma.ashutosh@kt.com
> wrote:

>  Hi Hari,
>
>
>
> My ***configuration* is :
>
>                                 hbase-agent.sources = tail
>
> hbase-agent.sinks = sink1
>
> hbase-agent.channels = ch1
>
>
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/hadoop/demo.txt
>
> hbase-agent.sources.tail.channels = ch1
>
>
>
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = demo
>
> hbase-agent.sinks.sink1.columnFamily = cf
>
> hbase-agent.sinks.sink1.column = foo
>
> hbase-agent.sinks.sink1.serializer =
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
>
>
> hbase-agent.channels.ch1.type = memory
>
>
>
> I also added the following ***parameter* in .bash_profile. All hadoop,
> hbase, and hbase-site.xml is accessible:
>
>                                 HADOOP_HOME
>
>                                 HBASE_HOME
>
>                                 CLASSPATH  /// As mentioned by Rahul.
>
>
>
> Here is ***log* which flume agent generates(it’s in debug mode):
>
> >>>>>>>>>>>$ flume-ng  agent -n hbase-agent --conf /etc/flume-ng/conf/ -f
> /etc/flume-ng/conf/hbase-agent  ///****Flume agent command *
>
> * *
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
> access
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
> classpath
>
> + exec /usr/lib/jvm/java-6-sun/bin/java -Xmx20m -cp
> '/etc/flume-ng/conf:/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/opt/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> org.apache.flume.node.Application -n hbase-agent -f
> /etc/flume-ng/conf/hbase-agent
>
> 2012-06-22 14:33:05,759 (main) [INFO -
> org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)]
> Starting lifecycle supervisor 1
>
> 2012-06-22 14:33:05,763 (main) [INFO -
> org.apache.flume.node.FlumeNode.start(FlumeNode.java:54)] Flume node
> starting - hbase-agent
>
> 2012-06-22 14:33:05,766 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:140)]
> Node manager starting
>
> 2012-06-22 14:33:05,768 (lifecycleSupervisor-1-0) [INFO -
> org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)]
> Starting lifecycle supervisor 10
>
> 2012-06-22 14:33:05,769 (lifecycleSupervisor-1-2) [INFO -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:67)]
> Configuration provider starting
>
> 2012-06-22 14:33:05,769 (lifecycleSupervisor-1-0) [DEBUG -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:144)]
> Node manager started
>
> 2012-06-22 14:33:05,770 (lifecycleSupervisor-1-2) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:87)]
> Configuration provider started
>
> 2012-06-22 14:33:05,770 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:189)]
> Checking file:/etc/flume-ng/conf/hbase-agent for changes
>
> 2012-06-22 14:33:05,771 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:196)]
> Reloading configuration file:/etc/flume-ng/conf/hbase-agent
>
> 2012-06-22 14:33:05,775 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,775 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:973)]
> Created context for sink1: serializer
>
> 2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:886)]
> Added sinks: sink1 Agent: hbase-agent
>
> 2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,779 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,779 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)]
> Processing:sink1
>
> 2012-06-22 14:33:05,779 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:284)]
> Starting validation of configuration for agent: hbase-agent,
> initial-configuration: AgentConfiguration[hbase-agent]
>
> SOURCES: {tail={ parameters:{command=tail -F /home/hadoop/demo.txt,
> channels=ch1, type=exec} }}
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1+,
> columnFamily=cf, table=demo, type=org.apache.flume.sink.hbase.HBaseSink,
> channel=ch1, serializer.suffix=timestamp} }}
>
>
>
> 2012-06-22 14:33:05,784 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:438)]
> Created channel ch1
>
> 2012-06-22 14:33:05,796 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:633)]
> Creating sink: sink1 using OTHER
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:342)]
> Post validation configuration for hbase-agent
>
> AgentConfiguration created without Configuration stubs for which only
> basic syntactical validation was performed[hbase-agent]
>
> SOURCES: {tail={ parameters:{command=tail -F /home/hadoop/demo.txt,
> channels=ch1, type=exec} }}
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1+,
> columnFamily=cf, table=demo, type=org.apache.flume.sink.hbase.HBaseSink,
> channel=ch1, serializer.suffix=timestamp} }}
>
>
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:128)]
> Channels:ch1
>
>
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:129)]
> Sinks sink1
>
>
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:130)]
> Sources tail
>
>
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:133)]
> Post-validation flume configuration contains configuration  for agents:
> [hbase-agent]
>
> 2012-06-22 14:33:05,802 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:246)]
> Creating channels
>
> 2012-06-22 14:33:05,803 (conf-file-poller-0) [DEBUG -
> org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:68)]
> Creating instance of channel ch1 type memory
>
> 2012-06-22 14:33:05,807 (conf-file-poller-0) [INFO -
> org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:262)]
> created channel ch1
>
> 2012-06-22 14:33:05,807 (conf-file-poller-0) [DEBUG -
> org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:74)]
> Creating instance of source tail, type exec
>
> 2012-06-22 14:33:05,813 (conf-file-poller-0) [INFO -
> org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)]
> Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
> 2012-06-22 14:33:05,813 (conf-file-poller-0) [DEBUG -
> org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:78)]
> Sink type org.apache.flume.sink.hbase.HBaseSink is a custom type
>
> 2012-06-22 14:33:05,943 (conf-file-poller-0) [INFO -
> org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:54)]
> Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> source:org.apache.flume.source.ExecSource@311671b2 }}
> sinkRunners:{sink1=SinkRunner: {
> policy:org.apache.flume.sink.DefaultSinkProcessor@3882764b counterGroup:{
> name:null counters:{} } }}
> channels:{ch1=org.apache.flume.channel.MemoryChannel@7d2452e8} }
>
> 2012-06-22 14:33:05,949 (lifecycleSupervisor-1-2) [INFO -
> org.apache.flume.source.ExecSource.start(ExecSource.java:145)] Exec source
> starting with command:tail -F /home/hadoop/demo.txt
>
> 2012-06-22 14:33:05,950 (lifecycleSupervisor-1-2) [DEBUG -
> org.apache.flume.source.ExecSource.start(ExecSource.java:163)] Exec source
> started
>
>
>
>
>
> Still, it’s not moving any data into hbase.
>
>
>
> *Rahul:* Could you please recall, what additional configuration or
> troubleshooting you did. As I aware that, yesterday we were in same
> situation.
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> *Sent:* Friday, June 22, 2012 12:39 PM
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi,
>
>
>
> Could you please make sure you hbase-site.xml is in the class path which
> flume is using. If the log you sent earlier was the only log you had, it
> means Hbase sink is unable to connect/write to Hbase. it definitely seems
> like Hbase client API is unable to connect. Please send your configuration
> too.
>
>
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 8:01 PM, ashutosh(오픈플랫폼개발팀) wrote:
>
>   Hi Folks,
>
>
>
> I tried every options, but didn’t get any success yet. I am still not able
> to store data into Hbase. It seems that Hbase agent is working fine without
> reporting any error/warning.  I think there is some issue between hbase
> sink and hbase database. Can you please help me to troubleshoot this
> problem to identify the issue between hbase sink and hbase database.
> However, I used the same configuration mentioned by Mr. Rahul in chain of
> mails earlier. But none of these configurations worked for me.
>
>
>
> Please…please…please help me.
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>]
>
> *Sent:* Friday, June 22, 2012 2:59 AM
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi,
>
>
>
> There are a couple of things you should not here:
>
>
>
> * If more than one event is read from the channel in the same millisecond,
> then these events will get written to HBase with the same row key, and one
> could potentially overwrite the older events, unless you have Hbase
> configured to support multiple versions.
>
> * Flume does not guarantee ordering or uniqueness, it guarantees at least
> once delivery. if a transaction fails, then Flume will try to write all
> events in the transaction again, and may cause duplicates. In case of Hbase
> the serializer is expected to make sure duplicates do not overwrite
> non-duplicate data, as mentioned in the Javadocs.
>
>
>
> Note that the SimpleHbaseEventSerializer is only an example, you should
> ideally write your own serializer and plug it in. This will ensure data is
> written in a way you expect.
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:
>
>   Hi Rahul,
>
>
>
> Actually that has nothing to do with Flume..Simply, out of
>
> excitement I used the same file more than once so all these values
>
> went as different versions into the Hbase table. And when you tail a
>
> file without modifying the behavior of the tail command it will take
>
> only last few records and not the entire content of the file. That
>
> could be a reason for the absence of value3..But there is no issue
>
> from Flume's side..It totally depends on tail's behavior .
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
>
> <pa...@gmail.com> wrote:
>
>  If you look at the output provided by you in the first mail of this mail
>
> thread:
>
> in your file (on local file system) you have value 1 to 6 (value1, value2,
>
> value3....)
>
> but when you scan in hbase output is value1, value4 , value2 , value1 ,
>
> value6 , value6 , value5 , value6
>
>
>
> value3 is not inserted
>
> value 6 is inserted 3 times
>
>
>
> did you figure out why so ?
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com>
> wrote:
>
>
>
> Both the commands seem similar to me.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
>
> <pa...@gmail.com> wrote:
>
>  Hi Mohammad,
>
> Thanks for your response
>
> I have put this configuration:
>
>
>
> hbase-agent.sources=tail
>
> hbase-agent.sinks=sink1
>
> hbase-agent.channels=ch1
>
>
>
> hbase-agent.sources.tail.type=exec
>
> hbase-agent.sources.tail.command=tail -F /tmp/test05
>
> hbase-agent.sources.tail.channels=ch1
>
>
>
> hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel=ch1
>
> hbase-agent.sinks.sink1.table=t002
>
> hbase-agent.sinks.sink1.columnFamily=cf
>
> hbase-agent.sinks.sink1.column=foo
>
>
>
>
> hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn=col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn=col1
>
> #hbase-agent.sinks.sink1.serializer.keyType=timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix=1+
>
> hbase-agent.sinks.sink1.serializer.suffix=timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
>
>
> Data is getting copy into HBase, but I have got another issue:
>
>
>
> My input data is simply:
>
>
>
> value1
>
> value2
>
> value3
>
> value4
>
> value5
>
> value6
>
> value7
>
> value8
>
> value9
>
>
>
> when I run this command in HBase:
>
> hbase(main):129:0> scan 't002', {VERSIONS => 3}
>
> ROW COLUMN+CELL
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758424,
>
> value=value5
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758423,
>
> value=value3
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758417,
>
> value=value1
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758427,
>
> value=value9
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758426,
>
> value=value8
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758425,
>
> value=value7
>
> incRow column=cf:col1,
>
> timestamp=1340279758443,
>
> value=\x00\x00\x00\x00\x00\x00\x00\x09
>
> 3 row(s) in 0.0420 seconds
>
>
>
> all the data is not getting copy ??
>
>
>
> When I run this command with version:
>
> hbase(main):130:0> scan 't002', {VERSIONS => 3}
>
> ROW COLUMN+CELL
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758424,
>
> value=value5
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758423,
>
> value=value3
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758417,
>
> value=value1
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758427,
>
> value=value9
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758426,
>
> value=value8
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758425,
>
> value=value7
>
> 1+1340279906637 column=cf:col1,
>
> timestamp=1340279909652,
>
> value=value1
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value6
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909658,
>
> value=value5
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909654,
>
> value=value3
>
> 1+1340279906646 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value7
>
> 1+1340279906647 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value9
>
> incRow column=cf:col1,
>
> timestamp=1340279909677,
>
> value=\x00\x00\x00\x00\x00\x00\x00\x12
>
> 7 row(s) in 0.0640 seconds
>
>
>
> Please help me understand this.
>
>
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>
>
> wrote:
>
>
>
> Hi Will,
>
>
>
> I got it.Thanks for the info.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>
>
> wrote:
>
>  Hi Mohammad,
>
>
>
> In your config file, I think you need to remove this line:
>
>
>
>  hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
>
>
> I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
>
> (although there is a keyType var that stores the value of the
>
> 'suffix'
>
> prop).
>
>
>
> Cheers,
>
> Will
>
>
>
>
>
> On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
>
> wrote:
>
>
>
> Hi Rahul,
>
>
>
> This normally happens when there is some problem in the
>
> configuration file.Create a file called hbase-agent inside your
>
> FLUME_HOME/conf directory and copy this content into it:
>
> hbase-agent.sources = tail
>
> hbase-agent.sinks = sink1
>
> hbase-agent.channels = ch1
>
>
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>
> hbase-agent.sources.tail.channels = ch1
>
>
>
> hbase-agent.sinks.sink1.type = org.apache.flume.sink..hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = demo
>
> hbase-agent.sinks.sink1.columnFamily = cf
>
>
>
> hbase-agent.sinks.sink1.serializer =
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
> Then start the agent and see if it works for you. It worked for me.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
>
> wrote:
>
>  Hi Sharma,
>
>
>
> So I assume that your command looks something like this:
>
> flume-ng agent -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
> -c /etc/flume-ng/conf
>
>
>
> ...?
>
>
>
> Hari, I saw your comment:
>
>
>
>   I am not sure if HBase changed their wire protocol between these
>
> versions.
>
>  Do you have any other advice about troubleshooting a possible
>
> hbase
>
> protocol
>
> mismatch issue?
>
>
>
> Cheers,
>
> Will
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
>
> <sh...@kt.com>
>
> wrote:
>
>
>
> Hi Will,
>
>
>
>
>
>
>
> I installed flume as part of CDH3u4 version 1.1 using yum install
>
> flume-ng. One more point, I am using flume-ng hbase sink
>
> downloaded
>
> from:
>
>
>
>
>
>
>
>
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar<https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1..1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar>
>
>
>
>
>
>
>
> Now, I ran the agent with -conf parameter with updated
>
> log4j.properties. I
>
> don't see any error in the log. Please see the below from the log
>
> file:
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
>
> Starting
>
> lifecycle supervisor 1
>
>
>
> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
>
> -
>
> hbase-agent
>
>
>
> 2012-06-21 18:25:08,146 INFO
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager starting
>
>
>
> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
>
> Starting
>
> lifecycle supervisor 9
>
>
>
> 2012-06-21 18:25:08,146 INFO
>
> properties.PropertiesFileConfigurationProvider: Configuration
>
> provider
>
> starting
>
>
>
> 2012-06-21 18:25:08,148 DEBUG
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager started
>
>
>
> 2012-06-21 18:25:08,148 DEBUG
>
> properties.PropertiesFileConfigurationProvider: Configuration
>
> provider
>
> started
>
>
>
> 2012-06-21 18:25:08,149 DEBUG
>
> properties.PropertiesFileConfigurationProvider: Checking
>
> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>
>
>
> 2012-06-21 18:25:08,149 INFO
>
> properties.PropertiesFileConfigurationProvider: Reloading
>
> configuration
>
> file:/home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
>
> sinks:
>
> sink1
>
> Agent: hbase-agent
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
>
> context
>
> for
>
> sink1: serializer.rowPrefix
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
>
> validation
>
> of configuration for agent: hbase-agent, initial-configuration:
>
> AgentConfiguration[hbase-agent]
>
>
>
> SOURCES: {tail={ parameters:{command=tail -f
>
> /home/hadoop/demo.txt,
>
> channels=ch1, type=exec} }}
>
>
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
>
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>
> serializer.keyType=timestamp,
>
>
>
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>
> serializer.incrementColumn=col1, column=foo,
>
> serializer.rowPrefix=1,
>
> batchSize=1, columnFamily=cf1, table=test,
>
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>
> serializer.suffix=timestamp} }}
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
>
> channel
>
> ch1
>
>
>
> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
>
> sink:
>
> sink1 using OTHER
>
>
>
> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
>
> validation
>
> configuration for hbase-agent
>
>
>
> AgentConfiguration created without Configuration stubs for which
>
> only
>
> basic syntactical validation was performed[hbase-agent]
>
>
>
> SOURCES: {tail={ parameters:{command=tail -f
>
> /home/hadoop/demo.txt,
>
> channels=ch1, type=exec} }}
>
>
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
>
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>
> serializer.keyType=timestamp,
>
>
>
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>
> serializer.incrementColumn=col1, column=foo,
>
> serializer.rowPrefix=1,
>
> batchSize=1, columnFamily=cf1, table=test,
>
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>
> serializer.suffix=timestamp} }}
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
>
> Channels:ch1
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
>
> sink1
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
>
> tail
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
>
> Post-validation
>
> flume configuration contains configuration for agents:
>
> [hbase-agent]
>
>
>
> 2012-06-21 18:25:08,171 INFO
>
> properties.PropertiesFileConfigurationProvider: Creating channels
>
>
>
> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
>
> Creating
>
> instance of channel ch1 type memory
>
>
>
> 2012-06-21 18:25:08,175 INFO
>
> properties.PropertiesFileConfigurationProvider: created channel
>
> ch1
>
>
>
> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
>
> Creating
>
> instance of source tail, type exec
>
>
>
> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
>
> instance
>
> of
>
> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
>
>
> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>
> org.apache.flume.sink.hbase.HBaseSink is a custom type
>
>
>
> 2012-06-21 18:25:08,298 INFO
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> configuration change:{
>
> sourceRunners:{tail=EventDrivenSourceRunner:
>
> {
>
> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>
> sinkRunners:{sink1=SinkRunner: {
>
> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
>
> counterGroup:{
>
> name:null counters:{} } }}
>
> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>
>
>
> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
>
> starting
>
> with
>
> command:tail -f /home/hadoop/demo.txt
>
>
>
> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
>
> started
>
>
>
>
>
>
>
> Output of the which Flume-ng is:
>
>
>
> /usr/bin/flume-ng
>
>
>
>
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> Cell: 010-7300-0150
>
>
>
> Email: sharma.ashutosh@kt.com
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Will McQueen [mailto:will@cloudera.com <wi...@cloudera.com>]
>
> Sent: Thursday, June 21, 2012 6:07 PM
>
>
>
>
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Hi Sharma,
>
>
>
>
>
>
>
> Could you please describe how you installed flume? Also, I see
>
> you're
>
> getting this warning:
>
>
>
>   Warning: No configuration directory set! Use --conf <dir> to
>
> override.
>
>
>
>
>
>
>
> The log4j.properties that flume provides is stored in the conf
>
> dir.
>
> If
>
> you
>
> specify the flume conf dir, flume can pick it up. So for
>
> troubleshooting you
>
> can try:
>
>
>
>
>
> 1) modifying the log4j.properties within flume's conf dir so that
>
> the
>
> top
>
> reads:
>
> #flume.root.logger=DEBUG,console
>
> flume.root.logger=DEBUG,LOGFILE
>
> flume.log.dir=.
>
> flume.log.file=flume.log
>
>
>
> 2) Run the flume agent while specifying the flume conf dir
>
> (--conf
>
> <dir>)
>
>
>
> 3) What's the output of 'which flume-ng'?
>
>
>
> Cheers,
>
> Will
>
>
>
> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>
> <sh...@kt.com> wrote:
>
>
>
> Hi Hari,
>
>
>
>
>
>
>
> I checked, agent is successfully tailing the file which I
>
> mentioned.
>
> Yes,
>
> you are right, agent has started properly without any error.
>
> Because
>
> there
>
> is no further movement, so it's hard for me to identify the
>
> issue. I
>
> also
>
> used tail -F also, but no success.
>
>
>
> Can you suggest me some technique to troubleshoot it, so I could
>
> identify
>
> the issue and resolve the same. Does flume record some log
>
> anywhere?
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> Cell: 010-7300-0150
>
>
>
> Email: sharma.ashutosh@kt.com
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>
> ]
>
> Sent: Thursday, June 21, 2012 5:25 PM
>
>
>
>
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> I am not sure if HBase changed their wire protocol between these
>
> versions.
>
> Looks like your agent has started properly. Are you sure data is
>
> being
>
> written into the file being tailed? I suggest using tail -F. The
>
> log
>
> being
>
> stuck here is ok, that is probably because nothing specific is
>
> required(or
>
> your log file rotated).
>
>
>
>
>
>
>
> Thanks
>
>
>
> Hari
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>
>
> Hi Hari,
>
>
>
>
>
>
>
> Thanks for your prompt reply. I already created the table in
>
> Hbase
>
> with
>
> a
>
> column family and hadoop/hbase library is available to hadoop. I
>
> noticed
>
> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>
>
>
> Please see the below lines captured while running the flume
>
> agent:
>
>
>
>
>
>
>
>   flume-ng agent -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
>
>
> Warning: No configuration directory set! Use --conf <dir> to
>
> override.
>
>
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
>
> HDFS
>
> access
>
>
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4..3.jar from
>
> classpath
>
>
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
>
> from
>
> classpath
>
>
>
> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>
>
>
>
>
>
>
>
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera..1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>
>
>
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>
> org.apache.flume.node.Application -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>
> lifecycle
>
> supervisor 1
>
>
>
> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
>
> hbase-agent
>
>
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager
>
> starting
>
>
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>
> lifecycle
>
> supervisor 10
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Configuration provider starting
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
>
> sink1
>
> Agent:
>
> hbase-agent
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
>
> flume
>
> configuration contains configuration for agents: [hbase-agent]
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Creating channels
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> created channel ch1
>
>
>
> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
>
> of
>
> sink
>
> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
>
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> configuration change:{
>
> sourceRunners:{tail=EventDrivenSourceRunner:
>
> {
>
> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>
> sinkRunners:{sink1=SinkRunner: {
>
> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
>
> counterGroup:{
>
> name:null counters:{} } }}
>
> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>
>
>
> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
>
> with
>
> command:tail -f /home/hadoop/demo.txt
>
>
>
>
>
>
>
> Screen stuck here....no movement.
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>
> ]
>
> Sent: Thursday, June 21, 2012 5:01 PM
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Hi Ashutosh,
>
>
>
>
>
>
>
> The sink will not create the table or column family. Make sure
>
> you
>
> have
>
> the table and column family. Also please make sure you have
>
> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
>
> are
>
> in
>
> your
>
> class path).
>
>
>
>
>
>
>
>
>
>
>
> Thanks
>
>
>
> Hari
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>
>
> Hi,
>
>
>
>
>
>
>
> I have used and followed the same steps which is mentioned in
>
> below
>
> mails
>
> to get start with the hbasesink. But agent is not storing any
>
> data
>
> into
>
> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
>
> the
>
> hbase
>
> information. Even I am able to connect to the hbase server from
>
> that
>
> agent
>
> machine.
>
>
>
>
>
>
>
> Now, I am unable to understand and troubleshoot this problem.
>
> Seeking
>
> advice from the community members....
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> -----Original Message-----
>
>
>
> From: Mohammad Tariq [mailto:dontariq@gmail.com <do...@gmail.com>]
>
>
>
> Sent: Friday, June 15, 2012 9:02 AM
>
>
>
> To: flume-user@incubator.apache.org
>
>
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Thank you so much Hari for the valuable response..I'll follow the
>
> guidelines provided by you.
>
>
>
>
>
>
>
> Regards,
>
>
>
> Mohammad Tariq
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>
> <hs...@cloudera.com> wrote:
>
>
>
> Hi Mohammad,
>
>
>
>
>
>
>
> My answers are inline.
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
>
>
>
>
>
>
> Hello list,
>
>
>
>
>
>
>
> I am trying to use hbase-sink to collect data from a local file
>
> and
>
>
>
> dump it into an Hbase table..But there are a few things I am not
>
> able
>
>
>
> to understand and need some guidance.
>
>
>
>
>
>
>
> This is the content of my conf file :
>
>
>
>
>
>
>
> hbase-agent.sources = tail
>
>
>
> hbase-agent..sinks = sink1
>
>
>
> hbase-agent.channels = ch1
>
>
>
> hbase-agent.sources.tail.type = exec
>
>
>
> hbase-agent.sources.tail.command = tail -F
>
> /home/mohammad/demo.txt
>
>
>
> hbase-agent.sources.tail.channels = ch1
>
> hbase-agent.sinks.sink1.type
>
> =
>
>
>
> org.apache.flume.sink.hbase.HBaseSink
>
>
>
> hbase-agent.sinks.sink1.channel = ch1
>
>
>
> hbase-agent.sinks.sink1.table = test3
>
>
>
> hbase-agent.sinks.sink1.columnFamily = testing
>
>
>
> hbase-agent.sinks.sink1.column = foo
>
>
>
> hbase-agent.sinks.sink1.serializer =
>
>
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
>
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
>
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
>
>
>
>
> Right now I am taking just some simple text from a file which has
>
>
>
> following content -
>
>
>
>
>
>
>
> value1
>
>
>
> value2
>
>
>
> value3
>
>
>
> value4
>
>
>
> value5
>
>
>
> value6
>
>
>
>
>
>
>
> And my Hbase table looks like -
>
>
>
>
>
>
>
> hbase(main):217:0> scan 'test3'
>
>
>
> ROW COLUMN+CELL
>
>
>
> 11339716704561 column=testing:col1,
>
>
>
> timestamp=1339716707569, value=value1
>
>
>
> 11339716704562 column=testing:col1,
>
>
>
> timestamp=1339716707571, value=value4
>
>
>
> 11339716846594 column=testing:col1,
>
>
>
> timestamp=1339716849608, value=value2
>
>
>
> 11339716846595 column=testing:col1,
>
>
>
> timestamp=1339716849610, value=value1
>
>
>
> 11339716846596 column=testing:col1,
>
>
>
> timestamp=1339716849611, value=value6
>
>
>
> 11339716846597 column=testing:col1,
>
>
>
> timestamp=1339716849614, value=value6
>
>
>
> 11339716846598 column=testing:col1,
>
>
>
> timestamp=1339716849615, value=value5
>
>
>
> 11339716846599 column=testing:col1,
>
>
>
> timestamp=1339716849615, value=value6
>
>
>
> incRow column=testing:col1,
>
>
>
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>
>
>
> 9 row(s) in 0.0580 seconds
>
>
>
>
>
>
>
> Now I have following questions -
>
>
>
>
>
>
>
> 1- Why the timestamp value is different from the row key?(I was
>
> trying
>
>
>
> to make "1+timestamp" as the rowkey)
>
>
>
>
>
>
>
> The value shown by hbase shell as timestamp is the time at which
>
> the
>
>
>
> value was inserted into Hbase, while the value inserted by Flume
>
> is
>
>
>
> the timestamp at which the sink read the event from the channel.
>
>
>
> Depending on how long the network and HBase takes, these
>
> timestamps
>
>
>
> can vary. If you want 1+timestamp as row key then you should
>
> configure
>
> it:
>
>
>
>
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>
>
>
> appended as-is to the suffix you choose.
>
>
>
>
>
>
>
> 2- Although I am not using "incRow", it stills appear in the
>
> table
>
>
>
> with some value. Why so and what is this value??
>
>
>
>
>
>
>
> The SimpleHBaseEventSerializer is only an example class. For
>
> custom
>
>
>
> use cases you can write your own serializer by implementing
>
>
>
> HbaseEventSerializer. In this case, you have specified
>
>
>
> incrementColumn, which causes an increment on the column
>
> specified.
>
>
>
> Simply don't specify that config and that row will not appear.
>
>
>
>
>
>
>
> 3- How can avoid the last row??
>
>
>
>
>
>
>
> See above.
>
>
>
>
>
>
>
>
>
>
>
> I am still in the learning phase so please pardon my
>
> ignorance..Many
>
> thanks.
>
>
>
>
>
>
>
> No problem. Much of this is documented
>
>
>
> here:
>
>
>
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Regards,
>
>
>
> Mohammad Tariq
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
>
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
> --
>
> Regards,
>
> Rahul Patodi
>
>
>
>
>
>
>
>
>
> --
>
> Regards,
>
> Rahul Patodi
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다.. 만약, 본 메일이
> 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>



-- 
*Regards*,
Rahul Patodi

RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Hari,

My **configuration is :
                                hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/hadoop/demo.txt
hbase-agent.sources.tail.channels = ch1

hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = demo
hbase-agent.sinks.sink1.columnFamily = cf
hbase-agent.sinks.sink1.column = foo
hbase-agent.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1
hbase-agent.sinks.sink1.serializer.incrementColumn = col1
hbase-agent.sinks.sink1.serializer.rowPrefix = 1+
hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type = memory

I also added the following **parameter in .bash_profile. All hadoop, hbase, and hbase-site.xml is accessible:
                                HADOOP_HOME
                                HBASE_HOME
                                CLASSPATH  /// As mentioned by Rahul.

Here is **log which flume agent generates(it’s in debug mode):
>>>>>>>>>>>$ flume-ng  agent -n hbase-agent --conf /etc/flume-ng/conf/ -f /etc/flume-ng/conf/hbase-agent  ///***Flume agent command

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath
+ exec /usr/lib/jvm/java-6-sun/bin/java -Xmx20m -cp '/etc/flume-ng/conf:/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/opt/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /etc/flume-ng/conf/hbase-agent
2012-06-22 14:33:05,759 (main) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)] Starting lifecycle supervisor 1
2012-06-22 14:33:05,763 (main) [INFO - org.apache.flume.node.FlumeNode.start(FlumeNode.java:54)] Flume node starting - hbase-agent
2012-06-22 14:33:05,766 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:140)] Node manager starting
2012-06-22 14:33:05,768 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)] Starting lifecycle supervisor 10
2012-06-22 14:33:05,769 (lifecycleSupervisor-1-2) [INFO - org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:67)] Configuration provider starting
2012-06-22 14:33:05,769 (lifecycleSupervisor-1-0) [DEBUG - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:144)] Node manager started
2012-06-22 14:33:05,770 (lifecycleSupervisor-1-2) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:87)] Configuration provider started
2012-06-22 14:33:05,770 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:189)] Checking file:/etc/flume-ng/conf/hbase-agent for changes
2012-06-22 14:33:05,771 (conf-file-poller-0) [INFO - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:196)] Reloading configuration file:/etc/flume-ng/conf/hbase-agent
2012-06-22 14:33:05,775 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,775 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:973)] Created context for sink1: serializer
2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:886)] Added sinks: sink1 Agent: hbase-agent
2012-06-22 14:33:05,776 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,778 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,779 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,779 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:969)] Processing:sink1
2012-06-22 14:33:05,779 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:284)] Starting validation of configuration for agent: hbase-agent, initial-configuration: AgentConfiguration[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -F /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1+, columnFamily=cf, table=demo, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}

2012-06-22 14:33:05,784 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:438)] Created channel ch1
2012-06-22 14:33:05,796 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:633)] Creating sink: sink1 using OTHER
2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:342)] Post validation configuration for hbase-agent
AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -F /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1+, columnFamily=cf, table=demo, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}

2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:128)] Channels:ch1

2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:129)] Sinks sink1

2012-06-22 14:33:05,802 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:130)] Sources tail

2012-06-22 14:33:05,802 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:133)] Post-validation flume configuration contains configuration  for agents: [hbase-agent]
2012-06-22 14:33:05,802 (conf-file-poller-0) [INFO - org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:246)] Creating channels
2012-06-22 14:33:05,803 (conf-file-poller-0) [DEBUG - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:68)] Creating instance of channel ch1 type memory
2012-06-22 14:33:05,807 (conf-file-poller-0) [INFO - org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadChannels(PropertiesFileConfigurationProvider.java:262)] created channel ch1
2012-06-22 14:33:05,807 (conf-file-poller-0) [DEBUG - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:74)] Creating instance of source tail, type exec
2012-06-22 14:33:05,813 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)] Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
2012-06-22 14:33:05,813 (conf-file-poller-0) [DEBUG - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:78)] Sink type org.apache.flume.sink.hbase.HBaseSink is a custom type
2012-06-22 14:33:05,943 (conf-file-poller-0) [INFO - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.onNodeConfigurationChanged(DefaultLogicalNodeManager.java:54)] Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@311671b2 }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3882764b counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@7d2452e8} }
2012-06-22 14:33:05,949 (lifecycleSupervisor-1-2) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:145)] Exec source starting with command:tail -F /home/hadoop/demo.txt
2012-06-22 14:33:05,950 (lifecycleSupervisor-1-2) [DEBUG - org.apache.flume.source.ExecSource.start(ExecSource.java:163)] Exec source started


Still, it’s not moving any data into hbase.

Rahul: Could you please recall, what additional configuration or troubleshooting you did. As I aware that, yesterday we were in same situation.
----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Friday, June 22, 2012 12:39 PM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Hi,

Could you please make sure you hbase-site.xml is in the class path which flume is using. If the log you sent earlier was the only log you had, it means Hbase sink is unable to connect/write to Hbase. it definitely seems like Hbase client API is unable to connect. Please send your configuration too.


Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 8:01 PM, ashutosh(오픈플랫폼개발팀) wrote:

Hi Folks,



I tried every options, but didn’t get any success yet. I am still not able to store data into Hbase. It seems that Hbase agent is working fine without reporting any error/warning.  I think there is some issue between hbase sink and hbase database. Can you please help me to troubleshoot this problem to identify the issue between hbase sink and hbase database. However, I used the same configuration mentioned by Mr. Rahul in chain of mails earlier. But none of these configurations worked for me.



Please…please…please help me.



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Friday, June 22, 2012 2:59 AM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi,



There are a couple of things you should not here:



* If more than one event is read from the channel in the same millisecond, then these events will get written to HBase with the same row key, and one could potentially overwrite the older events, unless you have Hbase configured to support multiple versions.

* Flume does not guarantee ordering or uniqueness, it guarantees at least once delivery. if a transaction fails, then Flume will try to write all events in the transaction again, and may cause duplicates. In case of Hbase the serializer is expected to make sure duplicates do not overwrite non-duplicate data, as mentioned in the Javadocs.



Note that the SimpleHbaseEventSerializer is only an example, you should ideally write your own serializer and plug it in. This will ensure data is written in a way you expect.



Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:

Hi Rahul,



Actually that has nothing to do with Flume..Simply, out of

excitement I used the same file more than once so all these values

went as different versions into the Hbase table. And when you tail a

file without modifying the behavior of the tail command it will take

only last few records and not the entire content of the file. That

could be a reason for the absence of value3..But there is no issue

from Flume's side..It totally depends on tail's behavior .

Regards,

Mohammad Tariq





On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi

<pa...@gmail.com>> wrote:

If you look at the output provided by you in the first mail of this mail

thread:

in your file (on local file system) you have value 1 to 6 (value1, value2,

value3....)

but when you scan in hbase output is value1, value4 , value2 , value1 ,

value6 , value6 , value5 , value6



value3 is not inserted

value 6 is inserted 3 times



did you figure out why so ?







On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com>> wrote:



Both the commands seem similar to me.



Regards,

Mohammad Tariq





On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi

<pa...@gmail.com>> wrote:

Hi Mohammad,

Thanks for your response

I have put this configuration:



hbase-agent.sources=tail

hbase-agent.sinks=sink1

hbase-agent.channels=ch1



hbase-agent.sources.tail.type=exec

hbase-agent.sources.tail.command=tail -F /tmp/test05

hbase-agent.sources.tail.channels=ch1



hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel=ch1

hbase-agent.sinks.sink1.table=t002

hbase-agent.sinks.sink1.columnFamily=cf

hbase-agent.sinks.sink1.column=foo



hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn=col1

hbase-agent.sinks.sink1.serializer.incrementColumn=col1

#hbase-agent.sinks.sink1.serializer.keyType=timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix=1+

hbase-agent.sinks.sink1.serializer.suffix=timestamp



hbase-agent.channels.ch1.type=memory





Data is getting copy into HBase, but I have got another issue:



My input data is simply:



value1

value2

value3

value4

value5

value6

value7

value8

value9



when I run this command in HBase:

hbase(main):129:0> scan 't002', {VERSIONS => 3}

ROW COLUMN+CELL

1+1340279755410 column=cf:col1,

timestamp=1340279758424,

value=value5

1+1340279755410 column=cf:col1,

timestamp=1340279758423,

value=value3

1+1340279755410 column=cf:col1,

timestamp=1340279758417,

value=value1

1+1340279755411 column=cf:col1,

timestamp=1340279758427,

value=value9

1+1340279755411 column=cf:col1,

timestamp=1340279758426,

value=value8

1+1340279755411 column=cf:col1,

timestamp=1340279758425,

value=value7

incRow column=cf:col1,

timestamp=1340279758443,

value=\x00\x00\x00\x00\x00\x00\x00\x09

3 row(s) in 0.0420 seconds



all the data is not getting copy ??



When I run this command with version:

hbase(main):130:0> scan 't002', {VERSIONS => 3}

ROW COLUMN+CELL

1+1340279755410 column=cf:col1,

timestamp=1340279758424,

value=value5

1+1340279755410 column=cf:col1,

timestamp=1340279758423,

value=value3

1+1340279755410 column=cf:col1,

timestamp=1340279758417,

value=value1

1+1340279755411 column=cf:col1,

timestamp=1340279758427,

value=value9

1+1340279755411 column=cf:col1,

timestamp=1340279758426,

value=value8

1+1340279755411 column=cf:col1,

timestamp=1340279758425,

value=value7

1+1340279906637 column=cf:col1,

timestamp=1340279909652,

value=value1

1+1340279906638 column=cf:col1,

timestamp=1340279909659,

value=value6

1+1340279906638 column=cf:col1,

timestamp=1340279909658,

value=value5

1+1340279906638 column=cf:col1,

timestamp=1340279909654,

value=value3

1+1340279906646 column=cf:col1,

timestamp=1340279909659,

value=value7

1+1340279906647 column=cf:col1,

timestamp=1340279909659,

value=value9

incRow column=cf:col1,

timestamp=1340279909677,

value=\x00\x00\x00\x00\x00\x00\x00\x12

7 row(s) in 0.0640 seconds



Please help me understand this.









On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>>

wrote:



Hi Will,



I got it.Thanks for the info.



Regards,

Mohammad Tariq





On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>>

wrote:

Hi Mohammad,



In your config file, I think you need to remove this line:



hbase-agent.sinks.sink1.serializer.keyType = timestamp



I don't see any 'keyType' property in SimpleHbaseEventSerializer.java

(although there is a keyType var that stores the value of the

'suffix'

prop).



Cheers,

Will





On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>>

wrote:



Hi Rahul,



This normally happens when there is some problem in the

configuration file.Create a file called hbase-agent inside your

FLUME_HOME/conf directory and copy this content into it:

hbase-agent.sources = tail

hbase-agent.sinks = sink1

hbase-agent.channels = ch1



hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1



hbase-agent.sinks.sink1.type = org.apache.flume.sink..hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = demo

hbase-agent.sinks.sink1.columnFamily = cf



hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1



hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp



hbase-agent.channels.ch1.type=memory



Then start the agent and see if it works for you. It worked for me.



Regards,

Mohammad Tariq





On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>>

wrote:

Hi Sharma,



So I assume that your command looks something like this:

flume-ng agent -n hbase-agent -f

/home/hadoop/flumeng/hbaseagent.conf

-c /etc/flume-ng/conf



...?



Hari, I saw your comment:



I am not sure if HBase changed their wire protocol between these

versions.

Do you have any other advice about troubleshooting a possible

hbase

protocol

mismatch issue?



Cheers,

Will







On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)

<sh...@kt.com>>

wrote:



Hi Will,







I installed flume as part of CDH3u4 version 1.1 using yum install

flume-ng. One more point, I am using flume-ng hbase sink

downloaded

from:







https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar<https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1..1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar>







Now, I ran the agent with -conf parameter with updated

log4j.properties. I

don't see any error in the log. Please see the below from the log

file:







2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:

Starting

lifecycle supervisor 1



2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting

-

hbase-agent



2012-06-21 18:25:08,146 INFO

nodemanager.DefaultLogicalNodeManager:

Node

manager starting



2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:

Starting

lifecycle supervisor 9



2012-06-21 18:25:08,146 INFO

properties.PropertiesFileConfigurationProvider: Configuration

provider

starting



2012-06-21 18:25:08,148 DEBUG

nodemanager.DefaultLogicalNodeManager:

Node

manager started



2012-06-21 18:25:08,148 DEBUG

properties.PropertiesFileConfigurationProvider: Configuration

provider

started



2012-06-21 18:25:08,149 DEBUG

properties.PropertiesFileConfigurationProvider: Checking

file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf> for changes



2012-06-21 18:25:08,149 INFO

properties.PropertiesFileConfigurationProvider: Reloading

configuration

file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf>



2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added

sinks:

sink1

Agent: hbase-agent



2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created

context

for

sink1: serializer.rowPrefix



2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:

Processing:sink1



2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting

validation

of configuration for agent: hbase-agent, initial-configuration:

AgentConfiguration[hbase-agent]



SOURCES: {tail={ parameters:{command=tail -f

/home/hadoop/demo.txt,

channels=ch1, type=exec} }}



CHANNELS: {ch1={ parameters:{type=memory} }}



SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,

serializer.keyType=timestamp,



serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,

serializer.incrementColumn=col1, column=foo,

serializer.rowPrefix=1,

batchSize=1, columnFamily=cf1, table=test,

type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,

serializer.suffix=timestamp} }}







2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created

channel

ch1



2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating

sink:

sink1 using OTHER



2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post

validation

configuration for hbase-agent



AgentConfiguration created without Configuration stubs for which

only

basic syntactical validation was performed[hbase-agent]



SOURCES: {tail={ parameters:{command=tail -f

/home/hadoop/demo.txt,

channels=ch1, type=exec} }}



CHANNELS: {ch1={ parameters:{type=memory} }}



SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,

serializer.keyType=timestamp,



serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,

serializer.incrementColumn=col1, column=foo,

serializer.rowPrefix=1,

batchSize=1, columnFamily=cf1, table=test,

type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,

serializer.suffix=timestamp} }}



2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:

Channels:ch1







2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks

sink1







2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources

tail







2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:

Post-validation

flume configuration contains configuration for agents:

[hbase-agent]



2012-06-21 18:25:08,171 INFO

properties.PropertiesFileConfigurationProvider: Creating channels



2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:

Creating

instance of channel ch1 type memory



2012-06-21 18:25:08,175 INFO

properties.PropertiesFileConfigurationProvider: created channel

ch1



2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:

Creating

instance of source tail, type exec



2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating

instance

of

sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink



2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type

org.apache.flume.sink.hbase.HBaseSink is a custom type



2012-06-21 18:25:08,298 INFO

nodemanager.DefaultLogicalNodeManager:

Node

configuration change:{

sourceRunners:{tail=EventDrivenSourceRunner:

{

source:org.apache.flume.source.ExecSource@1fd0fafc }}

sinkRunners:{sink1=SinkRunner: {

policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5

counterGroup:{

name:null counters:{} } }}

channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }



2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source

starting

with

command:tail -f /home/hadoop/demo.txt



2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source

started







Output of the which Flume-ng is:



/usr/bin/flume-ng











----------------------------------------



----------------------------------------



Thanks & Regards,



Ashutosh Sharma



Cell: 010-7300-0150



Email: sharma.ashutosh@kt.com<ma...@kt.com>



----------------------------------------







From: Will McQueen [mailto:will@cloudera.com]

Sent: Thursday, June 21, 2012 6:07 PM





To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior







Hi Sharma,







Could you please describe how you installed flume? Also, I see

you're

getting this warning:



Warning: No configuration directory set! Use --conf <dir> to

override.







The log4j.properties that flume provides is stored in the conf

dir.

If

you

specify the flume conf dir, flume can pick it up. So for

troubleshooting you

can try:





1) modifying the log4j.properties within flume's conf dir so that

the

top

reads:

#flume.root.logger=DEBUG,console

flume.root.logger=DEBUG,LOGFILE

flume.log.dir=.

flume.log.file=flume.log



2) Run the flume agent while specifying the flume conf dir

(--conf

<dir>)



3) What's the output of 'which flume-ng'?



Cheers,

Will



On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)

<sh...@kt.com>> wrote:



Hi Hari,







I checked, agent is successfully tailing the file which I

mentioned.

Yes,

you are right, agent has started properly without any error.

Because

there

is no further movement, so it's hard for me to identify the

issue. I

also

used tail -F also, but no success.



Can you suggest me some technique to troubleshoot it, so I could

identify

the issue and resolve the same. Does flume record some log

anywhere?







----------------------------------------



----------------------------------------



Thanks & Regards,



Ashutosh Sharma



Cell: 010-7300-0150



Email: sharma.ashutosh@kt.com<ma...@kt.com>



----------------------------------------







From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]

Sent: Thursday, June 21, 2012 5:25 PM





To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior







I am not sure if HBase changed their wire protocol between these

versions.

Looks like your agent has started properly. Are you sure data is

being

written into the file being tailed? I suggest using tail -F. The

log

being

stuck here is ok, that is probably because nothing specific is

required(or

your log file rotated).







Thanks



Hari







--



Hari Shreedharan







On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:



Hi Hari,







Thanks for your prompt reply. I already created the table in

Hbase

with

a

column family and hadoop/hbase library is available to hadoop. I

noticed

that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?



Please see the below lines captured while running the flume

agent:







flume-ng agent -n hbase-agent -f

/home/hadoop/flumeng/hbaseagent.conf



Warning: No configuration directory set! Use --conf <dir> to

override.



Info: Including Hadoop libraries found via (/usr/bin/hadoop) for

HDFS

access



Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4..3.jar from

classpath



Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar

from

classpath



+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp







'/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera..1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'



-Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64

org.apache.flume.node.Application -n hbase-agent -f

/home/hadoop/flumeng/hbaseagent.conf



12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting

lifecycle

supervisor 1



12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -

hbase-agent



12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:

Node

manager

starting



12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting

lifecycle

supervisor 10



12/06/21 16:40:42 INFO

properties.PropertiesFileConfigurationProvider:

Configuration provider starting



12/06/21 16:40:42 INFO

properties.PropertiesFileConfigurationProvider:

Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf>



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:

sink1

Agent:

hbase-agent



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1



12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation

flume

configuration contains configuration for agents: [hbase-agent]



12/06/21 16:40:42 INFO

properties.PropertiesFileConfigurationProvider:

Creating channels



12/06/21 16:40:42 INFO

properties.PropertiesFileConfigurationProvider:

created channel ch1



12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance

of

sink

sink1 typeorg.apache.flume.sink.hbase.HBaseSink



12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:

Node

configuration change:{

sourceRunners:{tail=EventDrivenSourceRunner:

{

source:org.apache.flume.source.ExecSource@1ed0af9b }}

sinkRunners:{sink1=SinkRunner: {

policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb

counterGroup:{

name:null counters:{} } }}

channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }



12/06/21 16:40:42 INFO source.ExecSource: Exec source starting

with

command:tail -f /home/hadoop/demo.txt







Screen stuck here....no movement.







----------------------------------------



----------------------------------------



Thanks & Regards,



Ashutosh Sharma



----------------------------------------







From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]

Sent: Thursday, June 21, 2012 5:01 PM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior







Hi Ashutosh,







The sink will not create the table or column family. Make sure

you

have

the table and column family. Also please make sure you have

HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they

are

in

your

class path).











Thanks



Hari







--



Hari Shreedharan







On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:



Hi,







I have used and followed the same steps which is mentioned in

below

mails

to get start with the hbasesink. But agent is not storing any

data

into

hbase. I added the hbase-site.xml in $CLASSPATH variable to pick

the

hbase

information. Even I am able to connect to the hbase server from

that

agent

machine.







Now, I am unable to understand and troubleshoot this problem.

Seeking

advice from the community members....







----------------------------------------



----------------------------------------



Thanks & Regards,



Ashutosh Sharma



----------------------------------------







-----Original Message-----



From: Mohammad Tariq [mailto:dontariq@gmail.com]



Sent: Friday, June 15, 2012 9:02 AM



To: flume-user@incubator.apache.org<ma...@incubator.apache.org>



Subject: Re: Hbase-sink behavior







Thank you so much Hari for the valuable response..I'll follow the

guidelines provided by you.







Regards,



Mohammad Tariq











On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan

<hs...@cloudera.com>> wrote:



Hi Mohammad,







My answers are inline.







--



Hari Shreedharan







On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:







Hello list,







I am trying to use hbase-sink to collect data from a local file

and



dump it into an Hbase table..But there are a few things I am not

able



to understand and need some guidance.







This is the content of my conf file :







hbase-agent.sources = tail



hbase-agent..sinks = sink1



hbase-agent.channels = ch1



hbase-agent.sources.tail.type = exec



hbase-agent.sources.tail.command = tail -F

/home/mohammad/demo.txt



hbase-agent.sources.tail.channels = ch1

hbase-agent.sinks.sink1.type

=



org.apache.flume.sink.hbase.HBaseSink



hbase-agent.sinks.sink1.channel = ch1



hbase-agent.sinks.sink1.table = test3



hbase-agent.sinks.sink1.columnFamily = testing



hbase-agent.sinks.sink1.column = foo



hbase-agent.sinks.sink1.serializer =



org.apache.flume.sink.hbase.SimpleHbaseEventSerializer



hbase-agent.sinks.sink1.serializer.payloadColumn = col1



hbase-agent.sinks.sink1.serializer.incrementColumn = col1



hbase-agent.sinks.sink1.serializer.keyType = timestamp



hbase-agent.sinks.sink1.serializer.rowPrefix = 1



hbase-agent.sinks.sink1.serializer.suffix = timestamp



hbase-agent.channels.ch1.type=memory







Right now I am taking just some simple text from a file which has



following content -







value1



value2



value3



value4



value5



value6







And my Hbase table looks like -







hbase(main):217:0> scan 'test3'



ROW COLUMN+CELL



11339716704561 column=testing:col1,



timestamp=1339716707569, value=value1



11339716704562 column=testing:col1,



timestamp=1339716707571, value=value4



11339716846594 column=testing:col1,



timestamp=1339716849608, value=value2



11339716846595 column=testing:col1,



timestamp=1339716849610, value=value1



11339716846596 column=testing:col1,



timestamp=1339716849611, value=value6



11339716846597 column=testing:col1,



timestamp=1339716849614, value=value6



11339716846598 column=testing:col1,



timestamp=1339716849615, value=value5



11339716846599 column=testing:col1,



timestamp=1339716849615, value=value6



incRow column=testing:col1,



timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C



9 row(s) in 0.0580 seconds







Now I have following questions -







1- Why the timestamp value is different from the row key?(I was

trying



to make "1+timestamp" as the rowkey)







The value shown by hbase shell as timestamp is the time at which

the



value was inserted into Hbase, while the value inserted by Flume

is



the timestamp at which the sink read the event from the channel.



Depending on how long the network and HBase takes, these

timestamps



can vary. If you want 1+timestamp as row key then you should

configure

it:







hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is



appended as-is to the suffix you choose.







2- Although I am not using "incRow", it stills appear in the

table



with some value. Why so and what is this value??







The SimpleHBaseEventSerializer is only an example class. For

custom



use cases you can write your own serializer by implementing



HbaseEventSerializer. In this case, you have specified



incrementColumn, which causes an increment on the column

specified.



Simply don't specify that config and that row will not appear.







3- How can avoid the last row??







See above.











I am still in the learning phase so please pardon my

ignorance..Many

thanks.







No problem. Much of this is documented



here:



https://builds.apache.org/job/flume-trunk/site/apidocs/index.html















Regards,



Mohammad Tariq











이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한

없이, 본

문서에

포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,

본

메일이

잘못

전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.



This E-mail may contain confidential information and/or copyright

material. This email is intended for the use of the addressee

only.

If

you

receive this email by mistake, please either delete it without

reproducing,

distributing or retaining copies thereof or notify the sender

immediately.











이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한

없이, 본

문서에

포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,

본

메일이

잘못

전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright

material. This email is intended for the use of the addressee

only.

If

you

receive this email by mistake, please either delete it without

reproducing,

distributing or retaining copies thereof or notify the sender

immediately.











이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한

없이, 본

문서에

포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,

본

메일이

잘못

전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright

material. This email is intended for the use of the addressee

only.

If

you

receive this email by mistake, please either delete it without

reproducing,

distributing or retaining copies thereof or notify the sender

immediately.











이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한

없이, 본

문서에

포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,

본

메일이

잘못

전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright

material. This email is intended for the use of the addressee

only.

If

you

receive this email by mistake, please either delete it without

reproducing,

distributing or retaining copies thereof or notify the sender

immediately.









--

Regards,

Rahul Patodi









--

Regards,

Rahul Patodi




이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다.. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Rahul Patodi <pa...@gmail.com>.
Hi Ashutosh,

Please check whether you have defined these variables in .bashrc
HADOOP_HOME
HADOOP_PREFIX
PATH=/opt/hes/flume-1.1.0-cdh4.0.0/bin:$PATH
FLUME_HOME
FLUME_CONF_DIR
HBASE_HOME
CLASSPATH=$CLASSPATH:HBASE_HOME/conf:HADOOP_HOME/conf

After defining these variables I got flume-HBase Sink working
(Some of the variable might not needed)

On Fri, Jun 22, 2012 at 9:08 AM, Hari Shreedharan <hshreedharan@cloudera.com
> wrote:

>  Hi,
>
> Could you please make sure you hbase-site.xml is in the class path which
> flume is using. If the log you sent earlier was the only log you had, it
> means Hbase sink is unable to connect/write to Hbase. it definitely seems
> like Hbase client API is unable to connect. Please send your configuration
> too.
>
>
> Thanks
> Hari
>
> --
> Hari Shreedharan
>
> On Thursday, June 21, 2012 at 8:01 PM, ashutosh(오픈플랫폼개발팀) wrote:
>
>  Hi Folks,
>
>
>
> I tried every options, but didn’t get any success yet. I am still not able
> to store data into Hbase. It seems that Hbase agent is working fine without
> reporting any error/warning.  I think there is some issue between hbase
> sink and hbase database. Can you please help me to troubleshoot this
> problem to identify the issue between hbase sink and hbase database.
> However, I used the same configuration mentioned by Mr. Rahul in chain of
> mails earlier. But none of these configurations worked for me.
>
>
>
> Please…please…please help me.
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>]
>
> *Sent:* Friday, June 22, 2012 2:59 AM
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi,
>
>
>
> There are a couple of things you should not here:
>
>
>
> * If more than one event is read from the channel in the same millisecond,
> then these events will get written to HBase with the same row key, and one
> could potentially overwrite the older events, unless you have Hbase
> configured to support multiple versions.
>
> * Flume does not guarantee ordering or uniqueness, it guarantees at least
> once delivery. if a transaction fails, then Flume will try to write all
> events in the transaction again, and may cause duplicates. In case of Hbase
> the serializer is expected to make sure duplicates do not overwrite
> non-duplicate data, as mentioned in the Javadocs.
>
>
>
> Note that the SimpleHbaseEventSerializer is only an example, you should
> ideally write your own serializer and plug it in. This will ensure data is
> written in a way you expect.
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:
>
>   Hi Rahul,
>
>
>
> Actually that has nothing to do with Flume..Simply, out of
>
> excitement I used the same file more than once so all these values
>
> went as different versions into the Hbase table. And when you tail a
>
> file without modifying the behavior of the tail command it will take
>
> only last few records and not the entire content of the file. That
>
> could be a reason for the absence of value3..But there is no issue
>
> from Flume's side..It totally depends on tail's behavior .
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
>
> <pa...@gmail.com> wrote:
>
>  If you look at the output provided by you in the first mail of this mail
>
> thread:
>
> in your file (on local file system) you have value 1 to 6 (value1, value2,
>
> value3....)
>
> but when you scan in hbase output is value1, value4 , value2 , value1 ,
>
> value6 , value6 , value5 , value6
>
>
>
> value3 is not inserted
>
> value 6 is inserted 3 times
>
>
>
> did you figure out why so ?
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com>
> wrote:
>
>
>
> Both the commands seem similar to me.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
>
> <pa...@gmail.com> wrote:
>
>  Hi Mohammad,
>
> Thanks for your response
>
> I have put this configuration:
>
>
>
> hbase-agent.sources=tail
>
> hbase-agent.sinks=sink1
>
> hbase-agent.channels=ch1
>
>
>
> hbase-agent.sources.tail.type=exec
>
> hbase-agent.sources.tail.command=tail -F /tmp/test05
>
> hbase-agent.sources.tail.channels=ch1
>
>
>
> hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel=ch1
>
> hbase-agent.sinks.sink1.table=t002
>
> hbase-agent.sinks.sink1.columnFamily=cf
>
> hbase-agent.sinks.sink1.column=foo
>
>
>
>
> hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn=col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn=col1
>
> #hbase-agent.sinks.sink1.serializer.keyType=timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix=1+
>
> hbase-agent.sinks.sink1.serializer.suffix=timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
>
>
> Data is getting copy into HBase, but I have got another issue:
>
>
>
> My input data is simply:
>
>
>
> value1
>
> value2
>
> value3
>
> value4
>
> value5
>
> value6
>
> value7
>
> value8
>
> value9
>
>
>
> when I run this command in HBase:
>
> hbase(main):129:0> scan 't002', {VERSIONS => 3}
>
> ROW COLUMN+CELL
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758424,
>
> value=value5
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758423,
>
> value=value3
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758417,
>
> value=value1
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758427,
>
> value=value9
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758426,
>
> value=value8
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758425,
>
> value=value7
>
> incRow column=cf:col1,
>
> timestamp=1340279758443,
>
> value=\x00\x00\x00\x00\x00\x00\x00\x09
>
> 3 row(s) in 0.0420 seconds
>
>
>
> all the data is not getting copy ??
>
>
>
> When I run this command with version:
>
> hbase(main):130:0> scan 't002', {VERSIONS => 3}
>
> ROW COLUMN+CELL
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758424,
>
> value=value5
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758423,
>
> value=value3
>
> 1+1340279755410 column=cf:col1,
>
> timestamp=1340279758417,
>
> value=value1
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758427,
>
> value=value9
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758426,
>
> value=value8
>
> 1+1340279755411 column=cf:col1,
>
> timestamp=1340279758425,
>
> value=value7
>
> 1+1340279906637 column=cf:col1,
>
> timestamp=1340279909652,
>
> value=value1
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value6
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909658,
>
> value=value5
>
> 1+1340279906638 column=cf:col1,
>
> timestamp=1340279909654,
>
> value=value3
>
> 1+1340279906646 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value7
>
> 1+1340279906647 column=cf:col1,
>
> timestamp=1340279909659,
>
> value=value9
>
> incRow column=cf:col1,
>
> timestamp=1340279909677,
>
> value=\x00\x00\x00\x00\x00\x00\x00\x12
>
> 7 row(s) in 0.0640 seconds
>
>
>
> Please help me understand this.
>
>
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>
>
> wrote:
>
>
>
> Hi Will,
>
>
>
> I got it.Thanks for the info.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>
>
> wrote:
>
>  Hi Mohammad,
>
>
>
> In your config file, I think you need to remove this line:
>
>
>
>  hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
>
>
> I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
>
> (although there is a keyType var that stores the value of the
>
> 'suffix'
>
> prop).
>
>
>
> Cheers,
>
> Will
>
>
>
>
>
> On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
>
> wrote:
>
>
>
> Hi Rahul,
>
>
>
> This normally happens when there is some problem in the
>
> configuration file.Create a file called hbase-agent inside your
>
> FLUME_HOME/conf directory and copy this content into it:
>
> hbase-agent.sources = tail
>
> hbase-agent.sinks = sink1
>
> hbase-agent.channels = ch1
>
>
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>
> hbase-agent.sources.tail.channels = ch1
>
>
>
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = demo
>
> hbase-agent.sinks.sink1.columnFamily = cf
>
>
>
> hbase-agent.sinks.sink1.serializer =
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
> Then start the agent and see if it works for you. It worked for me.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
>
> wrote:
>
>  Hi Sharma,
>
>
>
> So I assume that your command looks something like this:
>
> flume-ng agent -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
> -c /etc/flume-ng/conf
>
>
>
> ...?
>
>
>
> Hari, I saw your comment:
>
>
>
>  I am not sure if HBase changed their wire protocol between these
>
> versions.
>
> Do you have any other advice about troubleshooting a possible
>
> hbase
>
> protocol
>
> mismatch issue?
>
>
>
> Cheers,
>
> Will
>
>
>
>
>
>
>
> On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
>
> <sh...@kt.com>
>
> wrote:
>
>
>
> Hi Will,
>
>
>
>
>
>
>
> I installed flume as part of CDH3u4 version 1.1 using yum install
>
> flume-ng. One more point, I am using flume-ng hbase sink
>
> downloaded
>
> from:
>
>
>
>
>
>
>
>
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>
>
>
>
>
>
>
> Now, I ran the agent with -conf parameter with updated
>
> log4j.properties. I
>
> don't see any error in the log. Please see the below from the log
>
> file:
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
>
> Starting
>
> lifecycle supervisor 1
>
>
>
> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
>
> -
>
> hbase-agent
>
>
>
> 2012-06-21 18:25:08,146 INFO
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager starting
>
>
>
> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
>
> Starting
>
> lifecycle supervisor 9
>
>
>
> 2012-06-21 18:25:08,146 INFO
>
> properties.PropertiesFileConfigurationProvider: Configuration
>
> provider
>
> starting
>
>
>
> 2012-06-21 18:25:08,148 DEBUG
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager started
>
>
>
> 2012-06-21 18:25:08,148 DEBUG
>
> properties.PropertiesFileConfigurationProvider: Configuration
>
> provider
>
> started
>
>
>
> 2012-06-21 18:25:08,149 DEBUG
>
> properties.PropertiesFileConfigurationProvider: Checking
>
> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>
>
>
> 2012-06-21 18:25:08,149 INFO
>
> properties.PropertiesFileConfigurationProvider: Reloading
>
> configuration
>
> file:/home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
>
> sinks:
>
> sink1
>
> Agent: hbase-agent
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
>
> context
>
> for
>
> sink1: serializer.rowPrefix
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>
> Processing:sink1
>
>
>
> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
>
> validation
>
> of configuration for agent: hbase-agent, initial-configuration:
>
> AgentConfiguration[hbase-agent]
>
>
>
> SOURCES: {tail={ parameters:{command=tail -f
>
> /home/hadoop/demo.txt,
>
> channels=ch1, type=exec} }}
>
>
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
>
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>
> serializer.keyType=timestamp,
>
>
>
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>
> serializer.incrementColumn=col1, column=foo,
>
> serializer.rowPrefix=1,
>
> batchSize=1, columnFamily=cf1, table=test,
>
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>
> serializer.suffix=timestamp} }}
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
>
> channel
>
> ch1
>
>
>
> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
>
> sink:
>
> sink1 using OTHER
>
>
>
> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
>
> validation
>
> configuration for hbase-agent
>
>
>
> AgentConfiguration created without Configuration stubs for which
>
> only
>
> basic syntactical validation was performed[hbase-agent]
>
>
>
> SOURCES: {tail={ parameters:{command=tail -f
>
> /home/hadoop/demo.txt,
>
> channels=ch1, type=exec} }}
>
>
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
>
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>
> serializer.keyType=timestamp,
>
>
>
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>
> serializer.incrementColumn=col1, column=foo,
>
> serializer.rowPrefix=1,
>
> batchSize=1, columnFamily=cf1, table=test,
>
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>
> serializer.suffix=timestamp} }}
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
>
> Channels:ch1
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
>
> sink1
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
>
> tail
>
>
>
>
>
>
>
> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
>
> Post-validation
>
> flume configuration contains configuration for agents:
>
> [hbase-agent]
>
>
>
> 2012-06-21 18:25:08,171 INFO
>
> properties.PropertiesFileConfigurationProvider: Creating channels
>
>
>
> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
>
> Creating
>
> instance of channel ch1 type memory
>
>
>
> 2012-06-21 18:25:08,175 INFO
>
> properties.PropertiesFileConfigurationProvider: created channel
>
> ch1
>
>
>
> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
>
> Creating
>
> instance of source tail, type exec
>
>
>
> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
>
> instance
>
> of
>
> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
>
>
> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>
> org.apache.flume.sink.hbase.HBaseSink is a custom type
>
>
>
> 2012-06-21 18:25:08,298 INFO
>
> nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> configuration change:{
>
> sourceRunners:{tail=EventDrivenSourceRunner:
>
> {
>
> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>
> sinkRunners:{sink1=SinkRunner: {
>
> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
>
> counterGroup:{
>
> name:null counters:{} } }}
>
> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>
>
>
> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
>
> starting
>
> with
>
> command:tail -f /home/hadoop/demo.txt
>
>
>
> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
>
> started
>
>
>
>
>
>
>
> Output of the which Flume-ng is:
>
>
>
> /usr/bin/flume-ng
>
>
>
>
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> Cell: 010-7300-0150
>
>
>
> Email: sharma.ashutosh@kt.com
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Will McQueen [mailto:will@cloudera.com <wi...@cloudera.com>]
>
> Sent: Thursday, June 21, 2012 6:07 PM
>
>
>
>
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Hi Sharma,
>
>
>
>
>
>
>
> Could you please describe how you installed flume? Also, I see
>
> you're
>
> getting this warning:
>
>
>
>  Warning: No configuration directory set! Use --conf <dir> to
>
> override.
>
>
>
>
>
>
>
> The log4j.properties that flume provides is stored in the conf
>
> dir.
>
> If
>
> you
>
> specify the flume conf dir, flume can pick it up. So for
>
> troubleshooting you
>
> can try:
>
>
>
>
>
> 1) modifying the log4j.properties within flume's conf dir so that
>
> the
>
> top
>
> reads:
>
> #flume.root.logger=DEBUG,console
>
> flume.root.logger=DEBUG,LOGFILE
>
> flume.log.dir=.
>
> flume.log.file=flume.log
>
>
>
> 2) Run the flume agent while specifying the flume conf dir
>
> (--conf
>
> <dir>)
>
>
>
> 3) What's the output of 'which flume-ng'?
>
>
>
> Cheers,
>
> Will
>
>
>
> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>
> <sh...@kt.com> wrote:
>
>
>
> Hi Hari,
>
>
>
>
>
>
>
> I checked, agent is successfully tailing the file which I
>
> mentioned.
>
> Yes,
>
> you are right, agent has started properly without any error.
>
> Because
>
> there
>
> is no further movement, so it's hard for me to identify the
>
> issue. I
>
> also
>
> used tail -F also, but no success.
>
>
>
> Can you suggest me some technique to troubleshoot it, so I could
>
> identify
>
> the issue and resolve the same. Does flume record some log
>
> anywhere?
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> Cell: 010-7300-0150
>
>
>
> Email: sharma.ashutosh@kt.com
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>
> ]
>
> Sent: Thursday, June 21, 2012 5:25 PM
>
>
>
>
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> I am not sure if HBase changed their wire protocol between these
>
> versions.
>
> Looks like your agent has started properly. Are you sure data is
>
> being
>
> written into the file being tailed? I suggest using tail -F. The
>
> log
>
> being
>
> stuck here is ok, that is probably because nothing specific is
>
> required(or
>
> your log file rotated).
>
>
>
>
>
>
>
> Thanks
>
>
>
> Hari
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>
>
> Hi Hari,
>
>
>
>
>
>
>
> Thanks for your prompt reply. I already created the table in
>
> Hbase
>
> with
>
> a
>
> column family and hadoop/hbase library is available to hadoop. I
>
> noticed
>
> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>
>
>
> Please see the below lines captured while running the flume
>
> agent:
>
>
>
>
>
>
>
>  flume-ng agent -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
>
>
> Warning: No configuration directory set! Use --conf <dir> to
>
> override.
>
>
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
>
> HDFS
>
> access
>
>
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
>
> classpath
>
>
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
>
> from
>
> classpath
>
>
>
> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>
>
>
>
>
>
>
>
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>
>
>
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>
> org.apache.flume.node.Application -n hbase-agent -f
>
> /home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>
> lifecycle
>
> supervisor 1
>
>
>
> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
>
> hbase-agent
>
>
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> manager
>
> starting
>
>
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>
> lifecycle
>
> supervisor 10
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Configuration provider starting
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
>
> sink1
>
> Agent:
>
> hbase-agent
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
>
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
>
> flume
>
> configuration contains configuration for agents: [hbase-agent]
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> Creating channels
>
>
>
> 12/06/21 16:40:42 INFO
>
> properties.PropertiesFileConfigurationProvider:
>
> created channel ch1
>
>
>
> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
>
> of
>
> sink
>
> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
>
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>
> Node
>
> configuration change:{
>
> sourceRunners:{tail=EventDrivenSourceRunner:
>
> {
>
> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>
> sinkRunners:{sink1=SinkRunner: {
>
> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
>
> counterGroup:{
>
> name:null counters:{} } }}
>
> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>
>
>
> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
>
> with
>
> command:tail -f /home/hadoop/demo.txt
>
>
>
>
>
>
>
> Screen stuck here....no movement.
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>
> ]
>
> Sent: Thursday, June 21, 2012 5:01 PM
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Hi Ashutosh,
>
>
>
>
>
>
>
> The sink will not create the table or column family. Make sure
>
> you
>
> have
>
> the table and column family. Also please make sure you have
>
> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
>
> are
>
> in
>
> your
>
> class path).
>
>
>
>
>
>
>
>
>
>
>
> Thanks
>
>
>
> Hari
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>
>
> Hi,
>
>
>
>
>
>
>
> I have used and followed the same steps which is mentioned in
>
> below
>
> mails
>
> to get start with the hbasesink. But agent is not storing any
>
> data
>
> into
>
> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
>
> the
>
> hbase
>
> information. Even I am able to connect to the hbase server from
>
> that
>
> agent
>
> machine.
>
>
>
>
>
>
>
> Now, I am unable to understand and troubleshoot this problem.
>
> Seeking
>
> advice from the community members....
>
>
>
>
>
>
>
> ----------------------------------------
>
>
>
> ----------------------------------------
>
>
>
> Thanks & Regards,
>
>
>
> Ashutosh Sharma
>
>
>
> ----------------------------------------
>
>
>
>
>
>
>
> -----Original Message-----
>
>
>
> From: Mohammad Tariq [mailto:dontariq@gmail.com <do...@gmail.com>]
>
>
>
> Sent: Friday, June 15, 2012 9:02 AM
>
>
>
> To: flume-user@incubator.apache.org
>
>
>
> Subject: Re: Hbase-sink behavior
>
>
>
>
>
>
>
> Thank you so much Hari for the valuable response..I'll follow the
>
> guidelines provided by you.
>
>
>
>
>
>
>
> Regards,
>
>
>
> Mohammad Tariq
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>
> <hs...@cloudera.com> wrote:
>
>
>
> Hi Mohammad,
>
>
>
>
>
>
>
> My answers are inline.
>
>
>
>
>
>
>
> --
>
>
>
> Hari Shreedharan
>
>
>
>
>
>
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
>
>
>
>
>
>
> Hello list,
>
>
>
>
>
>
>
> I am trying to use hbase-sink to collect data from a local file
>
> and
>
>
>
> dump it into an Hbase table..But there are a few things I am not
>
> able
>
>
>
> to understand and need some guidance.
>
>
>
>
>
>
>
> This is the content of my conf file :
>
>
>
>
>
>
>
> hbase-agent.sources = tail
>
>
>
> hbase-agent..sinks = sink1
>
>
>
> hbase-agent.channels = ch1
>
>
>
> hbase-agent.sources.tail.type = exec
>
>
>
> hbase-agent.sources.tail.command = tail -F
>
> /home/mohammad/demo.txt
>
>
>
> hbase-agent.sources.tail.channels = ch1
>
> hbase-agent.sinks.sink1.type
>
> =
>
>
>
> org.apache.flume.sink.hbase.HBaseSink
>
>
>
> hbase-agent.sinks.sink1.channel = ch1
>
>
>
> hbase-agent.sinks.sink1.table = test3
>
>
>
> hbase-agent.sinks.sink1.columnFamily = testing
>
>
>
> hbase-agent.sinks.sink1.column = foo
>
>
>
> hbase-agent.sinks.sink1.serializer =
>
>
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
>
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
>
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
>
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
>
>
> hbase-agent.channels.ch1.type=memory
>
>
>
>
>
>
>
> Right now I am taking just some simple text from a file which has
>
>
>
> following content -
>
>
>
>
>
>
>
> value1
>
>
>
> value2
>
>
>
> value3
>
>
>
> value4
>
>
>
> value5
>
>
>
> value6
>
>
>
>
>
>
>
> And my Hbase table looks like -
>
>
>
>
>
>
>
> hbase(main):217:0> scan 'test3'
>
>
>
> ROW COLUMN+CELL
>
>
>
> 11339716704561 column=testing:col1,
>
>
>
> timestamp=1339716707569, value=value1
>
>
>
> 11339716704562 column=testing:col1,
>
>
>
> timestamp=1339716707571, value=value4
>
>
>
> 11339716846594 column=testing:col1,
>
>
>
> timestamp=1339716849608, value=value2
>
>
>
> 11339716846595 column=testing:col1,
>
>
>
> timestamp=1339716849610, value=value1
>
>
>
> 11339716846596 column=testing:col1,
>
>
>
> timestamp=1339716849611, value=value6
>
>
>
> 11339716846597 column=testing:col1,
>
>
>
> timestamp=1339716849614, value=value6
>
>
>
> 11339716846598 column=testing:col1,
>
>
>
> timestamp=1339716849615, value=value5
>
>
>
> 11339716846599 column=testing:col1,
>
>
>
> timestamp=1339716849615, value=value6
>
>
>
> incRow column=testing:col1,
>
>
>
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>
>
>
> 9 row(s) in 0.0580 seconds
>
>
>
>
>
>
>
> Now I have following questions -
>
>
>
>
>
>
>
> 1- Why the timestamp value is different from the row key?(I was
>
> trying
>
>
>
> to make "1+timestamp" as the rowkey)
>
>
>
>
>
>
>
> The value shown by hbase shell as timestamp is the time at which
>
> the
>
>
>
> value was inserted into Hbase, while the value inserted by Flume
>
> is
>
>
>
> the timestamp at which the sink read the event from the channel.
>
>
>
> Depending on how long the network and HBase takes, these
>
> timestamps
>
>
>
> can vary. If you want 1+timestamp as row key then you should
>
> configure
>
> it:
>
>
>
>
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>
>
>
> appended as-is to the suffix you choose.
>
>
>
>
>
>
>
> 2- Although I am not using "incRow", it stills appear in the
>
> table
>
>
>
> with some value. Why so and what is this value??
>
>
>
>
>
>
>
> The SimpleHBaseEventSerializer is only an example class. For
>
> custom
>
>
>
> use cases you can write your own serializer by implementing
>
>
>
> HbaseEventSerializer. In this case, you have specified
>
>
>
> incrementColumn, which causes an increment on the column
>
> specified.
>
>
>
> Simply don't specify that config and that row will not appear.
>
>
>
>
>
>
>
> 3- How can avoid the last row??
>
>
>
>
>
>
>
> See above.
>
>
>
>
>
>
>
>
>
>
>
> I am still in the learning phase so please pardon my
>
> ignorance..Many
>
> thanks.
>
>
>
>
>
>
>
> No problem. Much of this is documented
>
>
>
> here:
>
>
>
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Regards,
>
>
>
> Mohammad Tariq
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
>
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>
> 없이, 본
>
> 문서에
>
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>
> 본
>
> 메일이
>
> 잘못
>
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
>
> material. This email is intended for the use of the addressee
>
> only.
>
> If
>
> you
>
> receive this email by mistake, please either delete it without
>
> reproducing,
>
> distributing or retaining copies thereof or notify the sender
>
> immediately.
>
>
>
>
>
>
>
>
>
> --
>
> Regards,
>
> Rahul Patodi
>
>
>
>
>
>
>
>
>
> --
>
> Regards,
>
> Rahul Patodi
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>


-- 
*Regards*,
Rahul Patodi

Re: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi,  

Could you please make sure you hbase-site.xml is in the class path which flume is using. If the log you sent earlier was the only log you had, it means Hbase sink is unable to connect/write to Hbase. it definitely seems like Hbase client API is unable to connect. Please send your configuration too.


Thanks
Hari

--  
Hari Shreedharan


On Thursday, June 21, 2012 at 8:01 PM, ashutosh(오픈플랫폼개발팀) wrote:

>  
> Hi Folks,
>  
>  
>   
>  
>  
> I tried every options, but didn’t get any success yet. I am still not able to store data into Hbase. It seems that Hbase agent is working fine without reporting any error/warning.  I think there is some issue between hbase sink and hbase database. Can you please help me to troubleshoot this problem to identify the issue between hbase sink and hbase database. However, I used the same configuration mentioned by Mr. Rahul in chain of mails earlier. But none of these configurations worked for me.  
>  
>  
>   
>  
>  
> Please…please…please help me.  
>  
>  
>   
>  
>  
> ----------------------------------------
>  
>  
> ----------------------------------------
>  
>  
> Thanks & Regards,
>  
>  
> Ashutosh Sharma
>  
>  
> ----------------------------------------
>  
>  
>   
>  
>  
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]  
> Sent: Friday, June 22, 2012 2:59 AM
> To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> Subject: Re: Hbase-sink behavior
>  
>  
>  
>   
>  
>  
> Hi,  
>  
>  
>  
>   
>  
>  
>  
> There are a couple of things you should not here:
>  
>  
>  
>   
>  
>  
>  
> * If more than one event is read from the channel in the same millisecond, then these events will get written to HBase with the same row key, and one could potentially overwrite the older events, unless you have Hbase configured to support multiple versions.
>  
>  
>  
> * Flume does not guarantee ordering or uniqueness, it guarantees at least once delivery. if a transaction fails, then Flume will try to write all events in the transaction again, and may cause duplicates. In case of Hbase the serializer is expected to make sure duplicates do not overwrite non-duplicate data, as mentioned in the Javadocs.
>  
>  
>  
>   
>  
>  
>  
> Note that the SimpleHbaseEventSerializer is only an example, you should ideally write your own serializer and plug it in. This will ensure data is written in a way you expect.
>  
>  
>  
>   
>  
>  
>  
> Thanks
>  
>  
>  
> Hari
>  
>  
>  
>   
>  
>  
>  
> --  
>  
>  
>  
> Hari Shreedharan
>  
>  
>  
>   
>  
>  
>  
> On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:
> >  
> > Hi Rahul,
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Actually that has nothing to do with Flume..Simply, out of
> >  
> >  
> >  
> > excitement I used the same file more than once so all these values
> >  
> >  
> >  
> > went as different versions into the Hbase table. And when you tail a
> >  
> >  
> >  
> > file without modifying the behavior of the tail command it will take
> >  
> >  
> >  
> > only last few records and not the entire content of the file. That
> >  
> >  
> >  
> > could be a reason for the absence of value3..But there is no issue
> >  
> >  
> >  
> > from Flume's side..It totally depends on tail's behavior .
> >  
> >  
> >  
> > Regards,
> >  
> >  
> >  
> > Mohammad Tariq
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
> >  
> >  
> >  
> > <patodirahul.hadoop@gmail.com (mailto:patodirahul.hadoop@gmail.com)> wrote:
> >  
> >  
> > >  
> > > If you look at the output provided by you in the first mail of this mail
> > >  
> > >  
> > >  
> > > thread:
> > >  
> > >  
> > >  
> > > in your file (on local file system) you have value 1 to 6 (value1, value2,
> > >  
> > >  
> > >  
> > > value3....)
> > >  
> > >  
> > >  
> > > but when you scan in hbase output is value1, value4 , value2 , value1 ,
> > >  
> > >  
> > >  
> > > value6 , value6 , value5 , value6
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > value3 is not inserted
> > >  
> > >  
> > >  
> > > value 6 is inserted 3 times
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > did you figure out why so ?
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)> wrote:
> > >  
> > >  
> > > >  
> > > >   
> > > >  
> > > >  
> > > >  
> > > > Both the commands seem similar to me.
> > > >  
> > > >  
> > > >  
> > > >   
> > > >  
> > > >  
> > > >  
> > > > Regards,
> > > >  
> > > >  
> > > >  
> > > > Mohammad Tariq
> > > >  
> > > >  
> > > >  
> > > >   
> > > >  
> > > >  
> > > >  
> > > >   
> > > >  
> > > >  
> > > >  
> > > > On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
> > > >  
> > > >  
> > > >  
> > > > <patodirahul.hadoop@gmail.com (mailto:patodirahul.hadoop@gmail.com)> wrote:
> > > >  
> > > >  
> > > > >  
> > > > > Hi Mohammad,
> > > > >  
> > > > >  
> > > > >  
> > > > > Thanks for your response
> > > > >  
> > > > >  
> > > > >  
> > > > > I have put this configuration:
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sources=tail
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks=sink1
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.channels=ch1
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sources.tail.type=exec
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sources.tail.command=tail -F /tmp/test05
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sources.tail.channels=ch1
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.channel=ch1
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.table=t002
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.columnFamily=cf
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.column=foo
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.serializer.payloadColumn=col1
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.serializer.incrementColumn=col1
> > > > >  
> > > > >  
> > > > >  
> > > > > #hbase-agent.sinks.sink1.serializer.keyType=timestamp
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.serializer.rowPrefix=1+
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.sinks.sink1.serializer.suffix=timestamp
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase-agent.channels.ch1.type=memory
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > Data is getting copy into HBase, but I have got another issue:
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > My input data is simply:
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > value1
> > > > >  
> > > > >  
> > > > >  
> > > > > value2
> > > > >  
> > > > >  
> > > > >  
> > > > > value3
> > > > >  
> > > > >  
> > > > >  
> > > > > value4
> > > > >  
> > > > >  
> > > > >  
> > > > > value5
> > > > >  
> > > > >  
> > > > >  
> > > > > value6
> > > > >  
> > > > >  
> > > > >  
> > > > > value7
> > > > >  
> > > > >  
> > > > >  
> > > > > value8
> > > > >  
> > > > >  
> > > > >  
> > > > > value9
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > when I run this command in HBase:
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase(main):129:0> scan 't002', {VERSIONS => 3}
> > > > >  
> > > > >  
> > > > >  
> > > > > ROW COLUMN+CELL
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758424,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value5
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758423,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value3
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758417,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value1
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758427,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value9
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758426,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value8
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758425,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value7
> > > > >  
> > > > >  
> > > > >  
> > > > > incRow column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758443,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=\x00\x00\x00\x00\x00\x00\x00\x09
> > > > >  
> > > > >  
> > > > >  
> > > > > 3 row(s) in 0.0420 seconds
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > all the data is not getting copy ??
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > When I run this command with version:
> > > > >  
> > > > >  
> > > > >  
> > > > > hbase(main):130:0> scan 't002', {VERSIONS => 3}
> > > > >  
> > > > >  
> > > > >  
> > > > > ROW COLUMN+CELL
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758424,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value5
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758423,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value3
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755410 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758417,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value1
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758427,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value9
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758426,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value8
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279755411 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279758425,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value7
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906637 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909652,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value1
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906638 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909659,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value6
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906638 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909658,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value5
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906638 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909654,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value3
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906646 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909659,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value7
> > > > >  
> > > > >  
> > > > >  
> > > > > 1+1340279906647 column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909659,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=value9
> > > > >  
> > > > >  
> > > > >  
> > > > > incRow column=cf:col1,
> > > > >  
> > > > >  
> > > > >  
> > > > > timestamp=1340279909677,
> > > > >  
> > > > >  
> > > > >  
> > > > > value=\x00\x00\x00\x00\x00\x00\x00\x12
> > > > >  
> > > > >  
> > > > >  
> > > > > 7 row(s) in 0.0640 seconds
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > Please help me understand this.
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)>
> > > > >  
> > > > >  
> > > > >  
> > > > > wrote:
> > > > >  
> > > > >  
> > > > > >  
> > > > > >   
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > Hi Will,
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > >   
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > I got it.Thanks for the info.
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > >   
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > Regards,
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > Mohammad Tariq
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > >   
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > >   
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <will@cloudera.com (mailto:will@cloudera.com)>
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > wrote:
> > > > > >  
> > > > > >  
> > > > > > >  
> > > > > > > Hi Mohammad,
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > In your config file, I think you need to remove this line:
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > (although there is a keyType var that stores the value of the
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > 'suffix'
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > prop).
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > Cheers,
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > Will
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > >   
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)>
> > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > > > wrote:
> > > > > > >  
> > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > Hi Rahul,
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > This normally happens when there is some problem in the
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > configuration file.Create a file called hbase-agent inside your
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > FLUME_HOME/conf directory and copy this content into it:
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sources = tail
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks = sink1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.channels = ch1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sources.tail.type = exec
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sources.tail.channels = ch1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.channel = ch1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.table = demo
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.columnFamily = cf
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer =
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > hbase-agent.channels.ch1.type=memory
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > Then start the agent and see if it works for you. It worked for me.
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > Regards,
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > Mohammad Tariq
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > >   
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <will@cloudera.com (mailto:will@cloudera.com)>
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > wrote:
> > > > > > > >  
> > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Hi Sharma,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > So I assume that your command looks something like this:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > flume-ng agent -n hbase-agent -f
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > -c /etc/flume-ng/conf
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > ...?
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Hari, I saw your comment:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > > I am not sure if HBase changed their wire protocol between these
> > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > > versions.
> > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Do you have any other advice about troubleshooting a possible
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > hbase
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > protocol
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > mismatch issue?
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Cheers,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Will
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >   
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > <sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)>
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > wrote:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Will,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I installed flume as part of CDH3u4 version 1.1 using yum install
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume-ng. One more point, I am using flume-ng hbase sink
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > downloaded
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > from:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Now, I ran the agent with -conf parameter with updated
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > log4j.properties. I
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > don't see any error in the log. Please see the below from the log
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > file:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > lifecycle supervisor 1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > -
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,146 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Node
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > manager starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > lifecycle supervisor 9
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,146 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: Configuration
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > provider
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,148 DEBUG
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Node
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > manager started
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,148 DEBUG
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: Configuration
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > provider
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > started
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,149 DEBUG
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: Checking
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > file:/home/hadoop/flumeng/hbaseagent.conf (file:///\\home\hadoop\flumeng\hbaseagent.conf) for changes
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,149 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: Reloading
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configuration
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > file:/home/hadoop/flumeng/hbaseagent.conf (file:///\\home\hadoop\flumeng\hbaseagent.conf)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sinks:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Agent: hbase-agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > context
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > for
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1: serializer.rowPrefix
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > validation
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > of configuration for agent: hbase-agent, initial-configuration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > AgentConfiguration[hbase-agent]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > SOURCES: {tail={ parameters:{command=tail -f
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > /home/hadoop/demo.txt,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > channels=ch1, type=exec} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > CHANNELS: {ch1={ parameters:{type=memory} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.keyType=timestamp,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.incrementColumn=col1, column=foo,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.rowPrefix=1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > batchSize=1, columnFamily=cf1, table=test,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.suffix=timestamp} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > channel
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1 using OTHER
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > validation
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configuration for hbase-agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > AgentConfiguration created without Configuration stubs for which
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > only
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > basic syntactical validation was performed[hbase-agent]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > SOURCES: {tail={ parameters:{command=tail -f
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > /home/hadoop/demo.txt,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > channels=ch1, type=exec} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > CHANNELS: {ch1={ parameters:{type=memory} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.keyType=timestamp,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.incrementColumn=col1, column=foo,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.rowPrefix=1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > batchSize=1, columnFamily=cf1, table=test,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > serializer.suffix=timestamp} }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Channels:ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > tail
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Post-validation
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume configuration contains configuration for agents:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > [hbase-agent]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: Creating channels
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Creating
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > instance of channel ch1 type memory
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,175 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider: created channel
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Creating
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > instance of source tail, type exec
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > instance
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > of
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > org.apache.flume.sink.hbase.HBaseSink is a custom type
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,298 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Node
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configuration change:{
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sourceRunners:{tail=EventDrivenSourceRunner:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > {
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > source:org.apache.flume.source.ExecSource@1fd0fafc }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sinkRunners:{sink1=SinkRunner: {
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > counterGroup:{
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > name:null counters:{} } }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > with
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > command:tail -f /home/hadoop/demo.txt
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > started
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Output of the which Flume-ng is:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > /usr/bin/flume-ng
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks & Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Ashutosh Sharma
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Cell: 010-7300-0150
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Email: sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > From: Will McQueen [mailto:will@cloudera.com]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Sent: Thursday, June 21, 2012 6:07 PM
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Sharma,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Could you please describe how you installed flume? Also, I see
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you're
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > getting this warning:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > > Warning: No configuration directory set! Use --conf <dir> to
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > > override.
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > The log4j.properties that flume provides is stored in the conf
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > dir.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > If
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > specify the flume conf dir, flume can pick it up. So for
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > troubleshooting you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > can try:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 1) modifying the log4j.properties within flume's conf dir so that
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > top
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > reads:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > #flume.root.logger=DEBUG,console
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume.root.logger=DEBUG,LOGFILE
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume.log.dir=.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume.log.file=flume.log
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2) Run the flume agent while specifying the flume conf dir
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > (--conf
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > <dir>)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 3) What's the output of 'which flume-ng'?
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Cheers,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Will
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > <sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)> wrote:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Hari,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I checked, agent is successfully tailing the file which I
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > mentioned.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Yes,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you are right, agent has started properly without any error.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Because
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > there
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > is no further movement, so it's hard for me to identify the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > issue. I
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > also
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > used tail -F also, but no success.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Can you suggest me some technique to troubleshoot it, so I could
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > identify
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the issue and resolve the same. Does flume record some log
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > anywhere?
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks & Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Ashutosh Sharma
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Cell: 010-7300-0150
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Email: sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Sent: Thursday, June 21, 2012 5:25 PM
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I am not sure if HBase changed their wire protocol between these
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > versions.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Looks like your agent has started properly. Are you sure data is
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > being
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > written into the file being tailed? I suggest using tail -F. The
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > log
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > being
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > stuck here is ok, that is probably because nothing specific is
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > required(or
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > your log file rotated).
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hari
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > --
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hari Shreedharan
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Hari,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks for your prompt reply. I already created the table in
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hbase
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > with
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > a
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > column family and hadoop/hbase library is available to hadoop. I
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > noticed
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Please see the below lines captured while running the flume
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > agent:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > flume-ng agent -n hbase-agent -f
> > > > > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Warning: No configuration directory set! Use --conf <dir> to
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > override.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > HDFS
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > access
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > classpath
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > from
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > classpath
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > org.apache.flume.node.Application -n hbase-agent -f
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > lifecycle
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > supervisor 1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Node
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > manager
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > lifecycle
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > supervisor 10
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Configuration provider starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf (file:///\\home\hadoop\flumeng\hbaseagent.conf)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Agent:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > flume
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configuration contains configuration for agents: [hbase-agent]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Creating channels
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > created channel ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > of
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Node
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configuration change:{
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sourceRunners:{tail=EventDrivenSourceRunner:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > {
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > source:org.apache.flume.source.ExecSource@1ed0af9b }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > sinkRunners:{sink1=SinkRunner: {
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > counterGroup:{
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > name:null counters:{} } }}
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > with
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > command:tail -f /home/hadoop/demo.txt
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Screen stuck here....no movement.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks & Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Ashutosh Sharma
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Sent: Thursday, June 21, 2012 5:01 PM
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Ashutosh,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > The sink will not create the table or column family. Make sure
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > have
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the table and column family. Also please make sure you have
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > are
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > in
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > your
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > class path).
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hari
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > --
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hari Shreedharan
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I have used and followed the same steps which is mentioned in
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > below
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > mails
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > to get start with the hbasesink. But agent is not storing any
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > data
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > into
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > information. Even I am able to connect to the hbase server from
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > that
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > agent
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > machine.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Now, I am unable to understand and troubleshoot this problem.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Seeking
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > advice from the community members....
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thanks & Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Ashutosh Sharma
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ----------------------------------------
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > -----Original Message-----
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > From: Mohammad Tariq [mailto:dontariq@gmail.com]
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Sent: Friday, June 15, 2012 9:02 AM
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Thank you so much Hari for the valuable response..I'll follow the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > guidelines provided by you.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Mohammad Tariq
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > <hshreedharan@cloudera.com (mailto:hshreedharan@cloudera.com)> wrote:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hi Mohammad,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > My answers are inline.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > --
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hari Shreedharan
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Hello list,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I am trying to use hbase-sink to collect data from a local file
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > and
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > dump it into an Hbase table..But there are a few things I am not
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > able
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > to understand and need some guidance.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > This is the content of my conf file :
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sources = tail
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent..sinks = sink1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.channels = ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sources.tail.type = exec
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sources.tail.command = tail -F
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > /home/mohammad/demo.txt
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sources.tail.channels = ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.type
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > =
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > org.apache.flume.sink.hbase.HBaseSink
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.channel = ch1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.table = test3
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.columnFamily = testing
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.column = foo
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer =
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.channels.ch1.type=memory
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Right now I am taking just some simple text from a file which has
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > following content -
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value2
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value3
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value4
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value5
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value6
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > And my Hbase table looks like -
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase(main):217:0> scan 'test3'
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ROW COLUMN+CELL
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716704561 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716707569, value=value1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716704562 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716707571, value=value4
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846594 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849608, value=value2
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846595 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849610, value=value1
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846596 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849611, value=value6
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846597 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849614, value=value6
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846598 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849615, value=value5
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 11339716846599 column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849615, value=value6
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > incRow column=testing:col1,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 9 row(s) in 0.0580 seconds
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Now I have following questions -
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 1- Why the timestamp value is different from the row key?(I was
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > trying
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > to make "1+timestamp" as the rowkey)
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > The value shown by hbase shell as timestamp is the time at which
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > value was inserted into Hbase, while the value inserted by Flume
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > is
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > the timestamp at which the sink read the event from the channel.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Depending on how long the network and HBase takes, these
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > timestamps
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > can vary. If you want 1+timestamp as row key then you should
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > configure
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > it:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > appended as-is to the suffix you choose.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 2- Although I am not using "incRow", it stills appear in the
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > table
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > with some value. Why so and what is this value??
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > The SimpleHBaseEventSerializer is only an example class. For
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > custom
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > use cases you can write your own serializer by implementing
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > HbaseEventSerializer. In this case, you have specified
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > incrementColumn, which causes an increment on the column
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > specified.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Simply don't specify that config and that row will not appear.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 3- How can avoid the last row??
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > See above.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > I am still in the learning phase so please pardon my
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > ignorance..Many
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > thanks.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > No problem. Much of this is documented
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > here:
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Regards,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > Mohammad Tariq
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 없이, 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 문서에
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 메일이
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 잘못
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > only.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > If
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > reproducing,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > immediately.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 없이, 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 문서에
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 메일이
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 잘못
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > only.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > If
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > reproducing,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > immediately.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 없이, 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 문서에
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 메일이
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 잘못
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > only.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > If
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > reproducing,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > immediately.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >   
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 없이, 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 문서에
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 본
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 메일이
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 잘못
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > only.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > If
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > you
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > reproducing,
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > > immediately.
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > >  
> > > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > >   
> > > > >  
> > > > >  
> > > > >  
> > > > > --
> > > > >  
> > > > >  
> > > > >  
> > > > > Regards,
> > > > >  
> > > > >  
> > > > >  
> > > > > Rahul Patodi
> > > > >  
> > > > >  
> > > > >  
> > > >  
> > > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > --
> > >  
> > >  
> > >  
> > > Regards,
> > >  
> > >  
> > >  
> > > Rahul Patodi
> > >  
> > >  
> > >  
> >  
> >  
> >  
>  
>  
>   
>  
>  
>  
>   
>  
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.  
> This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.
>  
>  
>  
>  



RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Folks,

I tried every options, but didn’t get any success yet. I am still not able to store data into Hbase. It seems that Hbase agent is working fine without reporting any error/warning.  I think there is some issue between hbase sink and hbase database. Can you please help me to troubleshoot this problem to identify the issue between hbase sink and hbase database. However, I used the same configuration mentioned by Mr. Rahul in chain of mails earlier. But none of these configurations worked for me.

Please…please…please help me.

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Friday, June 22, 2012 2:59 AM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Hi,

There are a couple of things you should not here:

* If more than one event is read from the channel in the same millisecond, then these events will get written to HBase with the same row key, and one could potentially overwrite the older events, unless you have Hbase configured to support multiple versions.
* Flume does not guarantee ordering or uniqueness, it guarantees at least once delivery. if a transaction fails, then Flume will try to write all events in the transaction again, and may cause duplicates. In case of Hbase the serializer is expected to make sure duplicates do not overwrite non-duplicate data, as mentioned in the Javadocs.

Note that the SimpleHbaseEventSerializer is only an example, you should ideally write your own serializer and plug it in. This will ensure data is written in a way you expect.

Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:
Hi Rahul,

Actually that has nothing to do with Flume..Simply, out of
excitement I used the same file more than once so all these values
went as different versions into the Hbase table. And when you tail a
file without modifying the behavior of the tail command it will take
only last few records and not the entire content of the file. That
could be a reason for the absence of value3..But there is no issue
from Flume's side..It totally depends on tail's behavior .
Regards,
Mohammad Tariq


On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
<pa...@gmail.com>> wrote:
If you look at the output provided by you in the first mail of this mail
thread:
in your file (on local file system) you have value 1 to 6 (value1, value2,
value3....)
but when you scan in hbase output is value1, value4 , value2 , value1 ,
value6 , value6 , value5 , value6

value3 is not inserted
value 6 is inserted 3 times

did you figure out why so ?



On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com>> wrote:

Both the commands seem similar to me.

Regards,
Mohammad Tariq


On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
<pa...@gmail.com>> wrote:
Hi Mohammad,
Thanks for your response
I have put this configuration:

hbase-agent.sources=tail
hbase-agent.sinks=sink1
hbase-agent.channels=ch1

hbase-agent.sources.tail.type=exec
hbase-agent.sources.tail.command=tail -F /tmp/test05
hbase-agent.sources.tail.channels=ch1

hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel=ch1
hbase-agent.sinks.sink1.table=t002
hbase-agent.sinks.sink1.columnFamily=cf
hbase-agent.sinks.sink1.column=foo

hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn=col1
hbase-agent.sinks.sink1.serializer.incrementColumn=col1
#hbase-agent.sinks.sink1.serializer.keyType=timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix=1+
hbase-agent.sinks.sink1.serializer.suffix=timestamp

hbase-agent.channels.ch1.type=memory


Data is getting copy into HBase, but I have got another issue:

My input data is simply:

value1
value2
value3
value4
value5
value6
value7
value8
value9

when I run this command in HBase:
hbase(main):129:0> scan 't002', {VERSIONS => 3}
ROW COLUMN+CELL
1+1340279755410 column=cf:col1,
timestamp=1340279758424,
value=value5
1+1340279755410 column=cf:col1,
timestamp=1340279758423,
value=value3
1+1340279755410 column=cf:col1,
timestamp=1340279758417,
value=value1
1+1340279755411 column=cf:col1,
timestamp=1340279758427,
value=value9
1+1340279755411 column=cf:col1,
timestamp=1340279758426,
value=value8
1+1340279755411 column=cf:col1,
timestamp=1340279758425,
value=value7
incRow column=cf:col1,
timestamp=1340279758443,
value=\x00\x00\x00\x00\x00\x00\x00\x09
3 row(s) in 0.0420 seconds

all the data is not getting copy ??

When I run this command with version:
hbase(main):130:0> scan 't002', {VERSIONS => 3}
ROW COLUMN+CELL
1+1340279755410 column=cf:col1,
timestamp=1340279758424,
value=value5
1+1340279755410 column=cf:col1,
timestamp=1340279758423,
value=value3
1+1340279755410 column=cf:col1,
timestamp=1340279758417,
value=value1
1+1340279755411 column=cf:col1,
timestamp=1340279758427,
value=value9
1+1340279755411 column=cf:col1,
timestamp=1340279758426,
value=value8
1+1340279755411 column=cf:col1,
timestamp=1340279758425,
value=value7
1+1340279906637 column=cf:col1,
timestamp=1340279909652,
value=value1
1+1340279906638 column=cf:col1,
timestamp=1340279909659,
value=value6
1+1340279906638 column=cf:col1,
timestamp=1340279909658,
value=value5
1+1340279906638 column=cf:col1,
timestamp=1340279909654,
value=value3
1+1340279906646 column=cf:col1,
timestamp=1340279909659,
value=value7
1+1340279906647 column=cf:col1,
timestamp=1340279909659,
value=value9
incRow column=cf:col1,
timestamp=1340279909677,
value=\x00\x00\x00\x00\x00\x00\x00\x12
7 row(s) in 0.0640 seconds

Please help me understand this.




On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>>
wrote:

Hi Will,

I got it.Thanks for the info.

Regards,
Mohammad Tariq


On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>>
wrote:
Hi Mohammad,

In your config file, I think you need to remove this line:

hbase-agent.sinks.sink1.serializer.keyType = timestamp

I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
(although there is a keyType var that stores the value of the
'suffix'
prop).

Cheers,
Will


On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>>
wrote:

Hi Rahul,

This normally happens when there is some problem in the
configuration file.Create a file called hbase-agent inside your
FLUME_HOME/conf directory and copy this content into it:
hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
hbase-agent.sources.tail.channels = ch1

hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = demo
hbase-agent.sinks.sink1.columnFamily = cf

hbase-agent.sinks.sink1.serializer =
org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix = 1
hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory

Then start the agent and see if it works for you. It worked for me.

Regards,
Mohammad Tariq


On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>>
wrote:
Hi Sharma,

So I assume that your command looks something like this:
flume-ng agent -n hbase-agent -f
/home/hadoop/flumeng/hbaseagent.conf
-c /etc/flume-ng/conf

...?

Hari, I saw your comment:

I am not sure if HBase changed their wire protocol between these
versions.
Do you have any other advice about troubleshooting a possible
hbase
protocol
mismatch issue?

Cheers,
Will



On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
<sh...@kt.com>>
wrote:

Hi Will,



I installed flume as part of CDH3u4 version 1.1 using yum install
flume-ng. One more point, I am using flume-ng hbase sink
downloaded
from:



https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar



Now, I ran the agent with -conf parameter with updated
log4j.properties. I
don't see any error in the log. Please see the below from the log
file:



2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
Starting
lifecycle supervisor 1

2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
-
hbase-agent

2012-06-21 18:25:08,146 INFO
nodemanager.DefaultLogicalNodeManager:
Node
manager starting

2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
Starting
lifecycle supervisor 9

2012-06-21 18:25:08,146 INFO
properties.PropertiesFileConfigurationProvider: Configuration
provider
starting

2012-06-21 18:25:08,148 DEBUG
nodemanager.DefaultLogicalNodeManager:
Node
manager started

2012-06-21 18:25:08,148 DEBUG
properties.PropertiesFileConfigurationProvider: Configuration
provider
started

2012-06-21 18:25:08,149 DEBUG
properties.PropertiesFileConfigurationProvider: Checking
file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf> for changes

2012-06-21 18:25:08,149 INFO
properties.PropertiesFileConfigurationProvider: Reloading
configuration
file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf>

2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
sinks:
sink1
Agent: hbase-agent

2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
context
for
sink1: serializer.rowPrefix

2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
Processing:sink1

2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
validation
of configuration for agent: hbase-agent, initial-configuration:
AgentConfiguration[hbase-agent]

SOURCES: {tail={ parameters:{command=tail -f
/home/hadoop/demo.txt,
channels=ch1, type=exec} }}

CHANNELS: {ch1={ parameters:{type=memory} }}

SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
serializer.keyType=timestamp,

serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
serializer.incrementColumn=col1, column=foo,
serializer.rowPrefix=1,
batchSize=1, columnFamily=cf1, table=test,
type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
serializer.suffix=timestamp} }}



2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
channel
ch1

2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
sink:
sink1 using OTHER

2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
validation
configuration for hbase-agent

AgentConfiguration created without Configuration stubs for which
only
basic syntactical validation was performed[hbase-agent]

SOURCES: {tail={ parameters:{command=tail -f
/home/hadoop/demo.txt,
channels=ch1, type=exec} }}

CHANNELS: {ch1={ parameters:{type=memory} }}

SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
serializer.keyType=timestamp,

serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
serializer.incrementColumn=col1, column=foo,
serializer.rowPrefix=1,
batchSize=1, columnFamily=cf1, table=test,
type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
serializer.suffix=timestamp} }}

2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
Channels:ch1



2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
sink1



2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
tail



2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
Post-validation
flume configuration contains configuration for agents:
[hbase-agent]

2012-06-21 18:25:08,171 INFO
properties.PropertiesFileConfigurationProvider: Creating channels

2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
Creating
instance of channel ch1 type memory

2012-06-21 18:25:08,175 INFO
properties.PropertiesFileConfigurationProvider: created channel
ch1

2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
Creating
instance of source tail, type exec

2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
instance
of
sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink

2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
org.apache.flume.sink.hbase.HBaseSink is a custom type

2012-06-21 18:25:08,298 INFO
nodemanager.DefaultLogicalNodeManager:
Node
configuration change:{
sourceRunners:{tail=EventDrivenSourceRunner:
{
source:org.apache.flume.source.ExecSource@1fd0fafc }}
sinkRunners:{sink1=SinkRunner: {
policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
counterGroup:{
name:null counters:{} } }}
channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }

2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
starting
with
command:tail -f /home/hadoop/demo.txt

2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
started



Output of the which Flume-ng is:

/usr/bin/flume-ng





----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

Cell: 010-7300-0150

Email: sharma.ashutosh@kt.com<ma...@kt.com>

----------------------------------------



From: Will McQueen [mailto:will@cloudera.com]
Sent: Thursday, June 21, 2012 6:07 PM


To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi Sharma,



Could you please describe how you installed flume? Also, I see
you're
getting this warning:

Warning: No configuration directory set! Use --conf <dir> to
override.



The log4j.properties that flume provides is stored in the conf
dir.
If
you
specify the flume conf dir, flume can pick it up. So for
troubleshooting you
can try:


1) modifying the log4j.properties within flume's conf dir so that
the
top
reads:
#flume.root.logger=DEBUG,console
flume.root.logger=DEBUG,LOGFILE
flume.log.dir=.
flume.log.file=flume.log

2) Run the flume agent while specifying the flume conf dir
(--conf
<dir>)

3) What's the output of 'which flume-ng'?

Cheers,
Will

On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
<sh...@kt.com>> wrote:

Hi Hari,



I checked, agent is successfully tailing the file which I
mentioned.
Yes,
you are right, agent has started properly without any error.
Because
there
is no further movement, so it's hard for me to identify the
issue. I
also
used tail -F also, but no success.

Can you suggest me some technique to troubleshoot it, so I could
identify
the issue and resolve the same. Does flume record some log
anywhere?



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

Cell: 010-7300-0150

Email: sharma.ashutosh@kt.com<ma...@kt.com>

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:25 PM


To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



I am not sure if HBase changed their wire protocol between these
versions.
Looks like your agent has started properly. Are you sure data is
being
written into the file being tailed? I suggest using tail -F. The
log
being
stuck here is ok, that is probably because nothing specific is
required(or
your log file rotated).



Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi Hari,



Thanks for your prompt reply. I already created the table in
Hbase
with
a
column family and hadoop/hbase library is available to hadoop. I
noticed
that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?

Please see the below lines captured while running the flume
agent:



flume-ng agent -n hbase-agent -f
/home/hadoop/flumeng/hbaseagent.conf

Warning: No configuration directory set! Use --conf <dir> to
override.

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
HDFS
access

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
classpath

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
from
classpath

+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp



'/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'

-Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
org.apache.flume.node.Application -n hbase-agent -f
/home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
lifecycle
supervisor 1

12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
hbase-agent

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
Node
manager
starting

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
lifecycle
supervisor 10

12/06/21 16:40:42 INFO
properties.PropertiesFileConfigurationProvider:
Configuration provider starting

12/06/21 16:40:42 INFO
properties.PropertiesFileConfigurationProvider:
Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf>

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
sink1
Agent:
hbase-agent

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
flume
configuration contains configuration for agents: [hbase-agent]

12/06/21 16:40:42 INFO
properties.PropertiesFileConfigurationProvider:
Creating channels

12/06/21 16:40:42 INFO
properties.PropertiesFileConfigurationProvider:
created channel ch1

12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
of
sink
sink1 typeorg.apache.flume.sink.hbase.HBaseSink

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
Node
configuration change:{
sourceRunners:{tail=EventDrivenSourceRunner:
{
source:org.apache.flume.source.ExecSource@1ed0af9b }}
sinkRunners:{sink1=SinkRunner: {
policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
counterGroup:{
name:null counters:{} } }}
channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }

12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
with
command:tail -f /home/hadoop/demo.txt



Screen stuck here....no movement.



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:01 PM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi Ashutosh,



The sink will not create the table or column family. Make sure
you
have
the table and column family. Also please make sure you have
HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
are
in
your
class path).





Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi,



I have used and followed the same steps which is mentioned in
below
mails
to get start with the hbasesink. But agent is not storing any
data
into
hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
the
hbase
information. Even I am able to connect to the hbase server from
that
agent
machine.



Now, I am unable to understand and troubleshoot this problem.
Seeking
advice from the community members....



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



-----Original Message-----

From: Mohammad Tariq [mailto:dontariq@gmail.com]

Sent: Friday, June 15, 2012 9:02 AM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior



Thank you so much Hari for the valuable response..I'll follow the
guidelines provided by you.



Regards,

Mohammad Tariq





On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
<hs...@cloudera.com>> wrote:

Hi Mohammad,



My answers are inline.



--

Hari Shreedharan



On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:



Hello list,



I am trying to use hbase-sink to collect data from a local file
and

dump it into an Hbase table..But there are a few things I am not
able

to understand and need some guidance.



This is the content of my conf file :



hbase-agent.sources = tail

hbase-agent..sinks = sink1

hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F
/home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1
hbase-agent.sinks.sink1.type
=

org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = test3

hbase-agent.sinks.sink1.columnFamily = testing

hbase-agent.sinks.sink1.column = foo

hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.incrementColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory



Right now I am taking just some simple text from a file which has

following content -



value1

value2

value3

value4

value5

value6



And my Hbase table looks like -



hbase(main):217:0> scan 'test3'

ROW COLUMN+CELL

11339716704561 column=testing:col1,

timestamp=1339716707569, value=value1

11339716704562 column=testing:col1,

timestamp=1339716707571, value=value4

11339716846594 column=testing:col1,

timestamp=1339716849608, value=value2

11339716846595 column=testing:col1,

timestamp=1339716849610, value=value1

11339716846596 column=testing:col1,

timestamp=1339716849611, value=value6

11339716846597 column=testing:col1,

timestamp=1339716849614, value=value6

11339716846598 column=testing:col1,

timestamp=1339716849615, value=value5

11339716846599 column=testing:col1,

timestamp=1339716849615, value=value6

incRow column=testing:col1,

timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C

9 row(s) in 0.0580 seconds



Now I have following questions -



1- Why the timestamp value is different from the row key?(I was
trying

to make "1+timestamp" as the rowkey)



The value shown by hbase shell as timestamp is the time at which
the

value was inserted into Hbase, while the value inserted by Flume
is

the timestamp at which the sink read the event from the channel.

Depending on how long the network and HBase takes, these
timestamps

can vary. If you want 1+timestamp as row key then you should
configure
it:



hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is

appended as-is to the suffix you choose.



2- Although I am not using "incRow", it stills appear in the
table

with some value. Why so and what is this value??



The SimpleHBaseEventSerializer is only an example class. For
custom

use cases you can write your own serializer by implementing

HbaseEventSerializer. In this case, you have specified

incrementColumn, which causes an increment on the column
specified.

Simply don't specify that config and that row will not appear.



3- How can avoid the last row??



See above.





I am still in the learning phase so please pardon my
ignorance..Many
thanks.



No problem. Much of this is documented

here:

https://builds.apache.org/job/flume-trunk/site/apidocs/index.html







Regards,

Mohammad Tariq





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
없이, 본
문서에
포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
본
메일이
잘못
전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright
material. This email is intended for the use of the addressee
only.
If
you
receive this email by mistake, please either delete it without
reproducing,
distributing or retaining copies thereof or notify the sender
immediately.





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
없이, 본
문서에
포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
본
메일이
잘못
전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright
material. This email is intended for the use of the addressee
only.
If
you
receive this email by mistake, please either delete it without
reproducing,
distributing or retaining copies thereof or notify the sender
immediately.





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
없이, 본
문서에
포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
본
메일이
잘못
전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright
material. This email is intended for the use of the addressee
only.
If
you
receive this email by mistake, please either delete it without
reproducing,
distributing or retaining copies thereof or notify the sender
immediately.





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
없이, 본
문서에
포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
본
메일이
잘못
전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright
material. This email is intended for the use of the addressee
only.
If
you
receive this email by mistake, please either delete it without
reproducing,
distributing or retaining copies thereof or notify the sender
immediately.




--
Regards,
Rahul Patodi




--
Regards,
Rahul Patodi



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi,  

There are a couple of things you should not here:

* If more than one event is read from the channel in the same millisecond, then these events will get written to HBase with the same row key, and one could potentially overwrite the older events, unless you have Hbase configured to support multiple versions.
* Flume does not guarantee ordering or uniqueness, it guarantees at least once delivery. if a transaction fails, then Flume will try to write all events in the transaction again, and may cause duplicates. In case of Hbase the serializer is expected to make sure duplicates do not overwrite non-duplicate data, as mentioned in the Javadocs.

Note that the SimpleHbaseEventSerializer is only an example, you should ideally write your own serializer and plug it in. This will ensure data is written in a way you expect.

Thanks
Hari

--  
Hari Shreedharan


On Thursday, June 21, 2012 at 6:27 AM, Mohammad Tariq wrote:

> Hi Rahul,
>  
> Actually that has nothing to do with Flume..Simply, out of
> excitement I used the same file more than once so all these values
> went as different versions into the Hbase table. And when you tail a
> file without modifying the behavior of the tail command it will take
> only last few records and not the entire content of the file. That
> could be a reason for the absence of value3..But there is no issue
> from Flume's side..It totally depends on tail's behavior .
> Regards,
> Mohammad Tariq
>  
>  
> On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
> <patodirahul.hadoop@gmail.com (mailto:patodirahul.hadoop@gmail.com)> wrote:
> > If you look at the output provided by you in the first mail of this mail
> > thread:
> > in your file (on local file system) you have value 1 to 6 (value1, value2,
> > value3....)
> > but when you scan in hbase output is value1, value4 , value2 , value1 ,
> > value6 , value6 , value5 , value6
> >  
> > value3 is not inserted
> > value 6 is inserted 3 times
> >  
> > did you figure out why so ?
> >  
> >  
> >  
> > On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)> wrote:
> > >  
> > > Both the commands seem similar to me.
> > >  
> > > Regards,
> > > Mohammad Tariq
> > >  
> > >  
> > > On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
> > > <patodirahul.hadoop@gmail.com (mailto:patodirahul.hadoop@gmail.com)> wrote:
> > > > Hi Mohammad,
> > > > Thanks for your response
> > > > I have put this configuration:
> > > >  
> > > > hbase-agent.sources=tail
> > > > hbase-agent.sinks=sink1
> > > > hbase-agent.channels=ch1
> > > >  
> > > > hbase-agent.sources.tail.type=exec
> > > > hbase-agent.sources.tail.command=tail -F /tmp/test05
> > > > hbase-agent.sources.tail.channels=ch1
> > > >  
> > > > hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
> > > > hbase-agent.sinks.sink1.channel=ch1
> > > > hbase-agent.sinks.sink1.table=t002
> > > > hbase-agent.sinks.sink1.columnFamily=cf
> > > > hbase-agent.sinks.sink1.column=foo
> > > >  
> > > > hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > hbase-agent.sinks.sink1.serializer.payloadColumn=col1
> > > > hbase-agent.sinks.sink1.serializer.incrementColumn=col1
> > > > #hbase-agent.sinks.sink1.serializer.keyType=timestamp
> > > > hbase-agent.sinks.sink1.serializer.rowPrefix=1+
> > > > hbase-agent.sinks.sink1.serializer.suffix=timestamp
> > > >  
> > > > hbase-agent.channels.ch1.type=memory
> > > >  
> > > >  
> > > > Data is getting copy into HBase, but I have got another issue:
> > > >  
> > > > My input data is simply:
> > > >  
> > > > value1
> > > > value2
> > > > value3
> > > > value4
> > > > value5
> > > > value6
> > > > value7
> > > > value8
> > > > value9
> > > >  
> > > > when I run this command in HBase:
> > > > hbase(main):129:0> scan 't002', {VERSIONS => 3}
> > > > ROW COLUMN+CELL
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758424,
> > > > value=value5
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758423,
> > > > value=value3
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758417,
> > > > value=value1
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758427,
> > > > value=value9
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758426,
> > > > value=value8
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758425,
> > > > value=value7
> > > > incRow column=cf:col1,
> > > > timestamp=1340279758443,
> > > > value=\x00\x00\x00\x00\x00\x00\x00\x09
> > > > 3 row(s) in 0.0420 seconds
> > > >  
> > > > all the data is not getting copy ??
> > > >  
> > > > When I run this command with version:
> > > > hbase(main):130:0> scan 't002', {VERSIONS => 3}
> > > > ROW COLUMN+CELL
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758424,
> > > > value=value5
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758423,
> > > > value=value3
> > > > 1+1340279755410 column=cf:col1,
> > > > timestamp=1340279758417,
> > > > value=value1
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758427,
> > > > value=value9
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758426,
> > > > value=value8
> > > > 1+1340279755411 column=cf:col1,
> > > > timestamp=1340279758425,
> > > > value=value7
> > > > 1+1340279906637 column=cf:col1,
> > > > timestamp=1340279909652,
> > > > value=value1
> > > > 1+1340279906638 column=cf:col1,
> > > > timestamp=1340279909659,
> > > > value=value6
> > > > 1+1340279906638 column=cf:col1,
> > > > timestamp=1340279909658,
> > > > value=value5
> > > > 1+1340279906638 column=cf:col1,
> > > > timestamp=1340279909654,
> > > > value=value3
> > > > 1+1340279906646 column=cf:col1,
> > > > timestamp=1340279909659,
> > > > value=value7
> > > > 1+1340279906647 column=cf:col1,
> > > > timestamp=1340279909659,
> > > > value=value9
> > > > incRow column=cf:col1,
> > > > timestamp=1340279909677,
> > > > value=\x00\x00\x00\x00\x00\x00\x00\x12
> > > > 7 row(s) in 0.0640 seconds
> > > >  
> > > > Please help me understand this.
> > > >  
> > > >  
> > > >  
> > > >  
> > > > On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)>
> > > > wrote:
> > > > >  
> > > > > Hi Will,
> > > > >  
> > > > > I got it.Thanks for the info.
> > > > >  
> > > > > Regards,
> > > > > Mohammad Tariq
> > > > >  
> > > > >  
> > > > > On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <will@cloudera.com (mailto:will@cloudera.com)>
> > > > > wrote:
> > > > > > Hi Mohammad,
> > > > > >  
> > > > > > In your config file, I think you need to remove this line:
> > > > > >  
> > > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > >  
> > > > > > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
> > > > > > (although there is a keyType var that stores the value of the
> > > > > > 'suffix'
> > > > > > prop).
> > > > > >  
> > > > > > Cheers,
> > > > > > Will
> > > > > >  
> > > > > >  
> > > > > > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <dontariq@gmail.com (mailto:dontariq@gmail.com)>
> > > > > > wrote:
> > > > > > >  
> > > > > > > Hi Rahul,
> > > > > > >  
> > > > > > > This normally happens when there is some problem in the
> > > > > > > configuration file.Create a file called hbase-agent inside your
> > > > > > > FLUME_HOME/conf directory and copy this content into it:
> > > > > > > hbase-agent.sources = tail
> > > > > > > hbase-agent.sinks = sink1
> > > > > > > hbase-agent.channels = ch1
> > > > > > >  
> > > > > > > hbase-agent.sources.tail.type = exec
> > > > > > > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> > > > > > > hbase-agent.sources.tail.channels = ch1
> > > > > > >  
> > > > > > > hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> > > > > > > hbase-agent.sinks.sink1.channel = ch1
> > > > > > > hbase-agent.sinks.sink1.table = demo
> > > > > > > hbase-agent.sinks.sink1.columnFamily = cf
> > > > > > >  
> > > > > > > hbase-agent.sinks.sink1.serializer =
> > > > > > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > > > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > > > > > >  
> > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > > > > > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > > > > > >  
> > > > > > > hbase-agent.channels.ch1.type=memory
> > > > > > >  
> > > > > > > Then start the agent and see if it works for you. It worked for me.
> > > > > > >  
> > > > > > > Regards,
> > > > > > > Mohammad Tariq
> > > > > > >  
> > > > > > >  
> > > > > > > On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <will@cloudera.com (mailto:will@cloudera.com)>
> > > > > > > wrote:
> > > > > > > > Hi Sharma,
> > > > > > > >  
> > > > > > > > So I assume that your command looks something like this:
> > > > > > > > flume-ng agent -n hbase-agent -f
> > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > -c /etc/flume-ng/conf
> > > > > > > >  
> > > > > > > > ...?
> > > > > > > >  
> > > > > > > > Hari, I saw your comment:
> > > > > > > >  
> > > > > > > > > > I am not sure if HBase changed their wire protocol between these
> > > > > > > > > > versions.
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > >  
> > > > > > > > Do you have any other advice about troubleshooting a possible
> > > > > > > > hbase
> > > > > > > > protocol
> > > > > > > > mismatch issue?
> > > > > > > >  
> > > > > > > > Cheers,
> > > > > > > > Will
> > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > > > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
> > > > > > > > <sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)>
> > > > > > > > wrote:
> > > > > > > > >  
> > > > > > > > > Hi Will,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I installed flume as part of CDH3u4 version 1.1 using yum install
> > > > > > > > > flume-ng. One more point, I am using flume-ng hbase sink
> > > > > > > > > downloaded
> > > > > > > > > from:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Now, I ran the agent with -conf parameter with updated
> > > > > > > > > log4j.properties. I
> > > > > > > > > don't see any error in the log. Please see the below from the log
> > > > > > > > > file:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
> > > > > > > > > Starting
> > > > > > > > > lifecycle supervisor 1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
> > > > > > > > > -
> > > > > > > > > hbase-agent
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,146 INFO
> > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > Node
> > > > > > > > > manager starting
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
> > > > > > > > > Starting
> > > > > > > > > lifecycle supervisor 9
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,146 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider: Configuration
> > > > > > > > > provider
> > > > > > > > > starting
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,148 DEBUG
> > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > Node
> > > > > > > > > manager started
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,148 DEBUG
> > > > > > > > > properties.PropertiesFileConfigurationProvider: Configuration
> > > > > > > > > provider
> > > > > > > > > started
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,149 DEBUG
> > > > > > > > > properties.PropertiesFileConfigurationProvider: Checking
> > > > > > > > > file:/home/hadoop/flumeng/hbaseagent.conf for changes
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,149 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider: Reloading
> > > > > > > > > configuration
> > > > > > > > > file:/home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
> > > > > > > > > sinks:
> > > > > > > > > sink1
> > > > > > > > > Agent: hbase-agent
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
> > > > > > > > > context
> > > > > > > > > for
> > > > > > > > > sink1: serializer.rowPrefix
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> > > > > > > > > Processing:sink1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
> > > > > > > > > validation
> > > > > > > > > of configuration for agent: hbase-agent, initial-configuration:
> > > > > > > > > AgentConfiguration[hbase-agent]
> > > > > > > > >  
> > > > > > > > > SOURCES: {tail={ parameters:{command=tail -f
> > > > > > > > > /home/hadoop/demo.txt,
> > > > > > > > > channels=ch1, type=exec} }}
> > > > > > > > >  
> > > > > > > > > CHANNELS: {ch1={ parameters:{type=memory} }}
> > > > > > > > >  
> > > > > > > > > SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> > > > > > > > > serializer.keyType=timestamp,
> > > > > > > > >  
> > > > > > > > > serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> > > > > > > > > serializer.incrementColumn=col1, column=foo,
> > > > > > > > > serializer.rowPrefix=1,
> > > > > > > > > batchSize=1, columnFamily=cf1, table=test,
> > > > > > > > > type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> > > > > > > > > serializer.suffix=timestamp} }}
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
> > > > > > > > > channel
> > > > > > > > > ch1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
> > > > > > > > > sink:
> > > > > > > > > sink1 using OTHER
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
> > > > > > > > > validation
> > > > > > > > > configuration for hbase-agent
> > > > > > > > >  
> > > > > > > > > AgentConfiguration created without Configuration stubs for which
> > > > > > > > > only
> > > > > > > > > basic syntactical validation was performed[hbase-agent]
> > > > > > > > >  
> > > > > > > > > SOURCES: {tail={ parameters:{command=tail -f
> > > > > > > > > /home/hadoop/demo.txt,
> > > > > > > > > channels=ch1, type=exec} }}
> > > > > > > > >  
> > > > > > > > > CHANNELS: {ch1={ parameters:{type=memory} }}
> > > > > > > > >  
> > > > > > > > > SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> > > > > > > > > serializer.keyType=timestamp,
> > > > > > > > >  
> > > > > > > > > serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> > > > > > > > > serializer.incrementColumn=col1, column=foo,
> > > > > > > > > serializer.rowPrefix=1,
> > > > > > > > > batchSize=1, columnFamily=cf1, table=test,
> > > > > > > > > type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> > > > > > > > > serializer.suffix=timestamp} }}
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
> > > > > > > > > Channels:ch1
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
> > > > > > > > > sink1
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
> > > > > > > > > tail
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
> > > > > > > > > Post-validation
> > > > > > > > > flume configuration contains configuration for agents:
> > > > > > > > > [hbase-agent]
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider: Creating channels
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
> > > > > > > > > Creating
> > > > > > > > > instance of channel ch1 type memory
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,175 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider: created channel
> > > > > > > > > ch1
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
> > > > > > > > > Creating
> > > > > > > > > instance of source tail, type exec
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
> > > > > > > > > instance
> > > > > > > > > of
> > > > > > > > > sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> > > > > > > > > org.apache.flume.sink.hbase.HBaseSink is a custom type
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,298 INFO
> > > > > > > > > nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > Node
> > > > > > > > > configuration change:{
> > > > > > > > > sourceRunners:{tail=EventDrivenSourceRunner:
> > > > > > > > > {
> > > > > > > > > source:org.apache.flume.source.ExecSource@1fd0fafc }}
> > > > > > > > > sinkRunners:{sink1=SinkRunner: {
> > > > > > > > > policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
> > > > > > > > > counterGroup:{
> > > > > > > > > name:null counters:{} } }}
> > > > > > > > > channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
> > > > > > > > > starting
> > > > > > > > > with
> > > > > > > > > command:tail -f /home/hadoop/demo.txt
> > > > > > > > >  
> > > > > > > > > 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
> > > > > > > > > started
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Output of the which Flume-ng is:
> > > > > > > > >  
> > > > > > > > > /usr/bin/flume-ng
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > Thanks & Regards,
> > > > > > > > >  
> > > > > > > > > Ashutosh Sharma
> > > > > > > > >  
> > > > > > > > > Cell: 010-7300-0150
> > > > > > > > >  
> > > > > > > > > Email: sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > From: Will McQueen [mailto:will@cloudera.com]
> > > > > > > > > Sent: Thursday, June 21, 2012 6:07 PM
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Hi Sharma,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Could you please describe how you installed flume? Also, I see
> > > > > > > > > you're
> > > > > > > > > getting this warning:
> > > > > > > > >  
> > > > > > > > > > > Warning: No configuration directory set! Use --conf <dir> to
> > > > > > > > > > > override.
> > > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > The log4j.properties that flume provides is stored in the conf
> > > > > > > > > dir.
> > > > > > > > > If
> > > > > > > > > you
> > > > > > > > > specify the flume conf dir, flume can pick it up. So for
> > > > > > > > > troubleshooting you
> > > > > > > > > can try:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 1) modifying the log4j.properties within flume's conf dir so that
> > > > > > > > > the
> > > > > > > > > top
> > > > > > > > > reads:
> > > > > > > > > #flume.root.logger=DEBUG,console
> > > > > > > > > flume.root.logger=DEBUG,LOGFILE
> > > > > > > > > flume.log.dir=.
> > > > > > > > > flume.log.file=flume.log
> > > > > > > > >  
> > > > > > > > > 2) Run the flume agent while specifying the flume conf dir
> > > > > > > > > (--conf
> > > > > > > > > <dir>)
> > > > > > > > >  
> > > > > > > > > 3) What's the output of 'which flume-ng'?
> > > > > > > > >  
> > > > > > > > > Cheers,
> > > > > > > > > Will
> > > > > > > > >  
> > > > > > > > > On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
> > > > > > > > > <sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)> wrote:
> > > > > > > > >  
> > > > > > > > > Hi Hari,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I checked, agent is successfully tailing the file which I
> > > > > > > > > mentioned.
> > > > > > > > > Yes,
> > > > > > > > > you are right, agent has started properly without any error.
> > > > > > > > > Because
> > > > > > > > > there
> > > > > > > > > is no further movement, so it's hard for me to identify the
> > > > > > > > > issue. I
> > > > > > > > > also
> > > > > > > > > used tail -F also, but no success.
> > > > > > > > >  
> > > > > > > > > Can you suggest me some technique to troubleshoot it, so I could
> > > > > > > > > identify
> > > > > > > > > the issue and resolve the same. Does flume record some log
> > > > > > > > > anywhere?
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > Thanks & Regards,
> > > > > > > > >  
> > > > > > > > > Ashutosh Sharma
> > > > > > > > >  
> > > > > > > > > Cell: 010-7300-0150
> > > > > > > > >  
> > > > > > > > > Email: sharma.ashutosh@kt.com (mailto:sharma.ashutosh@kt.com)
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> > > > > > > > > Sent: Thursday, June 21, 2012 5:25 PM
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I am not sure if HBase changed their wire protocol between these
> > > > > > > > > versions.
> > > > > > > > > Looks like your agent has started properly. Are you sure data is
> > > > > > > > > being
> > > > > > > > > written into the file being tailed? I suggest using tail -F. The
> > > > > > > > > log
> > > > > > > > > being
> > > > > > > > > stuck here is ok, that is probably because nothing specific is
> > > > > > > > > required(or
> > > > > > > > > your log file rotated).
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Thanks
> > > > > > > > >  
> > > > > > > > > Hari
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > --
> > > > > > > > >  
> > > > > > > > > Hari Shreedharan
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
> > > > > > > > >  
> > > > > > > > > Hi Hari,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Thanks for your prompt reply. I already created the table in
> > > > > > > > > Hbase
> > > > > > > > > with
> > > > > > > > > a
> > > > > > > > > column family and hadoop/hbase library is available to hadoop. I
> > > > > > > > > noticed
> > > > > > > > > that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
> > > > > > > > >  
> > > > > > > > > Please see the below lines captured while running the flume
> > > > > > > > > agent:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > > > > flume-ng agent -n hbase-agent -f
> > > > > > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Warning: No configuration directory set! Use --conf <dir> to
> > > > > > > > > override.
> > > > > > > > >  
> > > > > > > > > Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
> > > > > > > > > HDFS
> > > > > > > > > access
> > > > > > > > >  
> > > > > > > > > Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
> > > > > > > > > classpath
> > > > > > > > >  
> > > > > > > > > Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
> > > > > > > > > from
> > > > > > > > > classpath
> > > > > > > > >  
> > > > > > > > > + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> > > > > > > > >  
> > > > > > > > > -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> > > > > > > > > org.apache.flume.node.Application -n hbase-agent -f
> > > > > > > > > /home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> > > > > > > > > lifecycle
> > > > > > > > > supervisor 1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
> > > > > > > > > hbase-agent
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > Node
> > > > > > > > > manager
> > > > > > > > > starting
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> > > > > > > > > lifecycle
> > > > > > > > > supervisor 10
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > Configuration provider starting
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
> > > > > > > > > sink1
> > > > > > > > > Agent:
> > > > > > > > > hbase-agent
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
> > > > > > > > > flume
> > > > > > > > > configuration contains configuration for agents: [hbase-agent]
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > Creating channels
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO
> > > > > > > > > properties.PropertiesFileConfigurationProvider:
> > > > > > > > > created channel ch1
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
> > > > > > > > > of
> > > > > > > > > sink
> > > > > > > > > sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
> > > > > > > > > Node
> > > > > > > > > configuration change:{
> > > > > > > > > sourceRunners:{tail=EventDrivenSourceRunner:
> > > > > > > > > {
> > > > > > > > > source:org.apache.flume.source.ExecSource@1ed0af9b }}
> > > > > > > > > sinkRunners:{sink1=SinkRunner: {
> > > > > > > > > policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
> > > > > > > > > counterGroup:{
> > > > > > > > > name:null counters:{} } }}
> > > > > > > > > channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
> > > > > > > > >  
> > > > > > > > > 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
> > > > > > > > > with
> > > > > > > > > command:tail -f /home/hadoop/demo.txt
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Screen stuck here....no movement.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > Thanks & Regards,
> > > > > > > > >  
> > > > > > > > > Ashutosh Sharma
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> > > > > > > > > Sent: Thursday, June 21, 2012 5:01 PM
> > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Hi Ashutosh,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > The sink will not create the table or column family. Make sure
> > > > > > > > > you
> > > > > > > > > have
> > > > > > > > > the table and column family. Also please make sure you have
> > > > > > > > > HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
> > > > > > > > > are
> > > > > > > > > in
> > > > > > > > > your
> > > > > > > > > class path).
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Thanks
> > > > > > > > >  
> > > > > > > > > Hari
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > --
> > > > > > > > >  
> > > > > > > > > Hari Shreedharan
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> > > > > > > > >  
> > > > > > > > > Hi,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I have used and followed the same steps which is mentioned in
> > > > > > > > > below
> > > > > > > > > mails
> > > > > > > > > to get start with the hbasesink. But agent is not storing any
> > > > > > > > > data
> > > > > > > > > into
> > > > > > > > > hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
> > > > > > > > > the
> > > > > > > > > hbase
> > > > > > > > > information. Even I am able to connect to the hbase server from
> > > > > > > > > that
> > > > > > > > > agent
> > > > > > > > > machine.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Now, I am unable to understand and troubleshoot this problem.
> > > > > > > > > Seeking
> > > > > > > > > advice from the community members....
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > > Thanks & Regards,
> > > > > > > > >  
> > > > > > > > > Ashutosh Sharma
> > > > > > > > >  
> > > > > > > > > ----------------------------------------
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > -----Original Message-----
> > > > > > > > >  
> > > > > > > > > From: Mohammad Tariq [mailto:dontariq@gmail.com]
> > > > > > > > >  
> > > > > > > > > Sent: Friday, June 15, 2012 9:02 AM
> > > > > > > > >  
> > > > > > > > > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> > > > > > > > >  
> > > > > > > > > Subject: Re: Hbase-sink behavior
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Thank you so much Hari for the valuable response..I'll follow the
> > > > > > > > > guidelines provided by you.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Regards,
> > > > > > > > >  
> > > > > > > > > Mohammad Tariq
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> > > > > > > > > <hshreedharan@cloudera.com (mailto:hshreedharan@cloudera.com)> wrote:
> > > > > > > > >  
> > > > > > > > > Hi Mohammad,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > My answers are inline.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > --
> > > > > > > > >  
> > > > > > > > > Hari Shreedharan
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Hello list,
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I am trying to use hbase-sink to collect data from a local file
> > > > > > > > > and
> > > > > > > > >  
> > > > > > > > > dump it into an Hbase table..But there are a few things I am not
> > > > > > > > > able
> > > > > > > > >  
> > > > > > > > > to understand and need some guidance.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > This is the content of my conf file :
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > hbase-agent.sources = tail
> > > > > > > > >  
> > > > > > > > > hbase-agent..sinks = sink1
> > > > > > > > >  
> > > > > > > > > hbase-agent.channels = ch1
> > > > > > > > >  
> > > > > > > > > hbase-agent.sources.tail.type = exec
> > > > > > > > >  
> > > > > > > > > hbase-agent.sources.tail.command = tail -F
> > > > > > > > > /home/mohammad/demo.txt
> > > > > > > > >  
> > > > > > > > > hbase-agent.sources.tail.channels = ch1
> > > > > > > > > hbase-agent.sinks.sink1.type
> > > > > > > > > =
> > > > > > > > >  
> > > > > > > > > org.apache.flume.sink.hbase.HBaseSink
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.channel = ch1
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.table = test3
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.columnFamily = testing
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.column = foo
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer =
> > > > > > > > >  
> > > > > > > > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > > > > > > > >  
> > > > > > > > > hbase-agent.channels.ch1.type=memory
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Right now I am taking just some simple text from a file which has
> > > > > > > > >  
> > > > > > > > > following content -
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > value1
> > > > > > > > >  
> > > > > > > > > value2
> > > > > > > > >  
> > > > > > > > > value3
> > > > > > > > >  
> > > > > > > > > value4
> > > > > > > > >  
> > > > > > > > > value5
> > > > > > > > >  
> > > > > > > > > value6
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > And my Hbase table looks like -
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > hbase(main):217:0> scan 'test3'
> > > > > > > > >  
> > > > > > > > > ROW COLUMN+CELL
> > > > > > > > >  
> > > > > > > > > 11339716704561 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716707569, value=value1
> > > > > > > > >  
> > > > > > > > > 11339716704562 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716707571, value=value4
> > > > > > > > >  
> > > > > > > > > 11339716846594 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849608, value=value2
> > > > > > > > >  
> > > > > > > > > 11339716846595 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849610, value=value1
> > > > > > > > >  
> > > > > > > > > 11339716846596 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849611, value=value6
> > > > > > > > >  
> > > > > > > > > 11339716846597 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849614, value=value6
> > > > > > > > >  
> > > > > > > > > 11339716846598 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849615, value=value5
> > > > > > > > >  
> > > > > > > > > 11339716846599 column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849615, value=value6
> > > > > > > > >  
> > > > > > > > > incRow column=testing:col1,
> > > > > > > > >  
> > > > > > > > > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > > > > > > > >  
> > > > > > > > > 9 row(s) in 0.0580 seconds
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Now I have following questions -
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 1- Why the timestamp value is different from the row key?(I was
> > > > > > > > > trying
> > > > > > > > >  
> > > > > > > > > to make "1+timestamp" as the rowkey)
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > The value shown by hbase shell as timestamp is the time at which
> > > > > > > > > the
> > > > > > > > >  
> > > > > > > > > value was inserted into Hbase, while the value inserted by Flume
> > > > > > > > > is
> > > > > > > > >  
> > > > > > > > > the timestamp at which the sink read the event from the channel.
> > > > > > > > >  
> > > > > > > > > Depending on how long the network and HBase takes, these
> > > > > > > > > timestamps
> > > > > > > > >  
> > > > > > > > > can vary. If you want 1+timestamp as row key then you should
> > > > > > > > > configure
> > > > > > > > > it:
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> > > > > > > > >  
> > > > > > > > > appended as-is to the suffix you choose.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 2- Although I am not using "incRow", it stills appear in the
> > > > > > > > > table
> > > > > > > > >  
> > > > > > > > > with some value. Why so and what is this value??
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > The SimpleHBaseEventSerializer is only an example class. For
> > > > > > > > > custom
> > > > > > > > >  
> > > > > > > > > use cases you can write your own serializer by implementing
> > > > > > > > >  
> > > > > > > > > HbaseEventSerializer. In this case, you have specified
> > > > > > > > >  
> > > > > > > > > incrementColumn, which causes an increment on the column
> > > > > > > > > specified.
> > > > > > > > >  
> > > > > > > > > Simply don't specify that config and that row will not appear.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 3- How can avoid the last row??
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > See above.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > I am still in the learning phase so please pardon my
> > > > > > > > > ignorance..Many
> > > > > > > > > thanks.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > No problem. Much of this is documented
> > > > > > > > >  
> > > > > > > > > here:
> > > > > > > > >  
> > > > > > > > > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > Regards,
> > > > > > > > >  
> > > > > > > > > Mohammad Tariq
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > 없이, 본
> > > > > > > > > 문서에
> > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > 본
> > > > > > > > > 메일이
> > > > > > > > > 잘못
> > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > >  
> > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > only.
> > > > > > > > > If
> > > > > > > > > you
> > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > reproducing,
> > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > immediately.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > 없이, 본
> > > > > > > > > 문서에
> > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > 본
> > > > > > > > > 메일이
> > > > > > > > > 잘못
> > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > only.
> > > > > > > > > If
> > > > > > > > > you
> > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > reproducing,
> > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > immediately.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > 없이, 본
> > > > > > > > > 문서에
> > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > 본
> > > > > > > > > 메일이
> > > > > > > > > 잘못
> > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > only.
> > > > > > > > > If
> > > > > > > > > you
> > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > reproducing,
> > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > immediately.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> > > > > > > > > 없이, 본
> > > > > > > > > 문서에
> > > > > > > > > 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> > > > > > > > > 본
> > > > > > > > > 메일이
> > > > > > > > > 잘못
> > > > > > > > > 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> > > > > > > > > This E-mail may contain confidential information and/or copyright
> > > > > > > > > material. This email is intended for the use of the addressee
> > > > > > > > > only.
> > > > > > > > > If
> > > > > > > > > you
> > > > > > > > > receive this email by mistake, please either delete it without
> > > > > > > > > reproducing,
> > > > > > > > > distributing or retaining copies thereof or notify the sender
> > > > > > > > > immediately.
> > > > > > > > >  
> > > > > > > >  
> > > > > > > >  
> > > > > > >  
> > > > > > >  
> > > > > >  
> > > > > >  
> > > > >  
> > > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > > --
> > > > Regards,
> > > > Rahul Patodi
> > > >  
> > >  
> > >  
> >  
> >  
> >  
> >  
> >  
> > --
> > Regards,
> > Rahul Patodi
> >  
>  
>  
>  



Re: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Rahul,

          Actually that has nothing to do with Flume..Simply, out of
excitement I used the same file more than once so all these values
went as different versions into the Hbase table. And when you tail a
file without modifying the behavior of the tail command it will take
only last few records and not the entire content of the file. That
could be a reason for the absence of value3..But there is no issue
from Flume's side..It totally depends on tail's behavior .
Regards,
    Mohammad Tariq


On Thu, Jun 21, 2012 at 6:47 PM, Rahul Patodi
<pa...@gmail.com> wrote:
> If you look at the output provided by you in the first mail of this mail
> thread:
> in your file (on local file system) you have value 1 to 6 (value1, value2,
> value3....)
> but when you scan in hbase output is value1, value4 , value2 , value1 ,
> value6 , value6 , value5 , value6
>
> value3 is not inserted
> value 6 is inserted 3 times
>
> did you figure out why so ?
>
>
>
> On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Both the commands seem similar to me.
>>
>> Regards,
>>    Mohammad Tariq
>>
>>
>> On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
>> <pa...@gmail.com> wrote:
>> > Hi Mohammad,
>> > Thanks for your response
>> > I have put this configuration:
>> >
>> > hbase-agent.sources=tail
>> > hbase-agent.sinks=sink1
>> > hbase-agent.channels=ch1
>> >
>> > hbase-agent.sources.tail.type=exec
>> > hbase-agent.sources.tail.command=tail -F /tmp/test05
>> > hbase-agent.sources.tail.channels=ch1
>> >
>> > hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
>> > hbase-agent.sinks.sink1.channel=ch1
>> > hbase-agent.sinks.sink1.table=t002
>> > hbase-agent.sinks.sink1.columnFamily=cf
>> > hbase-agent.sinks.sink1.column=foo
>> >
>> > hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> > hbase-agent.sinks.sink1.serializer.payloadColumn=col1
>> > hbase-agent.sinks.sink1.serializer.incrementColumn=col1
>> > #hbase-agent.sinks.sink1.serializer.keyType=timestamp
>> > hbase-agent.sinks.sink1.serializer.rowPrefix=1+
>> > hbase-agent.sinks.sink1.serializer.suffix=timestamp
>> >
>> > hbase-agent.channels.ch1.type=memory
>> >
>> >
>> > Data is getting copy into HBase, but I have got another issue:
>> >
>> > My input data is simply:
>> >
>> > value1
>> > value2
>> > value3
>> > value4
>> > value5
>> > value6
>> > value7
>> > value8
>> > value9
>> >
>> > when I run this command in HBase:
>> > hbase(main):129:0> scan 't002', {VERSIONS => 3}
>> > ROW                              COLUMN+CELL
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758424,
>> > value=value5
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758423,
>> > value=value3
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758417,
>> > value=value1
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758427,
>> > value=value9
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758426,
>> > value=value8
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758425,
>> > value=value7
>> >  incRow                          column=cf:col1,
>> > timestamp=1340279758443,
>> > value=\x00\x00\x00\x00\x00\x00\x00\x09
>> > 3 row(s) in 0.0420 seconds
>> >
>> > all the data is not getting copy ??
>> >
>> > When I run this command with version:
>> > hbase(main):130:0> scan 't002', {VERSIONS => 3}
>> > ROW                              COLUMN+CELL
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758424,
>> > value=value5
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758423,
>> > value=value3
>> >  1+1340279755410                 column=cf:col1,
>> > timestamp=1340279758417,
>> > value=value1
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758427,
>> > value=value9
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758426,
>> > value=value8
>> >  1+1340279755411                 column=cf:col1,
>> > timestamp=1340279758425,
>> > value=value7
>> >  1+1340279906637                 column=cf:col1,
>> > timestamp=1340279909652,
>> > value=value1
>> >  1+1340279906638                 column=cf:col1,
>> > timestamp=1340279909659,
>> > value=value6
>> >  1+1340279906638                 column=cf:col1,
>> > timestamp=1340279909658,
>> > value=value5
>> >  1+1340279906638                 column=cf:col1,
>> > timestamp=1340279909654,
>> > value=value3
>> >  1+1340279906646                 column=cf:col1,
>> > timestamp=1340279909659,
>> > value=value7
>> >  1+1340279906647                 column=cf:col1,
>> > timestamp=1340279909659,
>> > value=value9
>> >  incRow                          column=cf:col1,
>> > timestamp=1340279909677,
>> > value=\x00\x00\x00\x00\x00\x00\x00\x12
>> > 7 row(s) in 0.0640 seconds
>> >
>> > Please help me understand this.
>> >
>> >
>> >
>> >
>> > On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>
>> > wrote:
>> >>
>> >> Hi Will,
>> >>
>> >>          I got it.Thanks for the info.
>> >>
>> >> Regards,
>> >>    Mohammad Tariq
>> >>
>> >>
>> >> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>
>> >> wrote:
>> >> > Hi Mohammad,
>> >> >
>> >> > In your config file, I think you need to remove this line:
>> >> >
>> >> >>>hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >> >
>> >> > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
>> >> > (although there is a keyType var that stores the value of the
>> >> > 'suffix'
>> >> > prop).
>> >> >
>> >> > Cheers,
>> >> > Will
>> >> >
>> >> >
>> >> > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi Rahul,
>> >> >>
>> >> >>          This normally happens when there is some problem in the
>> >> >> configuration file.Create a file called hbase-agent inside your
>> >> >> FLUME_HOME/conf directory and copy this content into it:
>> >> >> hbase-agent.sources = tail
>> >> >> hbase-agent.sinks = sink1
>> >> >> hbase-agent.channels = ch1
>> >> >>
>> >> >> hbase-agent.sources.tail.type = exec
>> >> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>> >> >> hbase-agent.sources.tail.channels = ch1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
>> >> >> hbase-agent.sinks.sink1.channel = ch1
>> >> >> hbase-agent.sinks.sink1.table = demo
>> >> >> hbase-agent.sinks.sink1.columnFamily = cf
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer =
>> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>> >> >>
>> >> >> hbase-agent.channels.ch1.type=memory
>> >> >>
>> >> >> Then start the agent and see if it works for you. It worked for me.
>> >> >>
>> >> >> Regards,
>> >> >>    Mohammad Tariq
>> >> >>
>> >> >>
>> >> >> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
>> >> >> wrote:
>> >> >> > Hi Sharma,
>> >> >> >
>> >> >> > So I assume that your command looks something like this:
>> >> >> >      flume-ng agent -n hbase-agent -f
>> >> >> > /home/hadoop/flumeng/hbaseagent.conf
>> >> >> > -c /etc/flume-ng/conf
>> >> >> >
>> >> >> > ...?
>> >> >> >
>> >> >> > Hari, I saw your comment:
>> >> >> >
>> >> >> >>>I am not sure if HBase changed their wire protocol between these
>> >> >> >>> versions.
>> >> >> > Do you have any other advice about troubleshooting a possible
>> >> >> > hbase
>> >> >> > protocol
>> >> >> > mismatch issue?
>> >> >> >
>> >> >> > Cheers,
>> >> >> > Will
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
>> >> >> > <sh...@kt.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Hi Will,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I installed flume as part of CDH3u4 version 1.1 using yum install
>> >> >> >> flume-ng. One more point, I am using flume-ng hbase sink
>> >> >> >> downloaded
>> >> >> >> from:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Now, I ran the agent with -conf parameter with updated
>> >> >> >> log4j.properties. I
>> >> >> >> don't see any error in the log. Please see the below from the log
>> >> >> >> file:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
>> >> >> >> Starting
>> >> >> >> lifecycle supervisor 1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting
>> >> >> >> -
>> >> >> >> hbase-agent
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,146 INFO
>> >> >> >> nodemanager.DefaultLogicalNodeManager:
>> >> >> >> Node
>> >> >> >> manager starting
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
>> >> >> >> Starting
>> >> >> >> lifecycle supervisor 9
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,146 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider: Configuration
>> >> >> >> provider
>> >> >> >> starting
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,148 DEBUG
>> >> >> >> nodemanager.DefaultLogicalNodeManager:
>> >> >> >> Node
>> >> >> >> manager started
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,148 DEBUG
>> >> >> >> properties.PropertiesFileConfigurationProvider: Configuration
>> >> >> >> provider
>> >> >> >> started
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,149 DEBUG
>> >> >> >> properties.PropertiesFileConfigurationProvider: Checking
>> >> >> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,149 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider: Reloading
>> >> >> >> configuration
>> >> >> >> file:/home/hadoop/flumeng/hbaseagent.conf
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added
>> >> >> >> sinks:
>> >> >> >> sink1
>> >> >> >> Agent: hbase-agent
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
>> >> >> >> context
>> >> >> >> for
>> >> >> >> sink1: serializer.rowPrefix
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> >> Processing:sink1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
>> >> >> >> validation
>> >> >> >> of configuration for agent: hbase-agent, initial-configuration:
>> >> >> >> AgentConfiguration[hbase-agent]
>> >> >> >>
>> >> >> >> SOURCES: {tail={ parameters:{command=tail -f
>> >> >> >> /home/hadoop/demo.txt,
>> >> >> >> channels=ch1, type=exec} }}
>> >> >> >>
>> >> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >> >> >>
>> >> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> >> >> serializer.keyType=timestamp,
>> >> >> >>
>> >> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> >> >> serializer.incrementColumn=col1, column=foo,
>> >> >> >> serializer.rowPrefix=1,
>> >> >> >> batchSize=1, columnFamily=cf1, table=test,
>> >> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> >> >> serializer.suffix=timestamp} }}
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
>> >> >> >> channel
>> >> >> >> ch1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
>> >> >> >> sink:
>> >> >> >> sink1 using OTHER
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
>> >> >> >> validation
>> >> >> >> configuration for hbase-agent
>> >> >> >>
>> >> >> >> AgentConfiguration created without Configuration stubs for which
>> >> >> >> only
>> >> >> >> basic syntactical validation was performed[hbase-agent]
>> >> >> >>
>> >> >> >> SOURCES: {tail={ parameters:{command=tail -f
>> >> >> >> /home/hadoop/demo.txt,
>> >> >> >> channels=ch1, type=exec} }}
>> >> >> >>
>> >> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >> >> >>
>> >> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> >> >> serializer.keyType=timestamp,
>> >> >> >>
>> >> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> >> >> serializer.incrementColumn=col1, column=foo,
>> >> >> >> serializer.rowPrefix=1,
>> >> >> >> batchSize=1, columnFamily=cf1, table=test,
>> >> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> >> >> serializer.suffix=timestamp} }}
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
>> >> >> >> Channels:ch1
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks
>> >> >> >> sink1
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
>> >> >> >> tail
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
>> >> >> >> Post-validation
>> >> >> >> flume configuration contains configuration  for agents:
>> >> >> >> [hbase-agent]
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider: Creating channels
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
>> >> >> >> Creating
>> >> >> >> instance of channel ch1 type memory
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,175 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider: created channel
>> >> >> >> ch1
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
>> >> >> >> Creating
>> >> >> >> instance of source tail, type exec
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
>> >> >> >> instance
>> >> >> >> of
>> >> >> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>> >> >> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,298 INFO
>> >> >> >> nodemanager.DefaultLogicalNodeManager:
>> >> >> >> Node
>> >> >> >> configuration change:{
>> >> >> >> sourceRunners:{tail=EventDrivenSourceRunner:
>> >> >> >> {
>> >> >> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>> >> >> >> sinkRunners:{sink1=SinkRunner: {
>> >> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
>> >> >> >> counterGroup:{
>> >> >> >> name:null counters:{} } }}
>> >> >> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
>> >> >> >> starting
>> >> >> >> with
>> >> >> >> command:tail -f /home/hadoop/demo.txt
>> >> >> >>
>> >> >> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
>> >> >> >> started
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Output of the which Flume-ng is:
>> >> >> >>
>> >> >> >> /usr/bin/flume-ng
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> Thanks & Regards,
>> >> >> >>
>> >> >> >> Ashutosh Sharma
>> >> >> >>
>> >> >> >> Cell: 010-7300-0150
>> >> >> >>
>> >> >> >> Email: sharma.ashutosh@kt.com
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> From: Will McQueen [mailto:will@cloudera.com]
>> >> >> >> Sent: Thursday, June 21, 2012 6:07 PM
>> >> >> >>
>> >> >> >>
>> >> >> >> To: flume-user@incubator.apache.org
>> >> >> >> Subject: Re: Hbase-sink behavior
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Hi Sharma,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Could you please describe how you installed flume? Also, I see
>> >> >> >> you're
>> >> >> >> getting this warning:
>> >> >> >>
>> >> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
>> >> >> >> >> override.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> The log4j.properties that flume provides is stored in the conf
>> >> >> >> dir.
>> >> >> >> If
>> >> >> >> you
>> >> >> >> specify the flume conf dir, flume can pick it up. So for
>> >> >> >> troubleshooting you
>> >> >> >> can try:
>> >> >> >>
>> >> >> >>
>> >> >> >> 1) modifying the log4j.properties within flume's conf dir so that
>> >> >> >> the
>> >> >> >> top
>> >> >> >> reads:
>> >> >> >> #flume.root.logger=DEBUG,console
>> >> >> >> flume.root.logger=DEBUG,LOGFILE
>> >> >> >> flume.log.dir=.
>> >> >> >> flume.log.file=flume.log
>> >> >> >>
>> >> >> >> 2) Run the flume agent while specifying the flume conf dir
>> >> >> >> (--conf
>> >> >> >> <dir>)
>> >> >> >>
>> >> >> >> 3) What's the output of 'which flume-ng'?
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >> Will
>> >> >> >>
>> >> >> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>> >> >> >> <sh...@kt.com> wrote:
>> >> >> >>
>> >> >> >> Hi Hari,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I checked, agent is successfully tailing the file which I
>> >> >> >> mentioned.
>> >> >> >> Yes,
>> >> >> >> you are right, agent has started properly without any error.
>> >> >> >> Because
>> >> >> >> there
>> >> >> >> is no further movement, so it's hard for me to identify the
>> >> >> >> issue. I
>> >> >> >> also
>> >> >> >> used tail -F also, but no success.
>> >> >> >>
>> >> >> >> Can you suggest me some technique to troubleshoot it, so I could
>> >> >> >> identify
>> >> >> >> the issue and resolve the same. Does flume record some log
>> >> >> >> anywhere?
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> Thanks & Regards,
>> >> >> >>
>> >> >> >> Ashutosh Sharma
>> >> >> >>
>> >> >> >> Cell: 010-7300-0150
>> >> >> >>
>> >> >> >> Email: sharma.ashutosh@kt.com
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> >> >> Sent: Thursday, June 21, 2012 5:25 PM
>> >> >> >>
>> >> >> >>
>> >> >> >> To: flume-user@incubator.apache.org
>> >> >> >> Subject: Re: Hbase-sink behavior
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I am not sure if HBase changed their wire protocol between these
>> >> >> >> versions.
>> >> >> >> Looks like your agent has started properly. Are you sure data is
>> >> >> >> being
>> >> >> >> written into the file being tailed? I suggest using tail -F. The
>> >> >> >> log
>> >> >> >> being
>> >> >> >> stuck here is ok, that is probably because nothing specific is
>> >> >> >> required(or
>> >> >> >> your log file rotated).
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Thanks
>> >> >> >>
>> >> >> >> Hari
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >> Hari Shreedharan
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >> >> >>
>> >> >> >> Hi Hari,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Thanks for your prompt reply. I already created the table in
>> >> >> >> Hbase
>> >> >> >> with
>> >> >> >> a
>> >> >> >> column family and hadoop/hbase library is available to hadoop. I
>> >> >> >> noticed
>> >> >> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>> >> >> >>
>> >> >> >> Please see the below lines captured while running the flume
>> >> >> >> agent:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> >>> flume-ng  agent -n hbase-agent -f
>> >> >> >> >>> /home/hadoop/flumeng/hbaseagent.conf
>> >> >> >>
>> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
>> >> >> >> override.
>> >> >> >>
>> >> >> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
>> >> >> >> HDFS
>> >> >> >> access
>> >> >> >>
>> >> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
>> >> >> >> classpath
>> >> >> >>
>> >> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
>> >> >> >> from
>> >> >> >> classpath
>> >> >> >>
>> >> >> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>> >> >> >>
>> >> >> >> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>> >> >> >> org.apache.flume.node.Application -n hbase-agent -f
>> >> >> >> /home/hadoop/flumeng/hbaseagent.conf
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> >> lifecycle
>> >> >> >> supervisor 1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
>> >> >> >> hbase-agent
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>> >> >> >> Node
>> >> >> >> manager
>> >> >> >> starting
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> >> lifecycle
>> >> >> >> supervisor 10
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> >> Configuration provider starting
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks:
>> >> >> >> sink1
>> >> >> >> Agent:
>> >> >> >> hbase-agent
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
>> >> >> >> flume
>> >> >> >> configuration contains configuration  for agents: [hbase-agent]
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> >> Creating channels
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO
>> >> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> >> created channel ch1
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
>> >> >> >> of
>> >> >> >> sink
>> >> >> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager:
>> >> >> >> Node
>> >> >> >> configuration change:{
>> >> >> >> sourceRunners:{tail=EventDrivenSourceRunner:
>> >> >> >> {
>> >> >> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>> >> >> >> sinkRunners:{sink1=SinkRunner: {
>> >> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
>> >> >> >> counterGroup:{
>> >> >> >> name:null counters:{} } }}
>> >> >> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>> >> >> >>
>> >> >> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
>> >> >> >> with
>> >> >> >> command:tail -f /home/hadoop/demo.txt
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Screen stuck here....no movement.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> Thanks & Regards,
>> >> >> >>
>> >> >> >> Ashutosh Sharma
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> >> >> Sent: Thursday, June 21, 2012 5:01 PM
>> >> >> >> To: flume-user@incubator.apache.org
>> >> >> >> Subject: Re: Hbase-sink behavior
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Hi Ashutosh,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> The sink will not create the table or column family. Make sure
>> >> >> >> you
>> >> >> >> have
>> >> >> >> the table and column family. Also please make sure you have
>> >> >> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they
>> >> >> >> are
>> >> >> >> in
>> >> >> >> your
>> >> >> >> class path).
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Thanks
>> >> >> >>
>> >> >> >> Hari
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >> Hari Shreedharan
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I have used and followed the same steps which is mentioned in
>> >> >> >> below
>> >> >> >> mails
>> >> >> >> to get start with the hbasesink. But agent is not storing any
>> >> >> >> data
>> >> >> >> into
>> >> >> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
>> >> >> >> the
>> >> >> >> hbase
>> >> >> >> information. Even I am able to connect to the hbase server from
>> >> >> >> that
>> >> >> >> agent
>> >> >> >> machine.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Now, I am unable to understand and troubleshoot this problem.
>> >> >> >> Seeking
>> >> >> >> advice from the community members....
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >> Thanks & Regards,
>> >> >> >>
>> >> >> >> Ashutosh Sharma
>> >> >> >>
>> >> >> >> ----------------------------------------
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> -----Original Message-----
>> >> >> >>
>> >> >> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
>> >> >> >>
>> >> >> >> Sent: Friday, June 15, 2012 9:02 AM
>> >> >> >>
>> >> >> >> To: flume-user@incubator.apache.org
>> >> >> >>
>> >> >> >> Subject: Re: Hbase-sink behavior
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Thank you so much Hari for the valuable response..I'll follow the
>> >> >> >> guidelines provided by you.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >>
>> >> >> >> Mohammad Tariq
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>> >> >> >> <hs...@cloudera.com> wrote:
>> >> >> >>
>> >> >> >> Hi Mohammad,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> My answers are inline.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >> Hari Shreedharan
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Hello list,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I am trying to use hbase-sink to collect data from a local file
>> >> >> >> and
>> >> >> >>
>> >> >> >> dump it into an Hbase table..But there are a few things I am not
>> >> >> >> able
>> >> >> >>
>> >> >> >> to understand and need some guidance.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> This is the content of my conf file :
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> hbase-agent.sources = tail
>> >> >> >>
>> >> >> >> hbase-agent..sinks = sink1
>> >> >> >>
>> >> >> >> hbase-agent.channels = ch1
>> >> >> >>
>> >> >> >> hbase-agent.sources.tail.type = exec
>> >> >> >>
>> >> >> >> hbase-agent.sources.tail.command = tail -F
>> >> >> >> /home/mohammad/demo.txt
>> >> >> >>
>> >> >> >> hbase-agent.sources.tail.channels = ch1
>> >> >> >> hbase-agent.sinks.sink1.type
>> >> >> >> =
>> >> >> >>
>> >> >> >> org.apache.flume.sink.hbase.HBaseSink
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.channel = ch1
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.table = test3
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.columnFamily = testing
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.column = foo
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer =
>> >> >> >>
>> >> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>> >> >> >>
>> >> >> >> hbase-agent.channels.ch1.type=memory
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Right now I am taking just some simple text from a file which has
>> >> >> >>
>> >> >> >> following content -
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> value1
>> >> >> >>
>> >> >> >> value2
>> >> >> >>
>> >> >> >> value3
>> >> >> >>
>> >> >> >> value4
>> >> >> >>
>> >> >> >> value5
>> >> >> >>
>> >> >> >> value6
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> And my Hbase table looks like -
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> hbase(main):217:0> scan 'test3'
>> >> >> >>
>> >> >> >> ROW COLUMN+CELL
>> >> >> >>
>> >> >> >> 11339716704561 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716707569, value=value1
>> >> >> >>
>> >> >> >> 11339716704562 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716707571, value=value4
>> >> >> >>
>> >> >> >> 11339716846594 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849608, value=value2
>> >> >> >>
>> >> >> >> 11339716846595 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849610, value=value1
>> >> >> >>
>> >> >> >> 11339716846596 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849611, value=value6
>> >> >> >>
>> >> >> >> 11339716846597 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849614, value=value6
>> >> >> >>
>> >> >> >> 11339716846598 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849615, value=value5
>> >> >> >>
>> >> >> >> 11339716846599 column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849615, value=value6
>> >> >> >>
>> >> >> >> incRow column=testing:col1,
>> >> >> >>
>> >> >> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>> >> >> >>
>> >> >> >> 9 row(s) in 0.0580 seconds
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Now I have following questions -
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 1- Why the timestamp value is different from the row key?(I was
>> >> >> >> trying
>> >> >> >>
>> >> >> >> to make "1+timestamp" as the rowkey)
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> The value shown by hbase shell as timestamp is the time at which
>> >> >> >> the
>> >> >> >>
>> >> >> >> value was inserted into Hbase, while the value inserted by Flume
>> >> >> >> is
>> >> >> >>
>> >> >> >> the timestamp at which the sink read the event from the channel.
>> >> >> >>
>> >> >> >> Depending on how long the network and HBase takes, these
>> >> >> >> timestamps
>> >> >> >>
>> >> >> >> can vary. If you want 1+timestamp as row key then you should
>> >> >> >> configure
>> >> >> >> it:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>> >> >> >>
>> >> >> >> appended as-is to the suffix you choose.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 2- Although I am not using "incRow", it stills appear in the
>> >> >> >> table
>> >> >> >>
>> >> >> >> with some value. Why so and what is this value??
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> The SimpleHBaseEventSerializer is only an example class. For
>> >> >> >> custom
>> >> >> >>
>> >> >> >> use cases you can write your own serializer by implementing
>> >> >> >>
>> >> >> >> HbaseEventSerializer. In this case, you have specified
>> >> >> >>
>> >> >> >> incrementColumn, which causes an increment on the column
>> >> >> >> specified.
>> >> >> >>
>> >> >> >> Simply don't specify that config and that row will not appear.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 3- How can avoid the last row??
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> See above.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> I am still in the learning phase so please pardon my
>> >> >> >> ignorance..Many
>> >> >> >> thanks.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> No problem. Much of this is documented
>> >> >> >>
>> >> >> >> here:
>> >> >> >>
>> >> >> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >>
>> >> >> >> Mohammad Tariq
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>> >> >> >> 없이, 본
>> >> >> >> 문서에
>> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>> >> >> >> 본
>> >> >> >> 메일이
>> >> >> >> 잘못
>> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> >>
>> >> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> >> material. This email is intended for the use of the addressee
>> >> >> >> only.
>> >> >> >> If
>> >> >> >> you
>> >> >> >> receive this email by mistake, please either delete it without
>> >> >> >> reproducing,
>> >> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> >> immediately.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>> >> >> >> 없이, 본
>> >> >> >> 문서에
>> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>> >> >> >> 본
>> >> >> >> 메일이
>> >> >> >> 잘못
>> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> >> material. This email is intended for the use of the addressee
>> >> >> >> only.
>> >> >> >> If
>> >> >> >> you
>> >> >> >> receive this email by mistake, please either delete it without
>> >> >> >> reproducing,
>> >> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> >> immediately.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>> >> >> >> 없이, 본
>> >> >> >> 문서에
>> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>> >> >> >> 본
>> >> >> >> 메일이
>> >> >> >> 잘못
>> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> >> material. This email is intended for the use of the addressee
>> >> >> >> only.
>> >> >> >> If
>> >> >> >> you
>> >> >> >> receive this email by mistake, please either delete it without
>> >> >> >> reproducing,
>> >> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> >> immediately.
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
>> >> >> >> 없이, 본
>> >> >> >> 문서에
>> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
>> >> >> >> 본
>> >> >> >> 메일이
>> >> >> >> 잘못
>> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> >> material. This email is intended for the use of the addressee
>> >> >> >> only.
>> >> >> >> If
>> >> >> >> you
>> >> >> >> receive this email by mistake, please either delete it without
>> >> >> >> reproducing,
>> >> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> >> immediately.
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>> >
>> >
>> > --
>> > Regards,
>> > Rahul Patodi
>> >
>> >
>
>
>
>
> --
> Regards,
> Rahul Patodi
>
>

Re: Hbase-sink behavior

Posted by Rahul Patodi <pa...@gmail.com>.
If you look at the output provided by you in the first mail of this mail
thread:
in your file (on local file system) you have value 1 to 6 (value1, value2,
value3....)
but when you scan in hbase output is value1, value4 , value2 , value1 ,
value6 , value6 , value5 , value6

value3 is not inserted
value 6 is inserted 3 times

did you figure out why so ?


On Thu, Jun 21, 2012 at 6:24 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Both the commands seem similar to me.
>
> Regards,
>    Mohammad Tariq
>
>
> On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
> <pa...@gmail.com> wrote:
> > Hi Mohammad,
> > Thanks for your response
> > I have put this configuration:
> >
> > hbase-agent.sources=tail
> > hbase-agent.sinks=sink1
> > hbase-agent.channels=ch1
> >
> > hbase-agent.sources.tail.type=exec
> > hbase-agent.sources.tail.command=tail -F /tmp/test05
> > hbase-agent.sources.tail.channels=ch1
> >
> > hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
> > hbase-agent.sinks.sink1.channel=ch1
> > hbase-agent.sinks.sink1.table=t002
> > hbase-agent.sinks.sink1.columnFamily=cf
> > hbase-agent.sinks.sink1.column=foo
> >
> hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > hbase-agent.sinks.sink1.serializer.payloadColumn=col1
> > hbase-agent.sinks.sink1.serializer.incrementColumn=col1
> > #hbase-agent.sinks.sink1.serializer.keyType=timestamp
> > hbase-agent.sinks.sink1.serializer.rowPrefix=1+
> > hbase-agent.sinks.sink1.serializer.suffix=timestamp
> >
> > hbase-agent.channels.ch1.type=memory
> >
> >
> > Data is getting copy into HBase, but I have got another issue:
> >
> > My input data is simply:
> >
> > value1
> > value2
> > value3
> > value4
> > value5
> > value6
> > value7
> > value8
> > value9
> >
> > when I run this command in HBase:
> > hbase(main):129:0> scan 't002', {VERSIONS => 3}
> > ROW                              COLUMN+CELL
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758424,
> > value=value5
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758423,
> > value=value3
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758417,
> > value=value1
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758427,
> > value=value9
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758426,
> > value=value8
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758425,
> > value=value7
> >  incRow                          column=cf:col1, timestamp=1340279758443,
> > value=\x00\x00\x00\x00\x00\x00\x00\x09
> > 3 row(s) in 0.0420 seconds
> >
> > all the data is not getting copy ??
> >
> > When I run this command with version:
> > hbase(main):130:0> scan 't002', {VERSIONS => 3}
> > ROW                              COLUMN+CELL
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758424,
> > value=value5
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758423,
> > value=value3
> >  1+1340279755410                 column=cf:col1, timestamp=1340279758417,
> > value=value1
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758427,
> > value=value9
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758426,
> > value=value8
> >  1+1340279755411                 column=cf:col1, timestamp=1340279758425,
> > value=value7
> >  1+1340279906637                 column=cf:col1, timestamp=1340279909652,
> > value=value1
> >  1+1340279906638                 column=cf:col1, timestamp=1340279909659,
> > value=value6
> >  1+1340279906638                 column=cf:col1, timestamp=1340279909658,
> > value=value5
> >  1+1340279906638                 column=cf:col1, timestamp=1340279909654,
> > value=value3
> >  1+1340279906646                 column=cf:col1, timestamp=1340279909659,
> > value=value7
> >  1+1340279906647                 column=cf:col1, timestamp=1340279909659,
> > value=value9
> >  incRow                          column=cf:col1, timestamp=1340279909677,
> > value=\x00\x00\x00\x00\x00\x00\x00\x12
> > 7 row(s) in 0.0640 seconds
> >
> > Please help me understand this.
> >
> >
> >
> >
> > On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com>
> wrote:
> >>
> >> Hi Will,
> >>
> >>          I got it.Thanks for the info.
> >>
> >> Regards,
> >>    Mohammad Tariq
> >>
> >>
> >> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com>
> wrote:
> >> > Hi Mohammad,
> >> >
> >> > In your config file, I think you need to remove this line:
> >> >
> >> >>>hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >> >
> >> > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
> >> > (although there is a keyType var that stores the value of the 'suffix'
> >> > prop).
> >> >
> >> > Cheers,
> >> > Will
> >> >
> >> >
> >> > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi Rahul,
> >> >>
> >> >>          This normally happens when there is some problem in the
> >> >> configuration file.Create a file called hbase-agent inside your
> >> >> FLUME_HOME/conf directory and copy this content into it:
> >> >> hbase-agent.sources = tail
> >> >> hbase-agent.sinks = sink1
> >> >> hbase-agent.channels = ch1
> >> >>
> >> >> hbase-agent.sources.tail.type = exec
> >> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >> >> hbase-agent.sources.tail.channels = ch1
> >> >>
> >> >> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> >> >> hbase-agent.sinks.sink1.channel = ch1
> >> >> hbase-agent.sinks.sink1.table = demo
> >> >> hbase-agent.sinks.sink1.columnFamily = cf
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer =
> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >> >>
> >> >> hbase-agent.channels.ch1.type=memory
> >> >>
> >> >> Then start the agent and see if it works for you. It worked for me.
> >> >>
> >> >> Regards,
> >> >>    Mohammad Tariq
> >> >>
> >> >>
> >> >> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
> >> >> wrote:
> >> >> > Hi Sharma,
> >> >> >
> >> >> > So I assume that your command looks something like this:
> >> >> >      flume-ng agent -n hbase-agent -f
> >> >> > /home/hadoop/flumeng/hbaseagent.conf
> >> >> > -c /etc/flume-ng/conf
> >> >> >
> >> >> > ...?
> >> >> >
> >> >> > Hari, I saw your comment:
> >> >> >
> >> >> >>>I am not sure if HBase changed their wire protocol between these
> >> >> >>> versions.
> >> >> > Do you have any other advice about troubleshooting a possible hbase
> >> >> > protocol
> >> >> > mismatch issue?
> >> >> >
> >> >> > Cheers,
> >> >> > Will
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
> >> >> > <sh...@kt.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi Will,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I installed flume as part of CDH3u4 version 1.1 using yum install
> >> >> >> flume-ng. One more point, I am using flume-ng hbase sink
> downloaded
> >> >> >> from:
> >> >> >>
> >> >> >>
> >> >> >>
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Now, I ran the agent with -conf parameter with updated
> >> >> >> log4j.properties. I
> >> >> >> don't see any error in the log. Please see the below from the log
> >> >> >> file:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor:
> Starting
> >> >> >> lifecycle supervisor 1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
> >> >> >> hbase-agent
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,146 INFO
> nodemanager.DefaultLogicalNodeManager:
> >> >> >> Node
> >> >> >> manager starting
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor:
> Starting
> >> >> >> lifecycle supervisor 9
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,146 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider: Configuration
> >> >> >> provider
> >> >> >> starting
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,148 DEBUG
> nodemanager.DefaultLogicalNodeManager:
> >> >> >> Node
> >> >> >> manager started
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,148 DEBUG
> >> >> >> properties.PropertiesFileConfigurationProvider: Configuration
> >> >> >> provider
> >> >> >> started
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,149 DEBUG
> >> >> >> properties.PropertiesFileConfigurationProvider: Checking
> >> >> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,149 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider: Reloading
> >> >> >> configuration
> >> >> >> file:/home/hadoop/flumeng/hbaseagent.conf
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks:
> >> >> >> sink1
> >> >> >> Agent: hbase-agent
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
> >> >> >> context
> >> >> >> for
> >> >> >> sink1: serializer.rowPrefix
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> >> >> >> Processing:sink1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
> >> >> >> validation
> >> >> >> of configuration for agent: hbase-agent, initial-configuration:
> >> >> >> AgentConfiguration[hbase-agent]
> >> >> >>
> >> >> >> SOURCES: {tail={ parameters:{command=tail -f
> /home/hadoop/demo.txt,
> >> >> >> channels=ch1, type=exec} }}
> >> >> >>
> >> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >> >> >>
> >> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> >> >> serializer.keyType=timestamp,
> >> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> >> >> serializer.incrementColumn=col1, column=foo,
> serializer.rowPrefix=1,
> >> >> >> batchSize=1, columnFamily=cf1, table=test,
> >> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> >> >> serializer.suffix=timestamp} }}
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
> >> >> >> channel
> >> >> >> ch1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
> >> >> >> sink:
> >> >> >> sink1 using OTHER
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
> >> >> >> validation
> >> >> >> configuration for hbase-agent
> >> >> >>
> >> >> >> AgentConfiguration created without Configuration stubs for which
> >> >> >> only
> >> >> >> basic syntactical validation was performed[hbase-agent]
> >> >> >>
> >> >> >> SOURCES: {tail={ parameters:{command=tail -f
> /home/hadoop/demo.txt,
> >> >> >> channels=ch1, type=exec} }}
> >> >> >>
> >> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >> >> >>
> >> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> >> >> serializer.keyType=timestamp,
> >> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> >> >> serializer.incrementColumn=col1, column=foo,
> serializer.rowPrefix=1,
> >> >> >> batchSize=1, columnFamily=cf1, table=test,
> >> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> >> >> serializer.suffix=timestamp} }}
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration:
> Channels:ch1
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources
> tail
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
> >> >> >> Post-validation
> >> >> >> flume configuration contains configuration  for agents:
> >> >> >> [hbase-agent]
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider: Creating channels
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
> >> >> >> Creating
> >> >> >> instance of channel ch1 type memory
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,175 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider: created channel
> ch1
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory:
> Creating
> >> >> >> instance of source tail, type exec
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
> >> >> >> instance
> >> >> >> of
> >> >> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> >> >> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,298 INFO
> nodemanager.DefaultLogicalNodeManager:
> >> >> >> Node
> >> >> >> configuration change:{
> sourceRunners:{tail=EventDrivenSourceRunner:
> >> >> >> {
> >> >> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
> >> >> >> sinkRunners:{sink1=SinkRunner: {
> >> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
> >> >> >> counterGroup:{
> >> >> >> name:null counters:{} } }}
> >> >> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source
> starting
> >> >> >> with
> >> >> >> command:tail -f /home/hadoop/demo.txt
> >> >> >>
> >> >> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source
> started
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Output of the which Flume-ng is:
> >> >> >>
> >> >> >> /usr/bin/flume-ng
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Ashutosh Sharma
> >> >> >>
> >> >> >> Cell: 010-7300-0150
> >> >> >>
> >> >> >> Email: sharma.ashutosh@kt.com
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> From: Will McQueen [mailto:will@cloudera.com]
> >> >> >> Sent: Thursday, June 21, 2012 6:07 PM
> >> >> >>
> >> >> >>
> >> >> >> To: flume-user@incubator.apache.org
> >> >> >> Subject: Re: Hbase-sink behavior
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Hi Sharma,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Could you please describe how you installed flume? Also, I see
> >> >> >> you're
> >> >> >> getting this warning:
> >> >> >>
> >> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
> >> >> >> >> override.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> The log4j.properties that flume provides is stored in the conf
> dir.
> >> >> >> If
> >> >> >> you
> >> >> >> specify the flume conf dir, flume can pick it up. So for
> >> >> >> troubleshooting you
> >> >> >> can try:
> >> >> >>
> >> >> >>
> >> >> >> 1) modifying the log4j.properties within flume's conf dir so that
> >> >> >> the
> >> >> >> top
> >> >> >> reads:
> >> >> >> #flume.root.logger=DEBUG,console
> >> >> >> flume.root.logger=DEBUG,LOGFILE
> >> >> >> flume.log.dir=.
> >> >> >> flume.log.file=flume.log
> >> >> >>
> >> >> >> 2) Run the flume agent while specifying the flume conf dir (--conf
> >> >> >> <dir>)
> >> >> >>
> >> >> >> 3) What's the output of 'which flume-ng'?
> >> >> >>
> >> >> >> Cheers,
> >> >> >> Will
> >> >> >>
> >> >> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
> >> >> >> <sh...@kt.com> wrote:
> >> >> >>
> >> >> >> Hi Hari,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I checked, agent is successfully tailing the file which I
> mentioned.
> >> >> >> Yes,
> >> >> >> you are right, agent has started properly without any error.
> Because
> >> >> >> there
> >> >> >> is no further movement, so it's hard for me to identify the
> issue. I
> >> >> >> also
> >> >> >> used tail -F also, but no success.
> >> >> >>
> >> >> >> Can you suggest me some technique to troubleshoot it, so I could
> >> >> >> identify
> >> >> >> the issue and resolve the same. Does flume record some log
> anywhere?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Ashutosh Sharma
> >> >> >>
> >> >> >> Cell: 010-7300-0150
> >> >> >>
> >> >> >> Email: sharma.ashutosh@kt.com
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> >> >> Sent: Thursday, June 21, 2012 5:25 PM
> >> >> >>
> >> >> >>
> >> >> >> To: flume-user@incubator.apache.org
> >> >> >> Subject: Re: Hbase-sink behavior
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I am not sure if HBase changed their wire protocol between these
> >> >> >> versions.
> >> >> >> Looks like your agent has started properly. Are you sure data is
> >> >> >> being
> >> >> >> written into the file being tailed? I suggest using tail -F. The
> log
> >> >> >> being
> >> >> >> stuck here is ok, that is probably because nothing specific is
> >> >> >> required(or
> >> >> >> your log file rotated).
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thanks
> >> >> >>
> >> >> >> Hari
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >>
> >> >> >> Hari Shreedharan
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >> >> >>
> >> >> >> Hi Hari,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thanks for your prompt reply. I already created the table in Hbase
> >> >> >> with
> >> >> >> a
> >> >> >> column family and hadoop/hbase library is available to hadoop. I
> >> >> >> noticed
> >> >> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
> >> >> >>
> >> >> >> Please see the below lines captured while running the flume agent:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> >>> flume-ng  agent -n hbase-agent -f
> >> >> >> >>> /home/hadoop/flumeng/hbaseagent.conf
> >> >> >>
> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
> >> >> >> override.
> >> >> >>
> >> >> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
> >> >> >> HDFS
> >> >> >> access
> >> >> >>
> >> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
> >> >> >> classpath
> >> >> >>
> >> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
> >> >> >> from
> >> >> >> classpath
> >> >> >>
> >> >> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> >> >> >>
> >> >> >>
> >> >> >>
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> >> >> >>
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >> >> >> org.apache.flume.node.Application -n hbase-agent -f
> >> >> >> /home/hadoop/flumeng/hbaseagent.conf
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> >> lifecycle
> >> >> >> supervisor 1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
> >> >> >> hbase-agent
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> >> >> manager
> >> >> >> starting
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> >> lifecycle
> >> >> >> supervisor 10
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider:
> >> >> >> Configuration provider starting
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider:
> >> >> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1
> >> >> >> Agent:
> >> >> >> hbase-agent
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
> >> >> >> flume
> >> >> >> configuration contains configuration  for agents: [hbase-agent]
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider:
> >> >> >> Creating channels
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO
> >> >> >> properties.PropertiesFileConfigurationProvider:
> >> >> >> created channel ch1
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance
> of
> >> >> >> sink
> >> >> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> >> >> configuration change:{
> sourceRunners:{tail=EventDrivenSourceRunner:
> >> >> >> {
> >> >> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
> >> >> >> sinkRunners:{sink1=SinkRunner: {
> >> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
> >> >> >> counterGroup:{
> >> >> >> name:null counters:{} } }}
> >> >> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
> >> >> >>
> >> >> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting
> with
> >> >> >> command:tail -f /home/hadoop/demo.txt
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Screen stuck here....no movement.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Ashutosh Sharma
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> >> >> Sent: Thursday, June 21, 2012 5:01 PM
> >> >> >> To: flume-user@incubator.apache.org
> >> >> >> Subject: Re: Hbase-sink behavior
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Hi Ashutosh,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> The sink will not create the table or column family. Make sure you
> >> >> >> have
> >> >> >> the table and column family. Also please make sure you have
> >> >> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are
> >> >> >> in
> >> >> >> your
> >> >> >> class path).
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thanks
> >> >> >>
> >> >> >> Hari
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >>
> >> >> >> Hari Shreedharan
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I have used and followed the same steps which is mentioned in
> below
> >> >> >> mails
> >> >> >> to get start with the hbasesink. But agent is not storing any data
> >> >> >> into
> >> >> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick
> the
> >> >> >> hbase
> >> >> >> information. Even I am able to connect to the hbase server from
> that
> >> >> >> agent
> >> >> >> machine.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Now, I am unable to understand and troubleshoot this problem.
> >> >> >> Seeking
> >> >> >> advice from the community members....
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Ashutosh Sharma
> >> >> >>
> >> >> >> ----------------------------------------
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> -----Original Message-----
> >> >> >>
> >> >> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >> >> >>
> >> >> >> Sent: Friday, June 15, 2012 9:02 AM
> >> >> >>
> >> >> >> To: flume-user@incubator.apache.org
> >> >> >>
> >> >> >> Subject: Re: Hbase-sink behavior
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thank you so much Hari for the valuable response..I'll follow the
> >> >> >> guidelines provided by you.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Regards,
> >> >> >>
> >> >> >> Mohammad Tariq
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> >> >> >> <hs...@cloudera.com> wrote:
> >> >> >>
> >> >> >> Hi Mohammad,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> My answers are inline.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >>
> >> >> >> Hari Shreedharan
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Hello list,
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I am trying to use hbase-sink to collect data from a local file
> and
> >> >> >>
> >> >> >> dump it into an Hbase table..But there are a few things I am not
> >> >> >> able
> >> >> >>
> >> >> >> to understand and need some guidance.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> This is the content of my conf file :
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> hbase-agent.sources = tail
> >> >> >>
> >> >> >> hbase-agent..sinks = sink1
> >> >> >>
> >> >> >> hbase-agent.channels = ch1
> >> >> >>
> >> >> >> hbase-agent.sources.tail.type = exec
> >> >> >>
> >> >> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >> >> >>
> >> >> >> hbase-agent.sources.tail.channels = ch1
> hbase-agent.sinks.sink1.type
> >> >> >> =
> >> >> >>
> >> >> >> org.apache.flume.sink.hbase.HBaseSink
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.channel = ch1
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.table = test3
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.columnFamily = testing
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.column = foo
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer =
> >> >> >>
> >> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >> >> >>
> >> >> >> hbase-agent.channels.ch1.type=memory
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Right now I am taking just some simple text from a file which has
> >> >> >>
> >> >> >> following content -
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> value1
> >> >> >>
> >> >> >> value2
> >> >> >>
> >> >> >> value3
> >> >> >>
> >> >> >> value4
> >> >> >>
> >> >> >> value5
> >> >> >>
> >> >> >> value6
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> And my Hbase table looks like -
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> hbase(main):217:0> scan 'test3'
> >> >> >>
> >> >> >> ROW COLUMN+CELL
> >> >> >>
> >> >> >> 11339716704561 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716707569, value=value1
> >> >> >>
> >> >> >> 11339716704562 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716707571, value=value4
> >> >> >>
> >> >> >> 11339716846594 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849608, value=value2
> >> >> >>
> >> >> >> 11339716846595 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849610, value=value1
> >> >> >>
> >> >> >> 11339716846596 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849611, value=value6
> >> >> >>
> >> >> >> 11339716846597 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849614, value=value6
> >> >> >>
> >> >> >> 11339716846598 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849615, value=value5
> >> >> >>
> >> >> >> 11339716846599 column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849615, value=value6
> >> >> >>
> >> >> >> incRow column=testing:col1,
> >> >> >>
> >> >> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> >> >> >>
> >> >> >> 9 row(s) in 0.0580 seconds
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Now I have following questions -
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 1- Why the timestamp value is different from the row key?(I was
> >> >> >> trying
> >> >> >>
> >> >> >> to make "1+timestamp" as the rowkey)
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> The value shown by hbase shell as timestamp is the time at which
> the
> >> >> >>
> >> >> >> value was inserted into Hbase, while the value inserted by Flume
> is
> >> >> >>
> >> >> >> the timestamp at which the sink read the event from the channel.
> >> >> >>
> >> >> >> Depending on how long the network and HBase takes, these
> timestamps
> >> >> >>
> >> >> >> can vary. If you want 1+timestamp as row key then you should
> >> >> >> configure
> >> >> >> it:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> >> >> >>
> >> >> >> appended as-is to the suffix you choose.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2- Although I am not using "incRow", it stills appear in the table
> >> >> >>
> >> >> >> with some value. Why so and what is this value??
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> The SimpleHBaseEventSerializer is only an example class. For
> custom
> >> >> >>
> >> >> >> use cases you can write your own serializer by implementing
> >> >> >>
> >> >> >> HbaseEventSerializer. In this case, you have specified
> >> >> >>
> >> >> >> incrementColumn, which causes an increment on the column
> specified.
> >> >> >>
> >> >> >> Simply don't specify that config and that row will not appear.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 3- How can avoid the last row??
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> See above.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I am still in the learning phase so please pardon my
> ignorance..Many
> >> >> >> thanks.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> No problem. Much of this is documented
> >> >> >>
> >> >> >> here:
> >> >> >>
> >> >> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Regards,
> >> >> >>
> >> >> >> Mohammad Tariq
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> 없이, 본
> >> >> >> 문서에
> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> 본
> >> >> >> 메일이
> >> >> >> 잘못
> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> >>
> >> >> >> This E-mail may contain confidential information and/or copyright
> >> >> >> material. This email is intended for the use of the addressee
> only.
> >> >> >> If
> >> >> >> you
> >> >> >> receive this email by mistake, please either delete it without
> >> >> >> reproducing,
> >> >> >> distributing or retaining copies thereof or notify the sender
> >> >> >> immediately.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> 없이, 본
> >> >> >> 문서에
> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> 본
> >> >> >> 메일이
> >> >> >> 잘못
> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> >> This E-mail may contain confidential information and/or copyright
> >> >> >> material. This email is intended for the use of the addressee
> only.
> >> >> >> If
> >> >> >> you
> >> >> >> receive this email by mistake, please either delete it without
> >> >> >> reproducing,
> >> >> >> distributing or retaining copies thereof or notify the sender
> >> >> >> immediately.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> 없이, 본
> >> >> >> 문서에
> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> 본
> >> >> >> 메일이
> >> >> >> 잘못
> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> >> This E-mail may contain confidential information and/or copyright
> >> >> >> material. This email is intended for the use of the addressee
> only.
> >> >> >> If
> >> >> >> you
> >> >> >> receive this email by mistake, please either delete it without
> >> >> >> reproducing,
> >> >> >> distributing or retaining copies thereof or notify the sender
> >> >> >> immediately.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한
> 없이, 본
> >> >> >> 문서에
> >> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약,
> 본
> >> >> >> 메일이
> >> >> >> 잘못
> >> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> >> This E-mail may contain confidential information and/or copyright
> >> >> >> material. This email is intended for the use of the addressee
> only.
> >> >> >> If
> >> >> >> you
> >> >> >> receive this email by mistake, please either delete it without
> >> >> >> reproducing,
> >> >> >> distributing or retaining copies thereof or notify the sender
> >> >> >> immediately.
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
> >
> >
> > --
> > Regards,
> > Rahul Patodi
> >
> >
>



-- 
*Regards*,
Rahul Patodi

Re: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Both the commands seem similar to me.

Regards,
    Mohammad Tariq


On Thu, Jun 21, 2012 at 5:43 PM, Rahul Patodi
<pa...@gmail.com> wrote:
> Hi Mohammad,
> Thanks for your response
> I have put this configuration:
>
> hbase-agent.sources=tail
> hbase-agent.sinks=sink1
> hbase-agent.channels=ch1
>
> hbase-agent.sources.tail.type=exec
> hbase-agent.sources.tail.command=tail -F /tmp/test05
> hbase-agent.sources.tail.channels=ch1
>
> hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel=ch1
> hbase-agent.sinks.sink1.table=t002
> hbase-agent.sinks.sink1.columnFamily=cf
> hbase-agent.sinks.sink1.column=foo
> hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn=col1
> hbase-agent.sinks.sink1.serializer.incrementColumn=col1
> #hbase-agent.sinks.sink1.serializer.keyType=timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix=1+
> hbase-agent.sinks.sink1.serializer.suffix=timestamp
>
> hbase-agent.channels.ch1.type=memory
>
>
> Data is getting copy into HBase, but I have got another issue:
>
> My input data is simply:
>
> value1
> value2
> value3
> value4
> value5
> value6
> value7
> value8
> value9
>
> when I run this command in HBase:
> hbase(main):129:0> scan 't002', {VERSIONS => 3}
> ROW                              COLUMN+CELL
>  1+1340279755410                 column=cf:col1, timestamp=1340279758424,
> value=value5
>  1+1340279755410                 column=cf:col1, timestamp=1340279758423,
> value=value3
>  1+1340279755410                 column=cf:col1, timestamp=1340279758417,
> value=value1
>  1+1340279755411                 column=cf:col1, timestamp=1340279758427,
> value=value9
>  1+1340279755411                 column=cf:col1, timestamp=1340279758426,
> value=value8
>  1+1340279755411                 column=cf:col1, timestamp=1340279758425,
> value=value7
>  incRow                          column=cf:col1, timestamp=1340279758443,
> value=\x00\x00\x00\x00\x00\x00\x00\x09
> 3 row(s) in 0.0420 seconds
>
> all the data is not getting copy ??
>
> When I run this command with version:
> hbase(main):130:0> scan 't002', {VERSIONS => 3}
> ROW                              COLUMN+CELL
>  1+1340279755410                 column=cf:col1, timestamp=1340279758424,
> value=value5
>  1+1340279755410                 column=cf:col1, timestamp=1340279758423,
> value=value3
>  1+1340279755410                 column=cf:col1, timestamp=1340279758417,
> value=value1
>  1+1340279755411                 column=cf:col1, timestamp=1340279758427,
> value=value9
>  1+1340279755411                 column=cf:col1, timestamp=1340279758426,
> value=value8
>  1+1340279755411                 column=cf:col1, timestamp=1340279758425,
> value=value7
>  1+1340279906637                 column=cf:col1, timestamp=1340279909652,
> value=value1
>  1+1340279906638                 column=cf:col1, timestamp=1340279909659,
> value=value6
>  1+1340279906638                 column=cf:col1, timestamp=1340279909658,
> value=value5
>  1+1340279906638                 column=cf:col1, timestamp=1340279909654,
> value=value3
>  1+1340279906646                 column=cf:col1, timestamp=1340279909659,
> value=value7
>  1+1340279906647                 column=cf:col1, timestamp=1340279909659,
> value=value9
>  incRow                          column=cf:col1, timestamp=1340279909677,
> value=\x00\x00\x00\x00\x00\x00\x00\x12
> 7 row(s) in 0.0640 seconds
>
> Please help me understand this.
>
>
>
>
> On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hi Will,
>>
>>          I got it.Thanks for the info.
>>
>> Regards,
>>    Mohammad Tariq
>>
>>
>> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com> wrote:
>> > Hi Mohammad,
>> >
>> > In your config file, I think you need to remove this line:
>> >
>> >>>hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >
>> > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
>> > (although there is a keyType var that stores the value of the 'suffix'
>> > prop).
>> >
>> > Cheers,
>> > Will
>> >
>> >
>> > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
>> > wrote:
>> >>
>> >> Hi Rahul,
>> >>
>> >>          This normally happens when there is some problem in the
>> >> configuration file.Create a file called hbase-agent inside your
>> >> FLUME_HOME/conf directory and copy this content into it:
>> >> hbase-agent.sources = tail
>> >> hbase-agent.sinks = sink1
>> >> hbase-agent.channels = ch1
>> >>
>> >> hbase-agent.sources.tail.type = exec
>> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>> >> hbase-agent.sources.tail.channels = ch1
>> >>
>> >> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
>> >> hbase-agent.sinks.sink1.channel = ch1
>> >> hbase-agent.sinks.sink1.table = demo
>> >> hbase-agent.sinks.sink1.columnFamily = cf
>> >>
>> >> hbase-agent.sinks.sink1.serializer =
>> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>> >>
>> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>> >>
>> >> hbase-agent.channels.ch1.type=memory
>> >>
>> >> Then start the agent and see if it works for you. It worked for me.
>> >>
>> >> Regards,
>> >>    Mohammad Tariq
>> >>
>> >>
>> >> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
>> >> wrote:
>> >> > Hi Sharma,
>> >> >
>> >> > So I assume that your command looks something like this:
>> >> >      flume-ng agent -n hbase-agent -f
>> >> > /home/hadoop/flumeng/hbaseagent.conf
>> >> > -c /etc/flume-ng/conf
>> >> >
>> >> > ...?
>> >> >
>> >> > Hari, I saw your comment:
>> >> >
>> >> >>>I am not sure if HBase changed their wire protocol between these
>> >> >>> versions.
>> >> > Do you have any other advice about troubleshooting a possible hbase
>> >> > protocol
>> >> > mismatch issue?
>> >> >
>> >> > Cheers,
>> >> > Will
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
>> >> > <sh...@kt.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi Will,
>> >> >>
>> >> >>
>> >> >>
>> >> >> I installed flume as part of CDH3u4 version 1.1 using yum install
>> >> >> flume-ng. One more point, I am using flume-ng hbase sink downloaded
>> >> >> from:
>> >> >>
>> >> >>
>> >> >> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>> >> >>
>> >> >>
>> >> >>
>> >> >> Now, I ran the agent with -conf parameter with updated
>> >> >> log4j.properties. I
>> >> >> don't see any error in the log. Please see the below from the log
>> >> >> file:
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> lifecycle supervisor 1
>> >> >>
>> >> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
>> >> >> hbase-agent
>> >> >>
>> >> >> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager:
>> >> >> Node
>> >> >> manager starting
>> >> >>
>> >> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> lifecycle supervisor 9
>> >> >>
>> >> >> 2012-06-21 18:25:08,146 INFO
>> >> >> properties.PropertiesFileConfigurationProvider: Configuration
>> >> >> provider
>> >> >> starting
>> >> >>
>> >> >> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager:
>> >> >> Node
>> >> >> manager started
>> >> >>
>> >> >> 2012-06-21 18:25:08,148 DEBUG
>> >> >> properties.PropertiesFileConfigurationProvider: Configuration
>> >> >> provider
>> >> >> started
>> >> >>
>> >> >> 2012-06-21 18:25:08,149 DEBUG
>> >> >> properties.PropertiesFileConfigurationProvider: Checking
>> >> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>> >> >>
>> >> >> 2012-06-21 18:25:08,149 INFO
>> >> >> properties.PropertiesFileConfigurationProvider: Reloading
>> >> >> configuration
>> >> >> file:/home/hadoop/flumeng/hbaseagent.conf
>> >> >>
>> >> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks:
>> >> >> sink1
>> >> >> Agent: hbase-agent
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
>> >> >> context
>> >> >> for
>> >> >> sink1: serializer.rowPrefix
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
>> >> >> Processing:sink1
>> >> >>
>> >> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
>> >> >> validation
>> >> >> of configuration for agent: hbase-agent, initial-configuration:
>> >> >> AgentConfiguration[hbase-agent]
>> >> >>
>> >> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> >> >> channels=ch1, type=exec} }}
>> >> >>
>> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >> >>
>> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> >> serializer.keyType=timestamp,
>> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> >> >> batchSize=1, columnFamily=cf1, table=test,
>> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> >> serializer.suffix=timestamp} }}
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
>> >> >> channel
>> >> >> ch1
>> >> >>
>> >> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating
>> >> >> sink:
>> >> >> sink1 using OTHER
>> >> >>
>> >> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
>> >> >> validation
>> >> >> configuration for hbase-agent
>> >> >>
>> >> >> AgentConfiguration created without Configuration stubs for which
>> >> >> only
>> >> >> basic syntactical validation was performed[hbase-agent]
>> >> >>
>> >> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> >> >> channels=ch1, type=exec} }}
>> >> >>
>> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >> >>
>> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> >> serializer.keyType=timestamp,
>> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> >> >> batchSize=1, columnFamily=cf1, table=test,
>> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> >> serializer.suffix=timestamp} }}
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration:
>> >> >> Post-validation
>> >> >> flume configuration contains configuration  for agents:
>> >> >> [hbase-agent]
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 INFO
>> >> >> properties.PropertiesFileConfigurationProvider: Creating channels
>> >> >>
>> >> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory:
>> >> >> Creating
>> >> >> instance of channel ch1 type memory
>> >> >>
>> >> >> 2012-06-21 18:25:08,175 INFO
>> >> >> properties.PropertiesFileConfigurationProvider: created channel ch1
>> >> >>
>> >> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
>> >> >> instance of source tail, type exec
>> >> >>
>> >> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
>> >> >> instance
>> >> >> of
>> >> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >> >>
>> >> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>> >> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
>> >> >>
>> >> >> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager:
>> >> >> Node
>> >> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner:
>> >> >> {
>> >> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>> >> >> sinkRunners:{sink1=SinkRunner: {
>> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
>> >> >> counterGroup:{
>> >> >> name:null counters:{} } }}
>> >> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>> >> >>
>> >> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting
>> >> >> with
>> >> >> command:tail -f /home/hadoop/demo.txt
>> >> >>
>> >> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
>> >> >>
>> >> >>
>> >> >>
>> >> >> Output of the which Flume-ng is:
>> >> >>
>> >> >> /usr/bin/flume-ng
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Ashutosh Sharma
>> >> >>
>> >> >> Cell: 010-7300-0150
>> >> >>
>> >> >> Email: sharma.ashutosh@kt.com
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >>
>> >> >>
>> >> >> From: Will McQueen [mailto:will@cloudera.com]
>> >> >> Sent: Thursday, June 21, 2012 6:07 PM
>> >> >>
>> >> >>
>> >> >> To: flume-user@incubator.apache.org
>> >> >> Subject: Re: Hbase-sink behavior
>> >> >>
>> >> >>
>> >> >>
>> >> >> Hi Sharma,
>> >> >>
>> >> >>
>> >> >>
>> >> >> Could you please describe how you installed flume? Also, I see
>> >> >> you're
>> >> >> getting this warning:
>> >> >>
>> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
>> >> >> >> override.
>> >> >>
>> >> >>
>> >> >>
>> >> >> The log4j.properties that flume provides is stored in the conf dir.
>> >> >> If
>> >> >> you
>> >> >> specify the flume conf dir, flume can pick it up. So for
>> >> >> troubleshooting you
>> >> >> can try:
>> >> >>
>> >> >>
>> >> >> 1) modifying the log4j.properties within flume's conf dir so that
>> >> >> the
>> >> >> top
>> >> >> reads:
>> >> >> #flume.root.logger=DEBUG,console
>> >> >> flume.root.logger=DEBUG,LOGFILE
>> >> >> flume.log.dir=.
>> >> >> flume.log.file=flume.log
>> >> >>
>> >> >> 2) Run the flume agent while specifying the flume conf dir (--conf
>> >> >> <dir>)
>> >> >>
>> >> >> 3) What's the output of 'which flume-ng'?
>> >> >>
>> >> >> Cheers,
>> >> >> Will
>> >> >>
>> >> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>> >> >> <sh...@kt.com> wrote:
>> >> >>
>> >> >> Hi Hari,
>> >> >>
>> >> >>
>> >> >>
>> >> >> I checked, agent is successfully tailing the file which I mentioned.
>> >> >> Yes,
>> >> >> you are right, agent has started properly without any error. Because
>> >> >> there
>> >> >> is no further movement, so it's hard for me to identify the issue. I
>> >> >> also
>> >> >> used tail -F also, but no success.
>> >> >>
>> >> >> Can you suggest me some technique to troubleshoot it, so I could
>> >> >> identify
>> >> >> the issue and resolve the same. Does flume record some log anywhere?
>> >> >>
>> >> >>
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Ashutosh Sharma
>> >> >>
>> >> >> Cell: 010-7300-0150
>> >> >>
>> >> >> Email: sharma.ashutosh@kt.com
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >>
>> >> >>
>> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> >> Sent: Thursday, June 21, 2012 5:25 PM
>> >> >>
>> >> >>
>> >> >> To: flume-user@incubator.apache.org
>> >> >> Subject: Re: Hbase-sink behavior
>> >> >>
>> >> >>
>> >> >>
>> >> >> I am not sure if HBase changed their wire protocol between these
>> >> >> versions.
>> >> >> Looks like your agent has started properly. Are you sure data is
>> >> >> being
>> >> >> written into the file being tailed? I suggest using tail -F. The log
>> >> >> being
>> >> >> stuck here is ok, that is probably because nothing specific is
>> >> >> required(or
>> >> >> your log file rotated).
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks
>> >> >>
>> >> >> Hari
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> Hari Shreedharan
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >> >>
>> >> >> Hi Hari,
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks for your prompt reply. I already created the table in Hbase
>> >> >> with
>> >> >> a
>> >> >> column family and hadoop/hbase library is available to hadoop. I
>> >> >> noticed
>> >> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>> >> >>
>> >> >> Please see the below lines captured while running the flume agent:
>> >> >>
>> >> >>
>> >> >>
>> >> >> >>> flume-ng  agent -n hbase-agent -f
>> >> >> >>> /home/hadoop/flumeng/hbaseagent.conf
>> >> >>
>> >> >> Warning: No configuration directory set! Use --conf <dir> to
>> >> >> override.
>> >> >>
>> >> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for
>> >> >> HDFS
>> >> >> access
>> >> >>
>> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
>> >> >> classpath
>> >> >>
>> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar
>> >> >> from
>> >> >> classpath
>> >> >>
>> >> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>> >> >>
>> >> >>
>> >> >> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>> >> >> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>> >> >> org.apache.flume.node.Application -n hbase-agent -f
>> >> >> /home/hadoop/flumeng/hbaseagent.conf
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> lifecycle
>> >> >> supervisor 1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
>> >> >> hbase-agent
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
>> >> >> manager
>> >> >> starting
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> >> lifecycle
>> >> >> supervisor 10
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO
>> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> Configuration provider starting
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO
>> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1
>> >> >> Agent:
>> >> >> hbase-agent
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation
>> >> >> flume
>> >> >> configuration contains configuration  for agents: [hbase-agent]
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO
>> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> Creating channels
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO
>> >> >> properties.PropertiesFileConfigurationProvider:
>> >> >> created channel ch1
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of
>> >> >> sink
>> >> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
>> >> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner:
>> >> >> {
>> >> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>> >> >> sinkRunners:{sink1=SinkRunner: {
>> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
>> >> >> counterGroup:{
>> >> >> name:null counters:{} } }}
>> >> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>> >> >>
>> >> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
>> >> >> command:tail -f /home/hadoop/demo.txt
>> >> >>
>> >> >>
>> >> >>
>> >> >> Screen stuck here....no movement.
>> >> >>
>> >> >>
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Ashutosh Sharma
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >>
>> >> >>
>> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> >> Sent: Thursday, June 21, 2012 5:01 PM
>> >> >> To: flume-user@incubator.apache.org
>> >> >> Subject: Re: Hbase-sink behavior
>> >> >>
>> >> >>
>> >> >>
>> >> >> Hi Ashutosh,
>> >> >>
>> >> >>
>> >> >>
>> >> >> The sink will not create the table or column family. Make sure you
>> >> >> have
>> >> >> the table and column family. Also please make sure you have
>> >> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are
>> >> >> in
>> >> >> your
>> >> >> class path).
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks
>> >> >>
>> >> >> Hari
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> Hari Shreedharan
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >>
>> >> >>
>> >> >> I have used and followed the same steps which is mentioned in below
>> >> >> mails
>> >> >> to get start with the hbasesink. But agent is not storing any data
>> >> >> into
>> >> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the
>> >> >> hbase
>> >> >> information. Even I am able to connect to the hbase server from that
>> >> >> agent
>> >> >> machine.
>> >> >>
>> >> >>
>> >> >>
>> >> >> Now, I am unable to understand and troubleshoot this problem.
>> >> >> Seeking
>> >> >> advice from the community members....
>> >> >>
>> >> >>
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Ashutosh Sharma
>> >> >>
>> >> >> ----------------------------------------
>> >> >>
>> >> >>
>> >> >>
>> >> >> -----Original Message-----
>> >> >>
>> >> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
>> >> >>
>> >> >> Sent: Friday, June 15, 2012 9:02 AM
>> >> >>
>> >> >> To: flume-user@incubator.apache.org
>> >> >>
>> >> >> Subject: Re: Hbase-sink behavior
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thank you so much Hari for the valuable response..I'll follow the
>> >> >> guidelines provided by you.
>> >> >>
>> >> >>
>> >> >>
>> >> >> Regards,
>> >> >>
>> >> >> Mohammad Tariq
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>> >> >> <hs...@cloudera.com> wrote:
>> >> >>
>> >> >> Hi Mohammad,
>> >> >>
>> >> >>
>> >> >>
>> >> >> My answers are inline.
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> Hari Shreedharan
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>> >> >>
>> >> >>
>> >> >>
>> >> >> Hello list,
>> >> >>
>> >> >>
>> >> >>
>> >> >> I am trying to use hbase-sink to collect data from a local file and
>> >> >>
>> >> >> dump it into an Hbase table..But there are a few things I am not
>> >> >> able
>> >> >>
>> >> >> to understand and need some guidance.
>> >> >>
>> >> >>
>> >> >>
>> >> >> This is the content of my conf file :
>> >> >>
>> >> >>
>> >> >>
>> >> >> hbase-agent.sources = tail
>> >> >>
>> >> >> hbase-agent..sinks = sink1
>> >> >>
>> >> >> hbase-agent.channels = ch1
>> >> >>
>> >> >> hbase-agent.sources.tail.type = exec
>> >> >>
>> >> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>> >> >>
>> >> >> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type
>> >> >> =
>> >> >>
>> >> >> org.apache.flume.sink.hbase.HBaseSink
>> >> >>
>> >> >> hbase-agent.sinks.sink1.channel = ch1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.table = test3
>> >> >>
>> >> >> hbase-agent.sinks.sink1.columnFamily = testing
>> >> >>
>> >> >> hbase-agent.sinks.sink1.column = foo
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer =
>> >> >>
>> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>> >> >>
>> >> >> hbase-agent.channels.ch1.type=memory
>> >> >>
>> >> >>
>> >> >>
>> >> >> Right now I am taking just some simple text from a file which has
>> >> >>
>> >> >> following content -
>> >> >>
>> >> >>
>> >> >>
>> >> >> value1
>> >> >>
>> >> >> value2
>> >> >>
>> >> >> value3
>> >> >>
>> >> >> value4
>> >> >>
>> >> >> value5
>> >> >>
>> >> >> value6
>> >> >>
>> >> >>
>> >> >>
>> >> >> And my Hbase table looks like -
>> >> >>
>> >> >>
>> >> >>
>> >> >> hbase(main):217:0> scan 'test3'
>> >> >>
>> >> >> ROW COLUMN+CELL
>> >> >>
>> >> >> 11339716704561 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716707569, value=value1
>> >> >>
>> >> >> 11339716704562 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716707571, value=value4
>> >> >>
>> >> >> 11339716846594 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849608, value=value2
>> >> >>
>> >> >> 11339716846595 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849610, value=value1
>> >> >>
>> >> >> 11339716846596 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849611, value=value6
>> >> >>
>> >> >> 11339716846597 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849614, value=value6
>> >> >>
>> >> >> 11339716846598 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849615, value=value5
>> >> >>
>> >> >> 11339716846599 column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849615, value=value6
>> >> >>
>> >> >> incRow column=testing:col1,
>> >> >>
>> >> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>> >> >>
>> >> >> 9 row(s) in 0.0580 seconds
>> >> >>
>> >> >>
>> >> >>
>> >> >> Now I have following questions -
>> >> >>
>> >> >>
>> >> >>
>> >> >> 1- Why the timestamp value is different from the row key?(I was
>> >> >> trying
>> >> >>
>> >> >> to make "1+timestamp" as the rowkey)
>> >> >>
>> >> >>
>> >> >>
>> >> >> The value shown by hbase shell as timestamp is the time at which the
>> >> >>
>> >> >> value was inserted into Hbase, while the value inserted by Flume is
>> >> >>
>> >> >> the timestamp at which the sink read the event from the channel.
>> >> >>
>> >> >> Depending on how long the network and HBase takes, these timestamps
>> >> >>
>> >> >> can vary. If you want 1+timestamp as row key then you should
>> >> >> configure
>> >> >> it:
>> >> >>
>> >> >>
>> >> >>
>> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>> >> >>
>> >> >> appended as-is to the suffix you choose.
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2- Although I am not using "incRow", it stills appear in the table
>> >> >>
>> >> >> with some value. Why so and what is this value??
>> >> >>
>> >> >>
>> >> >>
>> >> >> The SimpleHBaseEventSerializer is only an example class. For custom
>> >> >>
>> >> >> use cases you can write your own serializer by implementing
>> >> >>
>> >> >> HbaseEventSerializer. In this case, you have specified
>> >> >>
>> >> >> incrementColumn, which causes an increment on the column specified.
>> >> >>
>> >> >> Simply don't specify that config and that row will not appear.
>> >> >>
>> >> >>
>> >> >>
>> >> >> 3- How can avoid the last row??
>> >> >>
>> >> >>
>> >> >>
>> >> >> See above.
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> I am still in the learning phase so please pardon my ignorance..Many
>> >> >> thanks.
>> >> >>
>> >> >>
>> >> >>
>> >> >> No problem. Much of this is documented
>> >> >>
>> >> >> here:
>> >> >>
>> >> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> Regards,
>> >> >>
>> >> >> Mohammad Tariq
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> >> 문서에
>> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
>> >> >> 메일이
>> >> >> 잘못
>> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >>
>> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> material. This email is intended for the use of the addressee only.
>> >> >> If
>> >> >> you
>> >> >> receive this email by mistake, please either delete it without
>> >> >> reproducing,
>> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> immediately.
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> >> 문서에
>> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
>> >> >> 메일이
>> >> >> 잘못
>> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> material. This email is intended for the use of the addressee only.
>> >> >> If
>> >> >> you
>> >> >> receive this email by mistake, please either delete it without
>> >> >> reproducing,
>> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> immediately.
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> >> 문서에
>> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
>> >> >> 메일이
>> >> >> 잘못
>> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> material. This email is intended for the use of the addressee only.
>> >> >> If
>> >> >> you
>> >> >> receive this email by mistake, please either delete it without
>> >> >> reproducing,
>> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> immediately.
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> >> 문서에
>> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
>> >> >> 메일이
>> >> >> 잘못
>> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> >> This E-mail may contain confidential information and/or copyright
>> >> >> material. This email is intended for the use of the addressee only.
>> >> >> If
>> >> >> you
>> >> >> receive this email by mistake, please either delete it without
>> >> >> reproducing,
>> >> >> distributing or retaining copies thereof or notify the sender
>> >> >> immediately.
>> >> >
>> >> >
>> >
>> >
>
>
>
>
> --
> Regards,
> Rahul Patodi
>
>

Re: Hbase-sink behavior

Posted by Rahul Patodi <pa...@gmail.com>.
Hi Mohammad,
Thanks for your response
I have put this configuration:

hbase-agent.sources=tail
hbase-agent.sinks=sink1
hbase-agent.channels=ch1

hbase-agent.sources.tail.type=exec
hbase-agent.sources.tail.command=tail -F /tmp/test05
hbase-agent.sources.tail.channels=ch1

hbase-agent.sinks.sink1.type=org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel=ch1
hbase-agent.sinks.sink1.table=t002
hbase-agent.sinks.sink1.columnFamily=cf
hbase-agent.sinks.sink1.column=foo
hbase-agent.sinks.sink1.serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn=col1
hbase-agent.sinks.sink1.serializer.incrementColumn=col1
#hbase-agent.sinks.sink1.serializer.keyType=timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix=1+
hbase-agent.sinks.sink1.serializer.suffix=timestamp

hbase-agent.channels.ch1.type=memory


Data is getting copy into HBase, but I have got another issue:

My input data is simply:
value1
value2
value3
value4
value5
value6
value7
value8
value9

when I run this command in HBase:
hbase(main):129:0> scan 't002', {VERSIONS => 3}
ROW                              COLUMN+CELL
 1+1340279755410                 column=cf:col1, timestamp=1340279758424,
value=value5
 1+1340279755410                 column=cf:col1, timestamp=1340279758423,
value=value3
 1+1340279755410                 column=cf:col1, timestamp=1340279758417,
value=value1
 1+1340279755411                 column=cf:col1, timestamp=1340279758427,
value=value9
 1+1340279755411                 column=cf:col1, timestamp=1340279758426,
value=value8
 1+1340279755411                 column=cf:col1, timestamp=1340279758425,
value=value7
 incRow                          column=cf:col1, timestamp=1340279758443,
value=\x00\x00\x00\x00\x00\x00\x00\x09
3 row(s) in 0.0420 seconds

all the data is not getting copy ??

When I run this command with version:
hbase(main):130:0> scan 't002', {VERSIONS => 3}
ROW                              COLUMN+CELL
 1+1340279755410                 column=cf:col1, timestamp=1340279758424,
value=value5
 1+1340279755410                 column=cf:col1, timestamp=1340279758423,
value=value3
 1+1340279755410                 column=cf:col1, timestamp=1340279758417,
value=value1
 1+1340279755411                 column=cf:col1, timestamp=1340279758427,
value=value9
 1+1340279755411                 column=cf:col1, timestamp=1340279758426,
value=value8
 1+1340279755411                 column=cf:col1, timestamp=1340279758425,
value=value7
 1+1340279906637                 column=cf:col1, timestamp=1340279909652,
value=value1
 1+1340279906638                 column=cf:col1, timestamp=1340279909659,
value=value6
 1+1340279906638                 column=cf:col1, timestamp=1340279909658,
value=value5
 1+1340279906638                 column=cf:col1, timestamp=1340279909654,
value=value3
 1+1340279906646                 column=cf:col1, timestamp=1340279909659,
value=value7
 1+1340279906647                 column=cf:col1, timestamp=1340279909659,
value=value9
 incRow                          column=cf:col1, timestamp=1340279909677,
value=\x00\x00\x00\x00\x00\x00\x00\x12
7 row(s) in 0.0640 seconds

Please help me understand this.



On Thu, Jun 21, 2012 at 4:48 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Will,
>
>          I got it.Thanks for the info.
>
> Regards,
>    Mohammad Tariq
>
>
> On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com> wrote:
> > Hi Mohammad,
> >
> > In your config file, I think you need to remove this line:
> >
> >>>hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >
> > I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
> > (although there is a keyType var that stores the value of the 'suffix'
> > prop).
> >
> > Cheers,
> > Will
> >
> >
> > On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com>
> wrote:
> >>
> >> Hi Rahul,
> >>
> >>          This normally happens when there is some problem in the
> >> configuration file.Create a file called hbase-agent inside your
> >> FLUME_HOME/conf directory and copy this content into it:
> >> hbase-agent.sources = tail
> >> hbase-agent.sinks = sink1
> >> hbase-agent.channels = ch1
> >>
> >> hbase-agent.sources.tail.type = exec
> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >> hbase-agent.sources.tail.channels = ch1
> >>
> >> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> >> hbase-agent.sinks.sink1.channel = ch1
> >> hbase-agent.sinks.sink1.table = demo
> >> hbase-agent.sinks.sink1.columnFamily = cf
> >>
> >> hbase-agent.sinks.sink1.serializer =
> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >>
> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >>
> >> hbase-agent.channels.ch1.type=memory
> >>
> >> Then start the agent and see if it works for you. It worked for me.
> >>
> >> Regards,
> >>    Mohammad Tariq
> >>
> >>
> >> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com>
> wrote:
> >> > Hi Sharma,
> >> >
> >> > So I assume that your command looks something like this:
> >> >      flume-ng agent -n hbase-agent -f
> >> > /home/hadoop/flumeng/hbaseagent.conf
> >> > -c /etc/flume-ng/conf
> >> >
> >> > ...?
> >> >
> >> > Hari, I saw your comment:
> >> >
> >> >>>I am not sure if HBase changed their wire protocol between these
> >> >>> versions.
> >> > Do you have any other advice about troubleshooting a possible hbase
> >> > protocol
> >> > mismatch issue?
> >> >
> >> > Cheers,
> >> > Will
> >> >
> >> >
> >> >
> >> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
> >> > <sh...@kt.com>
> >> > wrote:
> >> >>
> >> >> Hi Will,
> >> >>
> >> >>
> >> >>
> >> >> I installed flume as part of CDH3u4 version 1.1 using yum install
> >> >> flume-ng. One more point, I am using flume-ng hbase sink downloaded
> >> >> from:
> >> >>
> >> >>
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
> >> >>
> >> >>
> >> >>
> >> >> Now, I ran the agent with -conf parameter with updated
> >> >> log4j.properties. I
> >> >> don't see any error in the log. Please see the below from the log
> file:
> >> >>
> >> >>
> >> >>
> >> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> lifecycle supervisor 1
> >> >>
> >> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
> >> >> hbase-agent
> >> >>
> >> >> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager:
> >> >> Node
> >> >> manager starting
> >> >>
> >> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> lifecycle supervisor 9
> >> >>
> >> >> 2012-06-21 18:25:08,146 INFO
> >> >> properties.PropertiesFileConfigurationProvider: Configuration
> provider
> >> >> starting
> >> >>
> >> >> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager:
> >> >> Node
> >> >> manager started
> >> >>
> >> >> 2012-06-21 18:25:08,148 DEBUG
> >> >> properties.PropertiesFileConfigurationProvider: Configuration
> provider
> >> >> started
> >> >>
> >> >> 2012-06-21 18:25:08,149 DEBUG
> >> >> properties.PropertiesFileConfigurationProvider: Checking
> >> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
> >> >>
> >> >> 2012-06-21 18:25:08,149 INFO
> >> >> properties.PropertiesFileConfigurationProvider: Reloading
> configuration
> >> >> file:/home/hadoop/flumeng/hbaseagent.conf
> >> >>
> >> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks:
> >> >> sink1
> >> >> Agent: hbase-agent
> >> >>
> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created
> context
> >> >> for
> >> >> sink1: serializer.rowPrefix
> >> >>
> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration:
> Processing:sink1
> >> >>
> >> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
> >> >> validation
> >> >> of configuration for agent: hbase-agent, initial-configuration:
> >> >> AgentConfiguration[hbase-agent]
> >> >>
> >> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> >> >> channels=ch1, type=exec} }}
> >> >>
> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >> >>
> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> >> serializer.keyType=timestamp,
> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> >> >> batchSize=1, columnFamily=cf1, table=test,
> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> >> serializer.suffix=timestamp} }}
> >> >>
> >> >>
> >> >>
> >> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created
> channel
> >> >> ch1
> >> >>
> >> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink:
> >> >> sink1 using OTHER
> >> >>
> >> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post
> validation
> >> >> configuration for hbase-agent
> >> >>
> >> >> AgentConfiguration created without Configuration stubs for which only
> >> >> basic syntactical validation was performed[hbase-agent]
> >> >>
> >> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> >> >> channels=ch1, type=exec} }}
> >> >>
> >> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >> >>
> >> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> >> serializer.keyType=timestamp,
> >> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> >> >> batchSize=1, columnFamily=cf1, table=test,
> >> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> >> serializer.suffix=timestamp} }}
> >> >>
> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
> >> >>
> >> >>
> >> >>
> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
> >> >>
> >> >>
> >> >>
> >> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
> >> >>
> >> >>
> >> >>
> >> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation
> >> >> flume configuration contains configuration  for agents: [hbase-agent]
> >> >>
> >> >> 2012-06-21 18:25:08,171 INFO
> >> >> properties.PropertiesFileConfigurationProvider: Creating channels
> >> >>
> >> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating
> >> >> instance of channel ch1 type memory
> >> >>
> >> >> 2012-06-21 18:25:08,175 INFO
> >> >> properties.PropertiesFileConfigurationProvider: created channel ch1
> >> >>
> >> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
> >> >> instance of source tail, type exec
> >> >>
> >> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating
> instance
> >> >> of
> >> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >> >>
> >> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> >> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
> >> >>
> >> >> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager:
> >> >> Node
> >> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> >> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
> >> >> sinkRunners:{sink1=SinkRunner: {
> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
> >> >> counterGroup:{
> >> >> name:null counters:{} } }}
> >> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
> >> >>
> >> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting
> >> >> with
> >> >> command:tail -f /home/hadoop/demo.txt
> >> >>
> >> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
> >> >>
> >> >>
> >> >>
> >> >> Output of the which Flume-ng is:
> >> >>
> >> >> /usr/bin/flume-ng
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> Thanks & Regards,
> >> >>
> >> >> Ashutosh Sharma
> >> >>
> >> >> Cell: 010-7300-0150
> >> >>
> >> >> Email: sharma.ashutosh@kt.com
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >>
> >> >>
> >> >> From: Will McQueen [mailto:will@cloudera.com]
> >> >> Sent: Thursday, June 21, 2012 6:07 PM
> >> >>
> >> >>
> >> >> To: flume-user@incubator.apache.org
> >> >> Subject: Re: Hbase-sink behavior
> >> >>
> >> >>
> >> >>
> >> >> Hi Sharma,
> >> >>
> >> >>
> >> >>
> >> >> Could you please describe how you installed flume? Also, I see you're
> >> >> getting this warning:
> >> >>
> >> >> >> Warning: No configuration directory set! Use --conf <dir> to
> >> >> >> override.
> >> >>
> >> >>
> >> >>
> >> >> The log4j.properties that flume provides is stored in the conf dir.
> If
> >> >> you
> >> >> specify the flume conf dir, flume can pick it up. So for
> >> >> troubleshooting you
> >> >> can try:
> >> >>
> >> >>
> >> >> 1) modifying the log4j.properties within flume's conf dir so that the
> >> >> top
> >> >> reads:
> >> >> #flume.root.logger=DEBUG,console
> >> >> flume.root.logger=DEBUG,LOGFILE
> >> >> flume.log.dir=.
> >> >> flume.log.file=flume.log
> >> >>
> >> >> 2) Run the flume agent while specifying the flume conf dir (--conf
> >> >> <dir>)
> >> >>
> >> >> 3) What's the output of 'which flume-ng'?
> >> >>
> >> >> Cheers,
> >> >> Will
> >> >>
> >> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
> >> >> <sh...@kt.com> wrote:
> >> >>
> >> >> Hi Hari,
> >> >>
> >> >>
> >> >>
> >> >> I checked, agent is successfully tailing the file which I mentioned.
> >> >> Yes,
> >> >> you are right, agent has started properly without any error. Because
> >> >> there
> >> >> is no further movement, so it's hard for me to identify the issue. I
> >> >> also
> >> >> used tail -F also, but no success.
> >> >>
> >> >> Can you suggest me some technique to troubleshoot it, so I could
> >> >> identify
> >> >> the issue and resolve the same. Does flume record some log anywhere?
> >> >>
> >> >>
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> Thanks & Regards,
> >> >>
> >> >> Ashutosh Sharma
> >> >>
> >> >> Cell: 010-7300-0150
> >> >>
> >> >> Email: sharma.ashutosh@kt.com
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >>
> >> >>
> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> >> Sent: Thursday, June 21, 2012 5:25 PM
> >> >>
> >> >>
> >> >> To: flume-user@incubator.apache.org
> >> >> Subject: Re: Hbase-sink behavior
> >> >>
> >> >>
> >> >>
> >> >> I am not sure if HBase changed their wire protocol between these
> >> >> versions.
> >> >> Looks like your agent has started properly. Are you sure data is
> being
> >> >> written into the file being tailed? I suggest using tail -F. The log
> >> >> being
> >> >> stuck here is ok, that is probably because nothing specific is
> >> >> required(or
> >> >> your log file rotated).
> >> >>
> >> >>
> >> >>
> >> >> Thanks
> >> >>
> >> >> Hari
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Hari Shreedharan
> >> >>
> >> >>
> >> >>
> >> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >> >>
> >> >> Hi Hari,
> >> >>
> >> >>
> >> >>
> >> >> Thanks for your prompt reply. I already created the table in Hbase
> with
> >> >> a
> >> >> column family and hadoop/hbase library is available to hadoop. I
> >> >> noticed
> >> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
> >> >>
> >> >> Please see the below lines captured while running the flume agent:
> >> >>
> >> >>
> >> >>
> >> >> >>> flume-ng  agent -n hbase-agent -f
> >> >> >>> /home/hadoop/flumeng/hbaseagent.conf
> >> >>
> >> >> Warning: No configuration directory set! Use --conf <dir> to
> override.
> >> >>
> >> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
> >> >> access
> >> >>
> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
> >> >> classpath
> >> >>
> >> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
> >> >> classpath
> >> >>
> >> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> >> >>
> >> >>
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> >> >> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >> >> org.apache.flume.node.Application -n hbase-agent -f
> >> >> /home/hadoop/flumeng/hbaseagent.conf
> >> >>
> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> lifecycle
> >> >> supervisor 1
> >> >>
> >> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
> >> >> hbase-agent
> >> >>
> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> >> manager
> >> >> starting
> >> >>
> >> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
> >> >> lifecycle
> >> >> supervisor 10
> >> >>
> >> >> 12/06/21 16:40:42 INFO
> properties.PropertiesFileConfigurationProvider:
> >> >> Configuration provider starting
> >> >>
> >> >> 12/06/21 16:40:42 INFO
> properties.PropertiesFileConfigurationProvider:
> >> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1
> >> >> Agent:
> >> >> hbase-agent
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >> >>
> >> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
> >> >> configuration contains configuration  for agents: [hbase-agent]
> >> >>
> >> >> 12/06/21 16:40:42 INFO
> properties.PropertiesFileConfigurationProvider:
> >> >> Creating channels
> >> >>
> >> >> 12/06/21 16:40:42 INFO
> properties.PropertiesFileConfigurationProvider:
> >> >> created channel ch1
> >> >>
> >> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of
> >> >> sink
> >> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >> >>
> >> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> >> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
> >> >> sinkRunners:{sink1=SinkRunner: {
> >> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
> >> >> counterGroup:{
> >> >> name:null counters:{} } }}
> >> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
> >> >>
> >> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
> >> >> command:tail -f /home/hadoop/demo.txt
> >> >>
> >> >>
> >> >>
> >> >> Screen stuck here....no movement.
> >> >>
> >> >>
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> Thanks & Regards,
> >> >>
> >> >> Ashutosh Sharma
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >>
> >> >>
> >> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> >> Sent: Thursday, June 21, 2012 5:01 PM
> >> >> To: flume-user@incubator.apache.org
> >> >> Subject: Re: Hbase-sink behavior
> >> >>
> >> >>
> >> >>
> >> >> Hi Ashutosh,
> >> >>
> >> >>
> >> >>
> >> >> The sink will not create the table or column family. Make sure you
> have
> >> >> the table and column family. Also please make sure you have
> >> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in
> >> >> your
> >> >> class path).
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Thanks
> >> >>
> >> >> Hari
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Hari Shreedharan
> >> >>
> >> >>
> >> >>
> >> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >>
> >> >>
> >> >> I have used and followed the same steps which is mentioned in below
> >> >> mails
> >> >> to get start with the hbasesink. But agent is not storing any data
> into
> >> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the
> >> >> hbase
> >> >> information. Even I am able to connect to the hbase server from that
> >> >> agent
> >> >> machine.
> >> >>
> >> >>
> >> >>
> >> >> Now, I am unable to understand and troubleshoot this problem. Seeking
> >> >> advice from the community members....
> >> >>
> >> >>
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >> Thanks & Regards,
> >> >>
> >> >> Ashutosh Sharma
> >> >>
> >> >> ----------------------------------------
> >> >>
> >> >>
> >> >>
> >> >> -----Original Message-----
> >> >>
> >> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >> >>
> >> >> Sent: Friday, June 15, 2012 9:02 AM
> >> >>
> >> >> To: flume-user@incubator.apache.org
> >> >>
> >> >> Subject: Re: Hbase-sink behavior
> >> >>
> >> >>
> >> >>
> >> >> Thank you so much Hari for the valuable response..I'll follow the
> >> >> guidelines provided by you.
> >> >>
> >> >>
> >> >>
> >> >> Regards,
> >> >>
> >> >> Mohammad Tariq
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> >> >> <hs...@cloudera.com> wrote:
> >> >>
> >> >> Hi Mohammad,
> >> >>
> >> >>
> >> >>
> >> >> My answers are inline.
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Hari Shreedharan
> >> >>
> >> >>
> >> >>
> >> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> >> >>
> >> >>
> >> >>
> >> >> Hello list,
> >> >>
> >> >>
> >> >>
> >> >> I am trying to use hbase-sink to collect data from a local file and
> >> >>
> >> >> dump it into an Hbase table..But there are a few things I am not able
> >> >>
> >> >> to understand and need some guidance.
> >> >>
> >> >>
> >> >>
> >> >> This is the content of my conf file :
> >> >>
> >> >>
> >> >>
> >> >> hbase-agent.sources = tail
> >> >>
> >> >> hbase-agent..sinks = sink1
> >> >>
> >> >> hbase-agent.channels = ch1
> >> >>
> >> >> hbase-agent.sources.tail.type = exec
> >> >>
> >> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >> >>
> >> >> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type
> =
> >> >>
> >> >> org.apache.flume.sink.hbase.HBaseSink
> >> >>
> >> >> hbase-agent.sinks.sink1.channel = ch1
> >> >>
> >> >> hbase-agent.sinks.sink1.table = test3
> >> >>
> >> >> hbase-agent.sinks.sink1.columnFamily = testing
> >> >>
> >> >> hbase-agent.sinks.sink1.column = foo
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer =
> >> >>
> >> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >> >>
> >> >> hbase-agent.channels.ch1.type=memory
> >> >>
> >> >>
> >> >>
> >> >> Right now I am taking just some simple text from a file which has
> >> >>
> >> >> following content -
> >> >>
> >> >>
> >> >>
> >> >> value1
> >> >>
> >> >> value2
> >> >>
> >> >> value3
> >> >>
> >> >> value4
> >> >>
> >> >> value5
> >> >>
> >> >> value6
> >> >>
> >> >>
> >> >>
> >> >> And my Hbase table looks like -
> >> >>
> >> >>
> >> >>
> >> >> hbase(main):217:0> scan 'test3'
> >> >>
> >> >> ROW COLUMN+CELL
> >> >>
> >> >> 11339716704561 column=testing:col1,
> >> >>
> >> >> timestamp=1339716707569, value=value1
> >> >>
> >> >> 11339716704562 column=testing:col1,
> >> >>
> >> >> timestamp=1339716707571, value=value4
> >> >>
> >> >> 11339716846594 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849608, value=value2
> >> >>
> >> >> 11339716846595 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849610, value=value1
> >> >>
> >> >> 11339716846596 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849611, value=value6
> >> >>
> >> >> 11339716846597 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849614, value=value6
> >> >>
> >> >> 11339716846598 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849615, value=value5
> >> >>
> >> >> 11339716846599 column=testing:col1,
> >> >>
> >> >> timestamp=1339716849615, value=value6
> >> >>
> >> >> incRow column=testing:col1,
> >> >>
> >> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> >> >>
> >> >> 9 row(s) in 0.0580 seconds
> >> >>
> >> >>
> >> >>
> >> >> Now I have following questions -
> >> >>
> >> >>
> >> >>
> >> >> 1- Why the timestamp value is different from the row key?(I was
> trying
> >> >>
> >> >> to make "1+timestamp" as the rowkey)
> >> >>
> >> >>
> >> >>
> >> >> The value shown by hbase shell as timestamp is the time at which the
> >> >>
> >> >> value was inserted into Hbase, while the value inserted by Flume is
> >> >>
> >> >> the timestamp at which the sink read the event from the channel.
> >> >>
> >> >> Depending on how long the network and HBase takes, these timestamps
> >> >>
> >> >> can vary. If you want 1+timestamp as row key then you should
> configure
> >> >> it:
> >> >>
> >> >>
> >> >>
> >> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> >> >>
> >> >> appended as-is to the suffix you choose.
> >> >>
> >> >>
> >> >>
> >> >> 2- Although I am not using "incRow", it stills appear in the table
> >> >>
> >> >> with some value. Why so and what is this value??
> >> >>
> >> >>
> >> >>
> >> >> The SimpleHBaseEventSerializer is only an example class. For custom
> >> >>
> >> >> use cases you can write your own serializer by implementing
> >> >>
> >> >> HbaseEventSerializer. In this case, you have specified
> >> >>
> >> >> incrementColumn, which causes an increment on the column specified.
> >> >>
> >> >> Simply don't specify that config and that row will not appear.
> >> >>
> >> >>
> >> >>
> >> >> 3- How can avoid the last row??
> >> >>
> >> >>
> >> >>
> >> >> See above.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> I am still in the learning phase so please pardon my ignorance..Many
> >> >> thanks.
> >> >>
> >> >>
> >> >>
> >> >> No problem. Much of this is documented
> >> >>
> >> >> here:
> >> >>
> >> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Regards,
> >> >>
> >> >> Mohammad Tariq
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
> >> >> 문서에
> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
> 메일이
> >> >> 잘못
> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >>
> >> >> This E-mail may contain confidential information and/or copyright
> >> >> material. This email is intended for the use of the addressee only.
> If
> >> >> you
> >> >> receive this email by mistake, please either delete it without
> >> >> reproducing,
> >> >> distributing or retaining copies thereof or notify the sender
> >> >> immediately.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
> >> >> 문서에
> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
> 메일이
> >> >> 잘못
> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> This E-mail may contain confidential information and/or copyright
> >> >> material. This email is intended for the use of the addressee only.
> If
> >> >> you
> >> >> receive this email by mistake, please either delete it without
> >> >> reproducing,
> >> >> distributing or retaining copies thereof or notify the sender
> >> >> immediately.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
> >> >> 문서에
> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
> 메일이
> >> >> 잘못
> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> This E-mail may contain confidential information and/or copyright
> >> >> material. This email is intended for the use of the addressee only.
> If
> >> >> you
> >> >> receive this email by mistake, please either delete it without
> >> >> reproducing,
> >> >> distributing or retaining copies thereof or notify the sender
> >> >> immediately.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
> >> >> 문서에
> >> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본
> 메일이
> >> >> 잘못
> >> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> >> This E-mail may contain confidential information and/or copyright
> >> >> material. This email is intended for the use of the addressee only.
> If
> >> >> you
> >> >> receive this email by mistake, please either delete it without
> >> >> reproducing,
> >> >> distributing or retaining copies thereof or notify the sender
> >> >> immediately.
> >> >
> >> >
> >
> >
>



-- 
*Regards*,
Rahul Patodi

Re: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Will,

          I got it.Thanks for the info.

Regards,
    Mohammad Tariq


On Thu, Jun 21, 2012 at 4:37 PM, Will McQueen <wi...@cloudera.com> wrote:
> Hi Mohammad,
>
> In your config file, I think you need to remove this line:
>
>>>hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
> (although there is a keyType var that stores the value of the 'suffix'
> prop).
>
> Cheers,
> Will
>
>
> On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hi Rahul,
>>
>>          This normally happens when there is some problem in the
>> configuration file.Create a file called hbase-agent inside your
>> FLUME_HOME/conf directory and copy this content into it:
>> hbase-agent.sources = tail
>> hbase-agent.sinks = sink1
>> hbase-agent.channels = ch1
>>
>> hbase-agent.sources.tail.type = exec
>> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>> hbase-agent.sources.tail.channels = ch1
>>
>> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
>> hbase-agent.sinks.sink1.channel = ch1
>> hbase-agent.sinks.sink1.table = demo
>> hbase-agent.sinks.sink1.columnFamily = cf
>>
>> hbase-agent.sinks.sink1.serializer =
>> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>>
>> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>>
>> hbase-agent.channels.ch1.type=memory
>>
>> Then start the agent and see if it works for you. It worked for me.
>>
>> Regards,
>>    Mohammad Tariq
>>
>>
>> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com> wrote:
>> > Hi Sharma,
>> >
>> > So I assume that your command looks something like this:
>> >      flume-ng agent -n hbase-agent -f
>> > /home/hadoop/flumeng/hbaseagent.conf
>> > -c /etc/flume-ng/conf
>> >
>> > ...?
>> >
>> > Hari, I saw your comment:
>> >
>> >>>I am not sure if HBase changed their wire protocol between these
>> >>> versions.
>> > Do you have any other advice about troubleshooting a possible hbase
>> > protocol
>> > mismatch issue?
>> >
>> > Cheers,
>> > Will
>> >
>> >
>> >
>> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
>> > <sh...@kt.com>
>> > wrote:
>> >>
>> >> Hi Will,
>> >>
>> >>
>> >>
>> >> I installed flume as part of CDH3u4 version 1.1 using yum install
>> >> flume-ng. One more point, I am using flume-ng hbase sink downloaded
>> >> from:
>> >>
>> >> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>> >>
>> >>
>> >>
>> >> Now, I ran the agent with -conf parameter with updated
>> >> log4j.properties. I
>> >> don't see any error in the log. Please see the below from the log file:
>> >>
>> >>
>> >>
>> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
>> >> lifecycle supervisor 1
>> >>
>> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
>> >> hbase-agent
>> >>
>> >> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager:
>> >> Node
>> >> manager starting
>> >>
>> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
>> >> lifecycle supervisor 9
>> >>
>> >> 2012-06-21 18:25:08,146 INFO
>> >> properties.PropertiesFileConfigurationProvider: Configuration provider
>> >> starting
>> >>
>> >> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager:
>> >> Node
>> >> manager started
>> >>
>> >> 2012-06-21 18:25:08,148 DEBUG
>> >> properties.PropertiesFileConfigurationProvider: Configuration provider
>> >> started
>> >>
>> >> 2012-06-21 18:25:08,149 DEBUG
>> >> properties.PropertiesFileConfigurationProvider: Checking
>> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>> >>
>> >> 2012-06-21 18:25:08,149 INFO
>> >> properties.PropertiesFileConfigurationProvider: Reloading configuration
>> >> file:/home/hadoop/flumeng/hbaseagent.conf
>> >>
>> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks:
>> >> sink1
>> >> Agent: hbase-agent
>> >>
>> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context
>> >> for
>> >> sink1: serializer.rowPrefix
>> >>
>> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
>> >> validation
>> >> of configuration for agent: hbase-agent, initial-configuration:
>> >> AgentConfiguration[hbase-agent]
>> >>
>> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> >> channels=ch1, type=exec} }}
>> >>
>> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >>
>> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> serializer.keyType=timestamp,
>> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> >> batchSize=1, columnFamily=cf1, table=test,
>> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> serializer.suffix=timestamp} }}
>> >>
>> >>
>> >>
>> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel
>> >> ch1
>> >>
>> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink:
>> >> sink1 using OTHER
>> >>
>> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation
>> >> configuration for hbase-agent
>> >>
>> >> AgentConfiguration created without Configuration stubs for which only
>> >> basic syntactical validation was performed[hbase-agent]
>> >>
>> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> >> channels=ch1, type=exec} }}
>> >>
>> >> CHANNELS: {ch1={ parameters:{type=memory} }}
>> >>
>> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> >> serializer.keyType=timestamp,
>> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> >> batchSize=1, columnFamily=cf1, table=test,
>> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> >> serializer.suffix=timestamp} }}
>> >>
>> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
>> >>
>> >>
>> >>
>> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
>> >>
>> >>
>> >>
>> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
>> >>
>> >>
>> >>
>> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation
>> >> flume configuration contains configuration  for agents: [hbase-agent]
>> >>
>> >> 2012-06-21 18:25:08,171 INFO
>> >> properties.PropertiesFileConfigurationProvider: Creating channels
>> >>
>> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating
>> >> instance of channel ch1 type memory
>> >>
>> >> 2012-06-21 18:25:08,175 INFO
>> >> properties.PropertiesFileConfigurationProvider: created channel ch1
>> >>
>> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
>> >> instance of source tail, type exec
>> >>
>> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance
>> >> of
>> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >>
>> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
>> >>
>> >> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager:
>> >> Node
>> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
>> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>> >> sinkRunners:{sink1=SinkRunner: {
>> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5
>> >> counterGroup:{
>> >> name:null counters:{} } }}
>> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>> >>
>> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting
>> >> with
>> >> command:tail -f /home/hadoop/demo.txt
>> >>
>> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
>> >>
>> >>
>> >>
>> >> Output of the which Flume-ng is:
>> >>
>> >> /usr/bin/flume-ng
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> ----------------------------------------
>> >>
>> >> ----------------------------------------
>> >>
>> >> Thanks & Regards,
>> >>
>> >> Ashutosh Sharma
>> >>
>> >> Cell: 010-7300-0150
>> >>
>> >> Email: sharma.ashutosh@kt.com
>> >>
>> >> ----------------------------------------
>> >>
>> >>
>> >>
>> >> From: Will McQueen [mailto:will@cloudera.com]
>> >> Sent: Thursday, June 21, 2012 6:07 PM
>> >>
>> >>
>> >> To: flume-user@incubator.apache.org
>> >> Subject: Re: Hbase-sink behavior
>> >>
>> >>
>> >>
>> >> Hi Sharma,
>> >>
>> >>
>> >>
>> >> Could you please describe how you installed flume? Also, I see you're
>> >> getting this warning:
>> >>
>> >> >> Warning: No configuration directory set! Use --conf <dir> to
>> >> >> override.
>> >>
>> >>
>> >>
>> >> The log4j.properties that flume provides is stored in the conf dir. If
>> >> you
>> >> specify the flume conf dir, flume can pick it up. So for
>> >> troubleshooting you
>> >> can try:
>> >>
>> >>
>> >> 1) modifying the log4j.properties within flume's conf dir so that the
>> >> top
>> >> reads:
>> >> #flume.root.logger=DEBUG,console
>> >> flume.root.logger=DEBUG,LOGFILE
>> >> flume.log.dir=.
>> >> flume.log.file=flume.log
>> >>
>> >> 2) Run the flume agent while specifying the flume conf dir (--conf
>> >> <dir>)
>> >>
>> >> 3) What's the output of 'which flume-ng'?
>> >>
>> >> Cheers,
>> >> Will
>> >>
>> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>> >> <sh...@kt.com> wrote:
>> >>
>> >> Hi Hari,
>> >>
>> >>
>> >>
>> >> I checked, agent is successfully tailing the file which I mentioned.
>> >> Yes,
>> >> you are right, agent has started properly without any error. Because
>> >> there
>> >> is no further movement, so it's hard for me to identify the issue. I
>> >> also
>> >> used tail -F also, but no success.
>> >>
>> >> Can you suggest me some technique to troubleshoot it, so I could
>> >> identify
>> >> the issue and resolve the same. Does flume record some log anywhere?
>> >>
>> >>
>> >>
>> >> ----------------------------------------
>> >>
>> >> ----------------------------------------
>> >>
>> >> Thanks & Regards,
>> >>
>> >> Ashutosh Sharma
>> >>
>> >> Cell: 010-7300-0150
>> >>
>> >> Email: sharma.ashutosh@kt.com
>> >>
>> >> ----------------------------------------
>> >>
>> >>
>> >>
>> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> Sent: Thursday, June 21, 2012 5:25 PM
>> >>
>> >>
>> >> To: flume-user@incubator.apache.org
>> >> Subject: Re: Hbase-sink behavior
>> >>
>> >>
>> >>
>> >> I am not sure if HBase changed their wire protocol between these
>> >> versions.
>> >> Looks like your agent has started properly. Are you sure data is being
>> >> written into the file being tailed? I suggest using tail -F. The log
>> >> being
>> >> stuck here is ok, that is probably because nothing specific is
>> >> required(or
>> >> your log file rotated).
>> >>
>> >>
>> >>
>> >> Thanks
>> >>
>> >> Hari
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Hari Shreedharan
>> >>
>> >>
>> >>
>> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >>
>> >> Hi Hari,
>> >>
>> >>
>> >>
>> >> Thanks for your prompt reply. I already created the table in Hbase with
>> >> a
>> >> column family and hadoop/hbase library is available to hadoop. I
>> >> noticed
>> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>> >>
>> >> Please see the below lines captured while running the flume agent:
>> >>
>> >>
>> >>
>> >> >>> flume-ng  agent -n hbase-agent -f
>> >> >>> /home/hadoop/flumeng/hbaseagent.conf
>> >>
>> >> Warning: No configuration directory set! Use --conf <dir> to override.
>> >>
>> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
>> >> access
>> >>
>> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
>> >> classpath
>> >>
>> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
>> >> classpath
>> >>
>> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>> >>
>> >> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>> >> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>> >> org.apache.flume.node.Application -n hbase-agent -f
>> >> /home/hadoop/flumeng/hbaseagent.conf
>> >>
>> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> lifecycle
>> >> supervisor 1
>> >>
>> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting -
>> >> hbase-agent
>> >>
>> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
>> >> manager
>> >> starting
>> >>
>> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting
>> >> lifecycle
>> >> supervisor 10
>> >>
>> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> >> Configuration provider starting
>> >>
>> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1
>> >> Agent:
>> >> hbase-agent
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>> >>
>> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
>> >> configuration contains configuration  for agents: [hbase-agent]
>> >>
>> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> >> Creating channels
>> >>
>> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> >> created channel ch1
>> >>
>> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of
>> >> sink
>> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>> >>
>> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
>> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
>> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>> >> sinkRunners:{sink1=SinkRunner: {
>> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb
>> >> counterGroup:{
>> >> name:null counters:{} } }}
>> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>> >>
>> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
>> >> command:tail -f /home/hadoop/demo.txt
>> >>
>> >>
>> >>
>> >> Screen stuck here....no movement.
>> >>
>> >>
>> >>
>> >> ----------------------------------------
>> >>
>> >> ----------------------------------------
>> >>
>> >> Thanks & Regards,
>> >>
>> >> Ashutosh Sharma
>> >>
>> >> ----------------------------------------
>> >>
>> >>
>> >>
>> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> >> Sent: Thursday, June 21, 2012 5:01 PM
>> >> To: flume-user@incubator.apache.org
>> >> Subject: Re: Hbase-sink behavior
>> >>
>> >>
>> >>
>> >> Hi Ashutosh,
>> >>
>> >>
>> >>
>> >> The sink will not create the table or column family. Make sure you have
>> >> the table and column family. Also please make sure you have
>> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in
>> >> your
>> >> class path).
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> Thanks
>> >>
>> >> Hari
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Hari Shreedharan
>> >>
>> >>
>> >>
>> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>> >>
>> >> Hi,
>> >>
>> >>
>> >>
>> >> I have used and followed the same steps which is mentioned in below
>> >> mails
>> >> to get start with the hbasesink. But agent is not storing any data into
>> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the
>> >> hbase
>> >> information. Even I am able to connect to the hbase server from that
>> >> agent
>> >> machine.
>> >>
>> >>
>> >>
>> >> Now, I am unable to understand and troubleshoot this problem. Seeking
>> >> advice from the community members....
>> >>
>> >>
>> >>
>> >> ----------------------------------------
>> >>
>> >> ----------------------------------------
>> >>
>> >> Thanks & Regards,
>> >>
>> >> Ashutosh Sharma
>> >>
>> >> ----------------------------------------
>> >>
>> >>
>> >>
>> >> -----Original Message-----
>> >>
>> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
>> >>
>> >> Sent: Friday, June 15, 2012 9:02 AM
>> >>
>> >> To: flume-user@incubator.apache.org
>> >>
>> >> Subject: Re: Hbase-sink behavior
>> >>
>> >>
>> >>
>> >> Thank you so much Hari for the valuable response..I'll follow the
>> >> guidelines provided by you.
>> >>
>> >>
>> >>
>> >> Regards,
>> >>
>> >> Mohammad Tariq
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>> >> <hs...@cloudera.com> wrote:
>> >>
>> >> Hi Mohammad,
>> >>
>> >>
>> >>
>> >> My answers are inline.
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Hari Shreedharan
>> >>
>> >>
>> >>
>> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>> >>
>> >>
>> >>
>> >> Hello list,
>> >>
>> >>
>> >>
>> >> I am trying to use hbase-sink to collect data from a local file and
>> >>
>> >> dump it into an Hbase table..But there are a few things I am not able
>> >>
>> >> to understand and need some guidance.
>> >>
>> >>
>> >>
>> >> This is the content of my conf file :
>> >>
>> >>
>> >>
>> >> hbase-agent.sources = tail
>> >>
>> >> hbase-agent..sinks = sink1
>> >>
>> >> hbase-agent.channels = ch1
>> >>
>> >> hbase-agent.sources.tail.type = exec
>> >>
>> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>> >>
>> >> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
>> >>
>> >> org.apache.flume.sink.hbase.HBaseSink
>> >>
>> >> hbase-agent.sinks.sink1.channel = ch1
>> >>
>> >> hbase-agent.sinks.sink1.table = test3
>> >>
>> >> hbase-agent.sinks.sink1.columnFamily = testing
>> >>
>> >> hbase-agent.sinks.sink1.column = foo
>> >>
>> >> hbase-agent.sinks.sink1.serializer =
>> >>
>> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>> >>
>> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>> >>
>> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>> >>
>> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>> >>
>> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>> >>
>> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>> >>
>> >> hbase-agent.channels.ch1.type=memory
>> >>
>> >>
>> >>
>> >> Right now I am taking just some simple text from a file which has
>> >>
>> >> following content -
>> >>
>> >>
>> >>
>> >> value1
>> >>
>> >> value2
>> >>
>> >> value3
>> >>
>> >> value4
>> >>
>> >> value5
>> >>
>> >> value6
>> >>
>> >>
>> >>
>> >> And my Hbase table looks like -
>> >>
>> >>
>> >>
>> >> hbase(main):217:0> scan 'test3'
>> >>
>> >> ROW COLUMN+CELL
>> >>
>> >> 11339716704561 column=testing:col1,
>> >>
>> >> timestamp=1339716707569, value=value1
>> >>
>> >> 11339716704562 column=testing:col1,
>> >>
>> >> timestamp=1339716707571, value=value4
>> >>
>> >> 11339716846594 column=testing:col1,
>> >>
>> >> timestamp=1339716849608, value=value2
>> >>
>> >> 11339716846595 column=testing:col1,
>> >>
>> >> timestamp=1339716849610, value=value1
>> >>
>> >> 11339716846596 column=testing:col1,
>> >>
>> >> timestamp=1339716849611, value=value6
>> >>
>> >> 11339716846597 column=testing:col1,
>> >>
>> >> timestamp=1339716849614, value=value6
>> >>
>> >> 11339716846598 column=testing:col1,
>> >>
>> >> timestamp=1339716849615, value=value5
>> >>
>> >> 11339716846599 column=testing:col1,
>> >>
>> >> timestamp=1339716849615, value=value6
>> >>
>> >> incRow column=testing:col1,
>> >>
>> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>> >>
>> >> 9 row(s) in 0.0580 seconds
>> >>
>> >>
>> >>
>> >> Now I have following questions -
>> >>
>> >>
>> >>
>> >> 1- Why the timestamp value is different from the row key?(I was trying
>> >>
>> >> to make "1+timestamp" as the rowkey)
>> >>
>> >>
>> >>
>> >> The value shown by hbase shell as timestamp is the time at which the
>> >>
>> >> value was inserted into Hbase, while the value inserted by Flume is
>> >>
>> >> the timestamp at which the sink read the event from the channel.
>> >>
>> >> Depending on how long the network and HBase takes, these timestamps
>> >>
>> >> can vary. If you want 1+timestamp as row key then you should configure
>> >> it:
>> >>
>> >>
>> >>
>> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>> >>
>> >> appended as-is to the suffix you choose.
>> >>
>> >>
>> >>
>> >> 2- Although I am not using "incRow", it stills appear in the table
>> >>
>> >> with some value. Why so and what is this value??
>> >>
>> >>
>> >>
>> >> The SimpleHBaseEventSerializer is only an example class. For custom
>> >>
>> >> use cases you can write your own serializer by implementing
>> >>
>> >> HbaseEventSerializer. In this case, you have specified
>> >>
>> >> incrementColumn, which causes an increment on the column specified.
>> >>
>> >> Simply don't specify that config and that row will not appear.
>> >>
>> >>
>> >>
>> >> 3- How can avoid the last row??
>> >>
>> >>
>> >>
>> >> See above.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> I am still in the learning phase so please pardon my ignorance..Many
>> >> thanks.
>> >>
>> >>
>> >>
>> >> No problem. Much of this is documented
>> >>
>> >> here:
>> >>
>> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> Regards,
>> >>
>> >> Mohammad Tariq
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> 문서에
>> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
>> >> 잘못
>> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >>
>> >> This E-mail may contain confidential information and/or copyright
>> >> material. This email is intended for the use of the addressee only. If
>> >> you
>> >> receive this email by mistake, please either delete it without
>> >> reproducing,
>> >> distributing or retaining copies thereof or notify the sender
>> >> immediately.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> 문서에
>> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
>> >> 잘못
>> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> This E-mail may contain confidential information and/or copyright
>> >> material. This email is intended for the use of the addressee only. If
>> >> you
>> >> receive this email by mistake, please either delete it without
>> >> reproducing,
>> >> distributing or retaining copies thereof or notify the sender
>> >> immediately.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> 문서에
>> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
>> >> 잘못
>> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> This E-mail may contain confidential information and/or copyright
>> >> material. This email is intended for the use of the addressee only. If
>> >> you
>> >> receive this email by mistake, please either delete it without
>> >> reproducing,
>> >> distributing or retaining copies thereof or notify the sender
>> >> immediately.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본
>> >> 문서에
>> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
>> >> 잘못
>> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> >> This E-mail may contain confidential information and/or copyright
>> >> material. This email is intended for the use of the addressee only. If
>> >> you
>> >> receive this email by mistake, please either delete it without
>> >> reproducing,
>> >> distributing or retaining copies thereof or notify the sender
>> >> immediately.
>> >
>> >
>
>

Re: Hbase-sink behavior

Posted by Will McQueen <wi...@cloudera.com>.
Hi Mohammad,

In your config file, I think you need to remove this line:

>>hbase-agent.sinks.sink1.serializer.keyType = timestamp

I don't see any 'keyType' property in SimpleHbaseEventSerializer.java
(although there is a keyType var that stores the value of the 'suffix'
prop).

Cheers,
Will

On Thu, Jun 21, 2012 at 3:52 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Rahul,
>
>          This normally happens when there is some problem in the
> configuration file.Create a file called hbase-agent inside your
> FLUME_HOME/conf directory and copy this content into it:
> hbase-agent.sources = tail
> hbase-agent.sinks = sink1
> hbase-agent.channels = ch1
>
> hbase-agent.sources.tail.type = exec
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> hbase-agent.sources.tail.channels = ch1
>
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel = ch1
> hbase-agent.sinks.sink1.table = demo
> hbase-agent.sinks.sink1.columnFamily = cf
>
> hbase-agent.sinks.sink1.serializer =
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
> hbase-agent.channels.ch1.type=memory
>
> Then start the agent and see if it works for you. It worked for me.
>
> Regards,
>    Mohammad Tariq
>
>
> On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com> wrote:
> > Hi Sharma,
> >
> > So I assume that your command looks something like this:
> >      flume-ng agent -n hbase-agent -f
> /home/hadoop/flumeng/hbaseagent.conf
> > -c /etc/flume-ng/conf
> >
> > ...?
> >
> > Hari, I saw your comment:
> >
> >>>I am not sure if HBase changed their wire protocol between these
> versions.
> > Do you have any other advice about troubleshooting a possible hbase
> protocol
> > mismatch issue?
> >
> > Cheers,
> > Will
> >
> >
> >
> > On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀) <
> sharma.ashutosh@kt.com>
> > wrote:
> >>
> >> Hi Will,
> >>
> >>
> >>
> >> I installed flume as part of CDH3u4 version 1.1 using yum install
> >> flume-ng. One more point, I am using flume-ng hbase sink downloaded
> from:
> >>
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
> >>
> >>
> >>
> >> Now, I ran the agent with -conf parameter with updated
> log4j.properties. I
> >> don't see any error in the log. Please see the below from the log file:
> >>
> >>
> >>
> >> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
> >> lifecycle supervisor 1
> >>
> >> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
> >> hbase-agent
> >>
> >> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> manager starting
> >>
> >> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
> >> lifecycle supervisor 9
> >>
> >> 2012-06-21 18:25:08,146 INFO
> >> properties.PropertiesFileConfigurationProvider: Configuration provider
> >> starting
> >>
> >> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager:
> Node
> >> manager started
> >>
> >> 2012-06-21 18:25:08,148 DEBUG
> >> properties.PropertiesFileConfigurationProvider: Configuration provider
> >> started
> >>
> >> 2012-06-21 18:25:08,149 DEBUG
> >> properties.PropertiesFileConfigurationProvider: Checking
> >> file:/home/hadoop/flumeng/hbaseagent.conf for changes
> >>
> >> 2012-06-21 18:25:08,149 INFO
> >> properties.PropertiesFileConfigurationProvider: Reloading configuration
> >> file:/home/hadoop/flumeng/hbaseagent.conf
> >>
> >> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks: sink1
> >> Agent: hbase-agent
> >>
> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context
> for
> >> sink1: serializer.rowPrefix
> >>
> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting
> validation
> >> of configuration for agent: hbase-agent, initial-configuration:
> >> AgentConfiguration[hbase-agent]
> >>
> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> >> channels=ch1, type=exec} }}
> >>
> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >>
> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> serializer.keyType=timestamp,
> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> >> batchSize=1, columnFamily=cf1, table=test,
> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> serializer.suffix=timestamp} }}
> >>
> >>
> >>
> >> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel
> ch1
> >>
> >> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink:
> >> sink1 using OTHER
> >>
> >> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation
> >> configuration for hbase-agent
> >>
> >> AgentConfiguration created without Configuration stubs for which only
> >> basic syntactical validation was performed[hbase-agent]
> >>
> >> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> >> channels=ch1, type=exec} }}
> >>
> >> CHANNELS: {ch1={ parameters:{type=memory} }}
> >>
> >> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> >> serializer.keyType=timestamp,
> >> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> >> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> >> batchSize=1, columnFamily=cf1, table=test,
> >> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> >> serializer.suffix=timestamp} }}
> >>
> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
> >>
> >>
> >>
> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
> >>
> >>
> >>
> >> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
> >>
> >>
> >>
> >> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation
> >> flume configuration contains configuration  for agents: [hbase-agent]
> >>
> >> 2012-06-21 18:25:08,171 INFO
> >> properties.PropertiesFileConfigurationProvider: Creating channels
> >>
> >> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating
> >> instance of channel ch1 type memory
> >>
> >> 2012-06-21 18:25:08,175 INFO
> >> properties.PropertiesFileConfigurationProvider: created channel ch1
> >>
> >> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
> >> instance of source tail, type exec
> >>
> >> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance
> of
> >> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >>
> >> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> >> org.apache.flume.sink.hbase.HBaseSink is a custom type
> >>
> >> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> >> source:org.apache.flume.source.ExecSource@1fd0fafc }}
> >> sinkRunners:{sink1=SinkRunner: {
> >> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5counterGroup:{
> >> name:null counters:{} } }}
> >> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
> >>
> >> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting
> with
> >> command:tail -f /home/hadoop/demo.txt
> >>
> >> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
> >>
> >>
> >>
> >> Output of the which Flume-ng is:
> >>
> >> /usr/bin/flume-ng
> >>
> >>
> >>
> >>
> >>
> >> ----------------------------------------
> >>
> >> ----------------------------------------
> >>
> >> Thanks & Regards,
> >>
> >> Ashutosh Sharma
> >>
> >> Cell: 010-7300-0150
> >>
> >> Email: sharma.ashutosh@kt.com
> >>
> >> ----------------------------------------
> >>
> >>
> >>
> >> From: Will McQueen [mailto:will@cloudera.com]
> >> Sent: Thursday, June 21, 2012 6:07 PM
> >>
> >>
> >> To: flume-user@incubator.apache.org
> >> Subject: Re: Hbase-sink behavior
> >>
> >>
> >>
> >> Hi Sharma,
> >>
> >>
> >>
> >> Could you please describe how you installed flume? Also, I see you're
> >> getting this warning:
> >>
> >> >> Warning: No configuration directory set! Use --conf <dir> to
> override.
> >>
> >>
> >>
> >> The log4j.properties that flume provides is stored in the conf dir. If
> you
> >> specify the flume conf dir, flume can pick it up. So for
> troubleshooting you
> >> can try:
> >>
> >>
> >> 1) modifying the log4j.properties within flume's conf dir so that the
> top
> >> reads:
> >> #flume.root.logger=DEBUG,console
> >> flume.root.logger=DEBUG,LOGFILE
> >> flume.log.dir=.
> >> flume.log.file=flume.log
> >>
> >> 2) Run the flume agent while specifying the flume conf dir (--conf
> <dir>)
> >>
> >> 3) What's the output of 'which flume-ng'?
> >>
> >> Cheers,
> >> Will
> >>
> >> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
> >> <sh...@kt.com> wrote:
> >>
> >> Hi Hari,
> >>
> >>
> >>
> >> I checked, agent is successfully tailing the file which I mentioned.
> Yes,
> >> you are right, agent has started properly without any error. Because
> there
> >> is no further movement, so it's hard for me to identify the issue. I
> also
> >> used tail -F also, but no success.
> >>
> >> Can you suggest me some technique to troubleshoot it, so I could
> identify
> >> the issue and resolve the same. Does flume record some log anywhere?
> >>
> >>
> >>
> >> ----------------------------------------
> >>
> >> ----------------------------------------
> >>
> >> Thanks & Regards,
> >>
> >> Ashutosh Sharma
> >>
> >> Cell: 010-7300-0150
> >>
> >> Email: sharma.ashutosh@kt.com
> >>
> >> ----------------------------------------
> >>
> >>
> >>
> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> Sent: Thursday, June 21, 2012 5:25 PM
> >>
> >>
> >> To: flume-user@incubator.apache.org
> >> Subject: Re: Hbase-sink behavior
> >>
> >>
> >>
> >> I am not sure if HBase changed their wire protocol between these
> versions.
> >> Looks like your agent has started properly. Are you sure data is being
> >> written into the file being tailed? I suggest using tail -F. The log
> being
> >> stuck here is ok, that is probably because nothing specific is
> required(or
> >> your log file rotated).
> >>
> >>
> >>
> >> Thanks
> >>
> >> Hari
> >>
> >>
> >>
> >> --
> >>
> >> Hari Shreedharan
> >>
> >>
> >>
> >> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >>
> >> Hi Hari,
> >>
> >>
> >>
> >> Thanks for your prompt reply. I already created the table in Hbase with
> a
> >> column family and hadoop/hbase library is available to hadoop. I noticed
> >> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
> >>
> >> Please see the below lines captured while running the flume agent:
> >>
> >>
> >>
> >> >>> flume-ng  agent -n hbase-agent -f
> /home/hadoop/flumeng/hbaseagent.conf
> >>
> >> Warning: No configuration directory set! Use --conf <dir> to override.
> >>
> >> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
> >> access
> >>
> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
> >> classpath
> >>
> >> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
> >> classpath
> >>
> >> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> >>
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> >> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >> org.apache.flume.node.Application -n hbase-agent -f
> >> /home/hadoop/flumeng/hbaseagent.conf
> >>
> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> >> supervisor 1
> >>
> >> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
> >>
> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> manager
> >> starting
> >>
> >> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> >> supervisor 10
> >>
> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> >> Configuration provider starting
> >>
> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> >> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1
> Agent:
> >> hbase-agent
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
> >>
> >> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
> >> configuration contains configuration  for agents: [hbase-agent]
> >>
> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> >> Creating channels
> >>
> >> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> >> created channel ch1
> >>
> >> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of
> sink
> >> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
> >>
> >> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> >> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> >> source:org.apache.flume.source.ExecSource@1ed0af9b }}
> >> sinkRunners:{sink1=SinkRunner: {
> >> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8ebcounterGroup:{
> >> name:null counters:{} } }}
> >> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
> >>
> >> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
> >> command:tail -f /home/hadoop/demo.txt
> >>
> >>
> >>
> >> Screen stuck here....no movement.
> >>
> >>
> >>
> >> ----------------------------------------
> >>
> >> ----------------------------------------
> >>
> >> Thanks & Regards,
> >>
> >> Ashutosh Sharma
> >>
> >> ----------------------------------------
> >>
> >>
> >>
> >> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> >> Sent: Thursday, June 21, 2012 5:01 PM
> >> To: flume-user@incubator.apache.org
> >> Subject: Re: Hbase-sink behavior
> >>
> >>
> >>
> >> Hi Ashutosh,
> >>
> >>
> >>
> >> The sink will not create the table or column family. Make sure you have
> >> the table and column family. Also please make sure you have
> >> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in
> your
> >> class path).
> >>
> >>
> >>
> >>
> >>
> >> Thanks
> >>
> >> Hari
> >>
> >>
> >>
> >> --
> >>
> >> Hari Shreedharan
> >>
> >>
> >>
> >> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >>
> >> Hi,
> >>
> >>
> >>
> >> I have used and followed the same steps which is mentioned in below
> mails
> >> to get start with the hbasesink. But agent is not storing any data into
> >> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the
> hbase
> >> information. Even I am able to connect to the hbase server from that
> agent
> >> machine.
> >>
> >>
> >>
> >> Now, I am unable to understand and troubleshoot this problem. Seeking
> >> advice from the community members....
> >>
> >>
> >>
> >> ----------------------------------------
> >>
> >> ----------------------------------------
> >>
> >> Thanks & Regards,
> >>
> >> Ashutosh Sharma
> >>
> >> ----------------------------------------
> >>
> >>
> >>
> >> -----Original Message-----
> >>
> >> From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >>
> >> Sent: Friday, June 15, 2012 9:02 AM
> >>
> >> To: flume-user@incubator.apache.org
> >>
> >> Subject: Re: Hbase-sink behavior
> >>
> >>
> >>
> >> Thank you so much Hari for the valuable response..I'll follow the
> >> guidelines provided by you.
> >>
> >>
> >>
> >> Regards,
> >>
> >> Mohammad Tariq
> >>
> >>
> >>
> >>
> >>
> >> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> >> <hs...@cloudera.com> wrote:
> >>
> >> Hi Mohammad,
> >>
> >>
> >>
> >> My answers are inline.
> >>
> >>
> >>
> >> --
> >>
> >> Hari Shreedharan
> >>
> >>
> >>
> >> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> >>
> >>
> >>
> >> Hello list,
> >>
> >>
> >>
> >> I am trying to use hbase-sink to collect data from a local file and
> >>
> >> dump it into an Hbase table..But there are a few things I am not able
> >>
> >> to understand and need some guidance.
> >>
> >>
> >>
> >> This is the content of my conf file :
> >>
> >>
> >>
> >> hbase-agent.sources = tail
> >>
> >> hbase-agent..sinks = sink1
> >>
> >> hbase-agent.channels = ch1
> >>
> >> hbase-agent.sources.tail.type = exec
> >>
> >> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >>
> >> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> >>
> >> org.apache.flume.sink.hbase.HBaseSink
> >>
> >> hbase-agent.sinks.sink1.channel = ch1
> >>
> >> hbase-agent.sinks.sink1.table = test3
> >>
> >> hbase-agent.sinks.sink1.columnFamily = testing
> >>
> >> hbase-agent.sinks.sink1.column = foo
> >>
> >> hbase-agent.sinks.sink1.serializer =
> >>
> >> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >>
> >> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >>
> >> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> >>
> >> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >>
> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >>
> >> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >>
> >> hbase-agent.channels.ch1.type=memory
> >>
> >>
> >>
> >> Right now I am taking just some simple text from a file which has
> >>
> >> following content -
> >>
> >>
> >>
> >> value1
> >>
> >> value2
> >>
> >> value3
> >>
> >> value4
> >>
> >> value5
> >>
> >> value6
> >>
> >>
> >>
> >> And my Hbase table looks like -
> >>
> >>
> >>
> >> hbase(main):217:0> scan 'test3'
> >>
> >> ROW COLUMN+CELL
> >>
> >> 11339716704561 column=testing:col1,
> >>
> >> timestamp=1339716707569, value=value1
> >>
> >> 11339716704562 column=testing:col1,
> >>
> >> timestamp=1339716707571, value=value4
> >>
> >> 11339716846594 column=testing:col1,
> >>
> >> timestamp=1339716849608, value=value2
> >>
> >> 11339716846595 column=testing:col1,
> >>
> >> timestamp=1339716849610, value=value1
> >>
> >> 11339716846596 column=testing:col1,
> >>
> >> timestamp=1339716849611, value=value6
> >>
> >> 11339716846597 column=testing:col1,
> >>
> >> timestamp=1339716849614, value=value6
> >>
> >> 11339716846598 column=testing:col1,
> >>
> >> timestamp=1339716849615, value=value5
> >>
> >> 11339716846599 column=testing:col1,
> >>
> >> timestamp=1339716849615, value=value6
> >>
> >> incRow column=testing:col1,
> >>
> >> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> >>
> >> 9 row(s) in 0.0580 seconds
> >>
> >>
> >>
> >> Now I have following questions -
> >>
> >>
> >>
> >> 1- Why the timestamp value is different from the row key?(I was trying
> >>
> >> to make "1+timestamp" as the rowkey)
> >>
> >>
> >>
> >> The value shown by hbase shell as timestamp is the time at which the
> >>
> >> value was inserted into Hbase, while the value inserted by Flume is
> >>
> >> the timestamp at which the sink read the event from the channel.
> >>
> >> Depending on how long the network and HBase takes, these timestamps
> >>
> >> can vary. If you want 1+timestamp as row key then you should configure
> it:
> >>
> >>
> >>
> >> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> >>
> >> appended as-is to the suffix you choose.
> >>
> >>
> >>
> >> 2- Although I am not using "incRow", it stills appear in the table
> >>
> >> with some value. Why so and what is this value??
> >>
> >>
> >>
> >> The SimpleHBaseEventSerializer is only an example class. For custom
> >>
> >> use cases you can write your own serializer by implementing
> >>
> >> HbaseEventSerializer. In this case, you have specified
> >>
> >> incrementColumn, which causes an increment on the column specified.
> >>
> >> Simply don't specify that config and that row will not appear.
> >>
> >>
> >>
> >> 3- How can avoid the last row??
> >>
> >>
> >>
> >> See above.
> >>
> >>
> >>
> >>
> >>
> >> I am still in the learning phase so please pardon my ignorance..Many
> >> thanks.
> >>
> >>
> >>
> >> No problem. Much of this is documented
> >>
> >> here:
> >>
> >> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Regards,
> >>
> >> Mohammad Tariq
> >>
> >>
> >>
> >>
> >>
> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
> 잘못
> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >>
> >> This E-mail may contain confidential information and/or copyright
> >> material. This email is intended for the use of the addressee only. If
> you
> >> receive this email by mistake, please either delete it without
> reproducing,
> >> distributing or retaining copies thereof or notify the sender
> immediately.
> >>
> >>
> >>
> >>
> >>
> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
> 잘못
> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> This E-mail may contain confidential information and/or copyright
> >> material. This email is intended for the use of the addressee only. If
> you
> >> receive this email by mistake, please either delete it without
> reproducing,
> >> distributing or retaining copies thereof or notify the sender
> immediately.
> >>
> >>
> >>
> >>
> >>
> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
> 잘못
> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> This E-mail may contain confidential information and/or copyright
> >> material. This email is intended for the use of the addressee only. If
> you
> >> receive this email by mistake, please either delete it without
> reproducing,
> >> distributing or retaining copies thereof or notify the sender
> immediately.
> >>
> >>
> >>
> >>
> >>
> >> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> >> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이
> 잘못
> >> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >> This E-mail may contain confidential information and/or copyright
> >> material. This email is intended for the use of the addressee only. If
> you
> >> receive this email by mistake, please either delete it without
> reproducing,
> >> distributing or retaining copies thereof or notify the sender
> immediately.
> >
> >
>

Re: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Rahul,

          This normally happens when there is some problem in the
configuration file.Create a file called hbase-agent inside your
FLUME_HOME/conf directory and copy this content into it:
hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
hbase-agent.sources.tail.channels = ch1

hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = demo
hbase-agent.sinks.sink1.columnFamily = cf

hbase-agent.sinks.sink1.serializer =
org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix = 1
hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory

Then start the agent and see if it works for you. It worked for me.

Regards,
    Mohammad Tariq


On Thu, Jun 21, 2012 at 4:14 PM, Will McQueen <wi...@cloudera.com> wrote:
> Hi Sharma,
>
> So I assume that your command looks something like this:
>      flume-ng agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
> -c /etc/flume-ng/conf
>
> ...?
>
> Hari, I saw your comment:
>
>>>I am not sure if HBase changed their wire protocol between these versions.
> Do you have any other advice about troubleshooting a possible hbase protocol
> mismatch issue?
>
> Cheers,
> Will
>
>
>
> On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀) <sh...@kt.com>
> wrote:
>>
>> Hi Will,
>>
>>
>>
>> I installed flume as part of CDH3u4 version 1.1 using yum install
>> flume-ng. One more point, I am using flume-ng hbase sink downloaded from:
>> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>>
>>
>>
>> Now, I ran the agent with -conf parameter with updated log4j.properties. I
>> don't see any error in the log. Please see the below from the log file:
>>
>>
>>
>> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
>> lifecycle supervisor 1
>>
>> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
>> hbase-agent
>>
>> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager: Node
>> manager starting
>>
>> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
>> lifecycle supervisor 9
>>
>> 2012-06-21 18:25:08,146 INFO
>> properties.PropertiesFileConfigurationProvider: Configuration provider
>> starting
>>
>> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager: Node
>> manager started
>>
>> 2012-06-21 18:25:08,148 DEBUG
>> properties.PropertiesFileConfigurationProvider: Configuration provider
>> started
>>
>> 2012-06-21 18:25:08,149 DEBUG
>> properties.PropertiesFileConfigurationProvider: Checking
>> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>>
>> 2012-06-21 18:25:08,149 INFO
>> properties.PropertiesFileConfigurationProvider: Reloading configuration
>> file:/home/hadoop/flumeng/hbaseagent.conf
>>
>> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks: sink1
>> Agent: hbase-agent
>>
>> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context for
>> sink1: serializer.rowPrefix
>>
>> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting validation
>> of configuration for agent: hbase-agent, initial-configuration:
>> AgentConfiguration[hbase-agent]
>>
>> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> channels=ch1, type=exec} }}
>>
>> CHANNELS: {ch1={ parameters:{type=memory} }}
>>
>> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> serializer.keyType=timestamp,
>> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> batchSize=1, columnFamily=cf1, table=test,
>> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> serializer.suffix=timestamp} }}
>>
>>
>>
>> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel ch1
>>
>> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink:
>> sink1 using OTHER
>>
>> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation
>> configuration for hbase-agent
>>
>> AgentConfiguration created without Configuration stubs for which only
>> basic syntactical validation was performed[hbase-agent]
>>
>> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
>> channels=ch1, type=exec} }}
>>
>> CHANNELS: {ch1={ parameters:{type=memory} }}
>>
>> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
>> serializer.keyType=timestamp,
>> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
>> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
>> batchSize=1, columnFamily=cf1, table=test,
>> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
>> serializer.suffix=timestamp} }}
>>
>> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
>>
>>
>>
>> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
>>
>>
>>
>> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
>>
>>
>>
>> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation
>> flume configuration contains configuration  for agents: [hbase-agent]
>>
>> 2012-06-21 18:25:08,171 INFO
>> properties.PropertiesFileConfigurationProvider: Creating channels
>>
>> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating
>> instance of channel ch1 type memory
>>
>> 2012-06-21 18:25:08,175 INFO
>> properties.PropertiesFileConfigurationProvider: created channel ch1
>>
>> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
>> instance of source tail, type exec
>>
>> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance of
>> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>>
>> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
>> org.apache.flume.sink.hbase.HBaseSink is a custom type
>>
>> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager: Node
>> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
>> source:org.apache.flume.source.ExecSource@1fd0fafc }}
>> sinkRunners:{sink1=SinkRunner: {
>> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5 counterGroup:{
>> name:null counters:{} } }}
>> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>>
>> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting with
>> command:tail -f /home/hadoop/demo.txt
>>
>> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
>>
>>
>>
>> Output of the which Flume-ng is:
>>
>> /usr/bin/flume-ng
>>
>>
>>
>>
>>
>> ----------------------------------------
>>
>> ----------------------------------------
>>
>> Thanks & Regards,
>>
>> Ashutosh Sharma
>>
>> Cell: 010-7300-0150
>>
>> Email: sharma.ashutosh@kt.com
>>
>> ----------------------------------------
>>
>>
>>
>> From: Will McQueen [mailto:will@cloudera.com]
>> Sent: Thursday, June 21, 2012 6:07 PM
>>
>>
>> To: flume-user@incubator.apache.org
>> Subject: Re: Hbase-sink behavior
>>
>>
>>
>> Hi Sharma,
>>
>>
>>
>> Could you please describe how you installed flume? Also, I see you're
>> getting this warning:
>>
>> >> Warning: No configuration directory set! Use --conf <dir> to override.
>>
>>
>>
>> The log4j.properties that flume provides is stored in the conf dir. If you
>> specify the flume conf dir, flume can pick it up. So for troubleshooting you
>> can try:
>>
>>
>> 1) modifying the log4j.properties within flume's conf dir so that the top
>> reads:
>> #flume.root.logger=DEBUG,console
>> flume.root.logger=DEBUG,LOGFILE
>> flume.log.dir=.
>> flume.log.file=flume.log
>>
>> 2) Run the flume agent while specifying the flume conf dir (--conf <dir>)
>>
>> 3) What's the output of 'which flume-ng'?
>>
>> Cheers,
>> Will
>>
>> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
>> <sh...@kt.com> wrote:
>>
>> Hi Hari,
>>
>>
>>
>> I checked, agent is successfully tailing the file which I mentioned. Yes,
>> you are right, agent has started properly without any error. Because there
>> is no further movement, so it's hard for me to identify the issue. I also
>> used tail -F also, but no success.
>>
>> Can you suggest me some technique to troubleshoot it, so I could identify
>> the issue and resolve the same. Does flume record some log anywhere?
>>
>>
>>
>> ----------------------------------------
>>
>> ----------------------------------------
>>
>> Thanks & Regards,
>>
>> Ashutosh Sharma
>>
>> Cell: 010-7300-0150
>>
>> Email: sharma.ashutosh@kt.com
>>
>> ----------------------------------------
>>
>>
>>
>> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> Sent: Thursday, June 21, 2012 5:25 PM
>>
>>
>> To: flume-user@incubator.apache.org
>> Subject: Re: Hbase-sink behavior
>>
>>
>>
>> I am not sure if HBase changed their wire protocol between these versions.
>> Looks like your agent has started properly. Are you sure data is being
>> written into the file being tailed? I suggest using tail -F. The log being
>> stuck here is ok, that is probably because nothing specific is required(or
>> your log file rotated).
>>
>>
>>
>> Thanks
>>
>> Hari
>>
>>
>>
>> --
>>
>> Hari Shreedharan
>>
>>
>>
>> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>>
>> Hi Hari,
>>
>>
>>
>> Thanks for your prompt reply. I already created the table in Hbase with a
>> column family and hadoop/hbase library is available to hadoop. I noticed
>> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>>
>> Please see the below lines captured while running the flume agent:
>>
>>
>>
>> >>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
>>
>> Warning: No configuration directory set! Use --conf <dir> to override.
>>
>> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
>> access
>>
>> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from
>> classpath
>>
>> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
>> classpath
>>
>> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
>> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
>> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
>> org.apache.flume.node.Application -n hbase-agent -f
>> /home/hadoop/flumeng/hbaseagent.conf
>>
>> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
>> supervisor 1
>>
>> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
>>
>> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager
>> starting
>>
>> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
>> supervisor 10
>>
>> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> Configuration provider starting
>>
>> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent:
>> hbase-agent
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>>
>> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
>> configuration contains configuration  for agents: [hbase-agent]
>>
>> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> Creating channels
>>
>> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
>> created channel ch1
>>
>> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink
>> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>>
>> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
>> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
>> source:org.apache.flume.source.ExecSource@1ed0af9b }}
>> sinkRunners:{sink1=SinkRunner: {
>> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{
>> name:null counters:{} } }}
>> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>>
>> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
>> command:tail -f /home/hadoop/demo.txt
>>
>>
>>
>> Screen stuck here....no movement.
>>
>>
>>
>> ----------------------------------------
>>
>> ----------------------------------------
>>
>> Thanks & Regards,
>>
>> Ashutosh Sharma
>>
>> ----------------------------------------
>>
>>
>>
>> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
>> Sent: Thursday, June 21, 2012 5:01 PM
>> To: flume-user@incubator.apache.org
>> Subject: Re: Hbase-sink behavior
>>
>>
>>
>> Hi Ashutosh,
>>
>>
>>
>> The sink will not create the table or column family. Make sure you have
>> the table and column family. Also please make sure you have
>> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your
>> class path).
>>
>>
>>
>>
>>
>> Thanks
>>
>> Hari
>>
>>
>>
>> --
>>
>> Hari Shreedharan
>>
>>
>>
>> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>>
>> Hi,
>>
>>
>>
>> I have used and followed the same steps which is mentioned in below mails
>> to get start with the hbasesink. But agent is not storing any data into
>> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase
>> information. Even I am able to connect to the hbase server from that agent
>> machine.
>>
>>
>>
>> Now, I am unable to understand and troubleshoot this problem. Seeking
>> advice from the community members....
>>
>>
>>
>> ----------------------------------------
>>
>> ----------------------------------------
>>
>> Thanks & Regards,
>>
>> Ashutosh Sharma
>>
>> ----------------------------------------
>>
>>
>>
>> -----Original Message-----
>>
>> From: Mohammad Tariq [mailto:dontariq@gmail.com]
>>
>> Sent: Friday, June 15, 2012 9:02 AM
>>
>> To: flume-user@incubator.apache.org
>>
>> Subject: Re: Hbase-sink behavior
>>
>>
>>
>> Thank you so much Hari for the valuable response..I'll follow the
>> guidelines provided by you.
>>
>>
>>
>> Regards,
>>
>> Mohammad Tariq
>>
>>
>>
>>
>>
>> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
>> <hs...@cloudera.com> wrote:
>>
>> Hi Mohammad,
>>
>>
>>
>> My answers are inline.
>>
>>
>>
>> --
>>
>> Hari Shreedharan
>>
>>
>>
>> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>>
>>
>>
>> Hello list,
>>
>>
>>
>> I am trying to use hbase-sink to collect data from a local file and
>>
>> dump it into an Hbase table..But there are a few things I am not able
>>
>> to understand and need some guidance.
>>
>>
>>
>> This is the content of my conf file :
>>
>>
>>
>> hbase-agent.sources = tail
>>
>> hbase-agent..sinks = sink1
>>
>> hbase-agent.channels = ch1
>>
>> hbase-agent.sources.tail.type = exec
>>
>> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>>
>> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
>>
>> org.apache.flume.sink.hbase.HBaseSink
>>
>> hbase-agent.sinks.sink1.channel = ch1
>>
>> hbase-agent.sinks.sink1.table = test3
>>
>> hbase-agent.sinks.sink1.columnFamily = testing
>>
>> hbase-agent.sinks.sink1.column = foo
>>
>> hbase-agent.sinks.sink1.serializer =
>>
>> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>>
>> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>>
>> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>>
>> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>>
>> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>>
>> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>>
>> hbase-agent.channels.ch1.type=memory
>>
>>
>>
>> Right now I am taking just some simple text from a file which has
>>
>> following content -
>>
>>
>>
>> value1
>>
>> value2
>>
>> value3
>>
>> value4
>>
>> value5
>>
>> value6
>>
>>
>>
>> And my Hbase table looks like -
>>
>>
>>
>> hbase(main):217:0> scan 'test3'
>>
>> ROW COLUMN+CELL
>>
>> 11339716704561 column=testing:col1,
>>
>> timestamp=1339716707569, value=value1
>>
>> 11339716704562 column=testing:col1,
>>
>> timestamp=1339716707571, value=value4
>>
>> 11339716846594 column=testing:col1,
>>
>> timestamp=1339716849608, value=value2
>>
>> 11339716846595 column=testing:col1,
>>
>> timestamp=1339716849610, value=value1
>>
>> 11339716846596 column=testing:col1,
>>
>> timestamp=1339716849611, value=value6
>>
>> 11339716846597 column=testing:col1,
>>
>> timestamp=1339716849614, value=value6
>>
>> 11339716846598 column=testing:col1,
>>
>> timestamp=1339716849615, value=value5
>>
>> 11339716846599 column=testing:col1,
>>
>> timestamp=1339716849615, value=value6
>>
>> incRow column=testing:col1,
>>
>> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>>
>> 9 row(s) in 0.0580 seconds
>>
>>
>>
>> Now I have following questions -
>>
>>
>>
>> 1- Why the timestamp value is different from the row key?(I was trying
>>
>> to make "1+timestamp" as the rowkey)
>>
>>
>>
>> The value shown by hbase shell as timestamp is the time at which the
>>
>> value was inserted into Hbase, while the value inserted by Flume is
>>
>> the timestamp at which the sink read the event from the channel.
>>
>> Depending on how long the network and HBase takes, these timestamps
>>
>> can vary. If you want 1+timestamp as row key then you should configure it:
>>
>>
>>
>> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>>
>> appended as-is to the suffix you choose.
>>
>>
>>
>> 2- Although I am not using "incRow", it stills appear in the table
>>
>> with some value. Why so and what is this value??
>>
>>
>>
>> The SimpleHBaseEventSerializer is only an example class. For custom
>>
>> use cases you can write your own serializer by implementing
>>
>> HbaseEventSerializer. In this case, you have specified
>>
>> incrementColumn, which causes an increment on the column specified.
>>
>> Simply don't specify that config and that row will not appear.
>>
>>
>>
>> 3- How can avoid the last row??
>>
>>
>>
>> See above.
>>
>>
>>
>>
>>
>> I am still in the learning phase so please pardon my ignorance..Many
>> thanks.
>>
>>
>>
>> No problem. Much of this is documented
>>
>> here:
>>
>> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>>
>>
>>
>>
>>
>>
>>
>> Regards,
>>
>> Mohammad Tariq
>>
>>
>>
>>
>>
>> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
>> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
>> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>>
>> This E-mail may contain confidential information and/or copyright
>> material. This email is intended for the use of the addressee only. If you
>> receive this email by mistake, please either delete it without reproducing,
>> distributing or retaining copies thereof or notify the sender immediately.
>>
>>
>>
>>
>>
>> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
>> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
>> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> This E-mail may contain confidential information and/or copyright
>> material. This email is intended for the use of the addressee only. If you
>> receive this email by mistake, please either delete it without reproducing,
>> distributing or retaining copies thereof or notify the sender immediately.
>>
>>
>>
>>
>>
>> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
>> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
>> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> This E-mail may contain confidential information and/or copyright
>> material. This email is intended for the use of the addressee only. If you
>> receive this email by mistake, please either delete it without reproducing,
>> distributing or retaining copies thereof or notify the sender immediately.
>>
>>
>>
>>
>>
>> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
>> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
>> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>> This E-mail may contain confidential information and/or copyright
>> material. This email is intended for the use of the addressee only. If you
>> receive this email by mistake, please either delete it without reproducing,
>> distributing or retaining copies thereof or notify the sender immediately.
>
>

Re: Hbase-sink behavior

Posted by Will McQueen <wi...@cloudera.com>.
Hi Sharma,

So I assume that your command looks something like this:
     flume-ng agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
-c /etc/flume-ng/conf

...?

Hari, I saw your comment:
>>I am not sure if HBase changed their wire protocol between these versions.
Do you have any other advice about troubleshooting a possible hbase
protocol mismatch issue?

Cheers,
Will


On Thu, Jun 21, 2012 at 2:35 AM, ashutosh(오픈플랫폼개발팀)
<sh...@kt.com>wrote:

>  Hi Will,
>
>
>
> I installed flume as part of CDH3u4 version 1.1 using yum install
> flume-ng. One more point, I am using flume-ng hbase sink downloaded from:
> https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar
>
>
>
> Now, I ran the agent with -conf parameter with updated log4j.properties. I
> don't see any error in the log. Please see the below from the log file:
>
>
>
> 2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting
> lifecycle supervisor 1
>
> 2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting -
> hbase-agent
>
> 2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager: Node
> manager starting
>
> 2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting
> lifecycle supervisor 9
>
> 2012-06-21 18:25:08,146 INFO
> properties.PropertiesFileConfigurationProvider: Configuration provider
> starting
>
> 2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager: Node
> manager started
>
> 2012-06-21 18:25:08,148 DEBUG
> properties.PropertiesFileConfigurationProvider: Configuration provider
> started
>
> 2012-06-21 18:25:08,149 DEBUG
> properties.PropertiesFileConfigurationProvider: Checking
> file:/home/hadoop/flumeng/hbaseagent.conf for changes
>
> 2012-06-21 18:25:08,149 INFO
> properties.PropertiesFileConfigurationProvider: Reloading configuration
> file:/home/hadoop/flumeng/hbaseagent.conf
>
> 2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks: sink1
> Agent: hbase-agent
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context for
> sink1: serializer.rowPrefix
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
>
> 2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting validation
> of configuration for agent: hbase-agent, initial-configuration:
> AgentConfiguration[hbase-agent]
>
> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> channels=ch1, type=exec} }}
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> serializer.keyType=timestamp,
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> batchSize=1, columnFamily=cf1, table=test,
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> serializer.suffix=timestamp} }}
>
>
>
> 2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel ch1
>
> 2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink:
> sink1 using OTHER
>
> 2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation
> configuration for hbase-agent
>
> AgentConfiguration created without Configuration stubs for which only
> basic syntactical validation was performed[hbase-agent]
>
> SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt,
> channels=ch1, type=exec} }}
>
> CHANNELS: {ch1={ parameters:{type=memory} }}
>
> SINKS: {sink1={ parameters:{serializer.payloadColumn=col1,
> serializer.keyType=timestamp,
> serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer,
> serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1,
> batchSize=1, columnFamily=cf1, table=test,
> type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1,
> serializer.suffix=timestamp} }}
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1
>
>
>
> 2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail
>
>
>
> 2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation
> flume configuration contains configuration  for agents: [hbase-agent]
>
> 2012-06-21 18:25:08,171 INFO
> properties.PropertiesFileConfigurationProvider: Creating channels
>
> 2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating
> instance of channel ch1 type memory
>
> 2012-06-21 18:25:08,175 INFO
> properties.PropertiesFileConfigurationProvider: created channel ch1
>
> 2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating
> instance of source tail, type exec
>
> 2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance of
> sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
> 2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type
> org.apache.flume.sink.hbase.HBaseSink is a custom type
>
> 2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager: Node
> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> source:org.apache.flume.source.ExecSource@1fd0fafc }}
> sinkRunners:{sink1=SinkRunner: {
> policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5 counterGroup:{
> name:null counters:{} } }}
> channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
>
> 2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting with
> command:tail -f /home/hadoop/demo.txt
>
> 2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started
>
>
>
> *Output of the which Flume-ng is:*
>
> /usr/bin/flume-ng
>
>
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> Cell: 010-7300-0150
>
> Email: sharma.ashutosh@kt.com
>
> ----------------------------------------
>
>
>
> *From:* Will McQueen [mailto:will@cloudera.com]
> *Sent:* Thursday, June 21, 2012 6:07 PM
>
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi Sharma,
>
>
> Could you please describe how you installed flume? Also, I see you're
> getting this warning:
>
> >> Warning: No configuration directory set! Use --conf <dir> to override.
>
>
>
> The log4j.properties that flume provides is stored in the conf dir. If you
> specify the flume conf dir, flume can pick it up. So for troubleshooting
> you can try:
>
>
> 1) modifying the log4j.properties within flume's conf dir so that the top
> reads:
> #flume.root.logger=DEBUG,console
> flume.root.logger=DEBUG,LOGFILE
> flume.log.dir=.
> flume.log.file=flume.log
>
> 2) Run the flume agent while specifying the flume conf dir (--conf <dir>)
>
> 3) What's the output of 'which flume-ng'?
>
> Cheers,
> Will
>
> On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀) <
> sharma.ashutosh@kt.com> wrote:
>
> Hi Hari,
>
>
>
> I checked, agent is successfully tailing the file which I mentioned. Yes,
> you are right, agent has started properly without any error. Because there
> is no further movement, so it's hard for me to identify the issue. I also
> used tail -F also, but no success.
>
> Can you suggest me some technique to troubleshoot it, so I could identify
> the issue and resolve the same. Does flume record some log anywhere?
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> Cell: 010-7300-0150
>
> Email: sharma.ashutosh@kt.com
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> *Sent:* Thursday, June 21, 2012 5:25 PM
>
>
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> I am not sure if HBase changed their wire protocol between these versions.
> Looks like your agent has started properly. Are you sure data is being
> written into the file being tailed? I suggest using tail -F. The log being
> stuck here is ok, that is probably because nothing specific is required(or
> your log file rotated).
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>   Hi Hari,
>
>
>
> Thanks for your prompt reply. I already created the table in Hbase with a
> column family and hadoop/hbase library is available to hadoop. I noticed
> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>
> Please see the below lines captured while running the flume agent:
>
>
>
> >>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
>
> Warning: No configuration directory set! Use --conf <dir> to override.
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
> access
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
> classpath
>
> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> org.apache.flume.node.Application -n hbase-agent -f
> /home/hadoop/flumeng/hbaseagent.conf
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 1
>
> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager
> starting
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 10
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Configuration provider starting
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent:
> hbase-agent
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
> configuration contains configuration  for agents: [hbase-agent]
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Creating channels
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> created channel ch1
>
> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink
> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> source:org.apache.flume.source.ExecSource@1ed0af9b }}
> sinkRunners:{sink1=SinkRunner: {
> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{
> name:null counters:{} } }}
> channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }
>
> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
> command:tail -f /home/hadoop/demo.txt
>
>
>
> Screen stuck here....no movement.
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>]
>
> *Sent:* Thursday, June 21, 2012 5:01 PM
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi Ashutosh,
>
>
>
> The sink will not create the table or column family. Make sure you have
> the table and column family. Also please make sure you have
> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your
> class path).
>
>
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>   Hi,
>
>
>
> I have used and followed the same steps which is mentioned in below mails
> to get start with the hbasesink. But agent is not storing any data into
> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase
> information. Even I am able to connect to the hbase server from that agent
> machine.
>
>
>
> Now, I am unable to understand and troubleshoot this problem. Seeking
> advice from the community members....
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> -----Original Message-----
>
> From: Mohammad Tariq [mailto:dontariq@gmail.com <do...@gmail.com>]
>
> Sent: Friday, June 15, 2012 9:02 AM
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
> Thank you so much Hari for the valuable response..I'll follow the
> guidelines provided by you.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <
> hshreedharan@cloudera.com> wrote:
>
>  Hi Mohammad,
>
>
>
> My answers are inline.
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
>
>
> Hello list,
>
>
>
> I am trying to use hbase-sink to collect data from a local file and
>
> dump it into an Hbase table..But there are a few things I am not able
>
> to understand and need some guidance.
>
>
>
> This is the content of my conf file :
>
>
>
> hbase-agent.sources = tail
>
> hbase-agent..sinks = sink1
>
> hbase-agent.channels = ch1
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>
> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
>
> org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = test3
>
> hbase-agent.sinks.sink1.columnFamily = testing
>
> hbase-agent.sinks.sink1.column = foo
>
> hbase-agent.sinks.sink1.serializer =
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
> hbase-agent.channels.ch1.type=memory
>
>
>
> Right now I am taking just some simple text from a file which has
>
> following content -
>
>
>
> value1
>
> value2
>
> value3
>
> value4
>
> value5
>
> value6
>
>
>
> And my Hbase table looks like -
>
>
>
> hbase(main):217:0> scan 'test3'
>
> ROW COLUMN+CELL
>
> 11339716704561 column=testing:col1,
>
> timestamp=1339716707569, value=value1
>
> 11339716704562 column=testing:col1,
>
> timestamp=1339716707571, value=value4
>
> 11339716846594 column=testing:col1,
>
> timestamp=1339716849608, value=value2
>
> 11339716846595 column=testing:col1,
>
> timestamp=1339716849610, value=value1
>
> 11339716846596 column=testing:col1,
>
> timestamp=1339716849611, value=value6
>
> 11339716846597 column=testing:col1,
>
> timestamp=1339716849614, value=value6
>
> 11339716846598 column=testing:col1,
>
> timestamp=1339716849615, value=value5
>
> 11339716846599 column=testing:col1,
>
> timestamp=1339716849615, value=value6
>
> incRow column=testing:col1,
>
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>
> 9 row(s) in 0.0580 seconds
>
>
>
> Now I have following questions -
>
>
>
> 1- Why the timestamp value is different from the row key?(I was trying
>
> to make "1+timestamp" as the rowkey)
>
>
>
> The value shown by hbase shell as timestamp is the time at which the
>
> value was inserted into Hbase, while the value inserted by Flume is
>
> the timestamp at which the sink read the event from the channel.
>
> Depending on how long the network and HBase takes, these timestamps
>
> can vary. If you want 1+timestamp as row key then you should configure it:
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>
> appended as-is to the suffix you choose.
>
>
>
> 2- Although I am not using "incRow", it stills appear in the table
>
> with some value. Why so and what is this value??
>
>
>
> The SimpleHBaseEventSerializer is only an example class. For custom
>
> use cases you can write your own serializer by implementing
>
> HbaseEventSerializer. In this case, you have specified
>
> incrementColumn, which causes an increment on the column specified.
>
> Simply don't specify that config and that row will not appear.
>
>
>
> 3- How can avoid the last row??
>
>
>
> See above.
>
>
>
>
>
> I am still in the learning phase so please pardon my ignorance..Many
> thanks.
>
>
>
> No problem. Much of this is documented
>
> here:
>
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>
>
>
>
>
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>

RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Will,

I installed flume as part of CDH3u4 version 1.1 using yum install flume-ng. One more point, I am using flume-ng hbase sink downloaded from: https://repository.cloudera.com/artifactory/cdh-releases-rcs/org/apache/flume/flume-ng-sinks/flume-ng-hbase-sink/1.1.0-cdh3u5-SNAPSHOT/flume-ng-hbase-sink-1.1.0-cdh3u5-20120620.072350-29.jar

Now, I ran the agent with –conf parameter with updated log4j.properties. I don’t see any error in the log. Please see the below from the log file:

2012-06-21 18:25:08,142 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
2012-06-21 18:25:08,144 INFO node.FlumeNode: Flume node starting - hbase-agent
2012-06-21 18:25:08,146 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
2012-06-21 18:25:08,146 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 9
2012-06-21 18:25:08,146 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
2012-06-21 18:25:08,148 DEBUG nodemanager.DefaultLogicalNodeManager: Node manager started
2012-06-21 18:25:08,148 DEBUG properties.PropertiesFileConfigurationProvider: Configuration provider started
2012-06-21 18:25:08,149 DEBUG properties.PropertiesFileConfigurationProvider: Checking file:/home/hadoop/flumeng/hbaseagent.conf for changes
2012-06-21 18:25:08,149 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
2012-06-21 18:25:08,152 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 DEBUG conf.FlumeConfiguration: Created context for sink1: serializer.rowPrefix
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,153 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 INFO conf.FlumeConfiguration: Processing:sink1
2012-06-21 18:25:08,154 DEBUG conf.FlumeConfiguration: Starting validation of configuration for agent: hbase-agent, initial-configuration: AgentConfiguration[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer.keyType=timestamp, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1, batchSize=1, columnFamily=cf1, table=test, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}

2012-06-21 18:25:08,158 DEBUG conf.FlumeConfiguration: Created channel ch1
2012-06-21 18:25:08,169 DEBUG conf.FlumeConfiguration: Creating sink: sink1 using OTHER
2012-06-21 18:25:08,170 DEBUG conf.FlumeConfiguration: Post validation configuration for hbase-agent
AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[hbase-agent]
SOURCES: {tail={ parameters:{command=tail -f /home/hadoop/demo.txt, channels=ch1, type=exec} }}
CHANNELS: {ch1={ parameters:{type=memory} }}
SINKS: {sink1={ parameters:{serializer.payloadColumn=col1, serializer.keyType=timestamp, serializer=org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, serializer.incrementColumn=col1, column=foo, serializer.rowPrefix=1, batchSize=1, columnFamily=cf1, table=test, type=org.apache.flume.sink.hbase.HBaseSink, channel=ch1, serializer.suffix=timestamp} }}
2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Channels:ch1

2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sinks sink1

2012-06-21 18:25:08,171 DEBUG conf.FlumeConfiguration: Sources tail

2012-06-21 18:25:08,171 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]
2012-06-21 18:25:08,171 INFO properties.PropertiesFileConfigurationProvider: Creating channels
2012-06-21 18:25:08,171 DEBUG channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
2012-06-21 18:25:08,175 INFO properties.PropertiesFileConfigurationProvider: created channel ch1
2012-06-21 18:25:08,175 DEBUG source.DefaultSourceFactory: Creating instance of source tail, type exec
2012-06-21 18:25:08,180 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
2012-06-21 18:25:08,180 DEBUG sink.DefaultSinkFactory: Sink type org.apache.flume.sink.hbase.HBaseSink is a custom type
2012-06-21 18:25:08,298 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1fd0fafc }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@510dc6b5 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@5f70bea5} }
2012-06-21 18:25:08,304 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt
2012-06-21 18:25:08,306 DEBUG source.ExecSource: Exec source started

Output of the which Flume-ng is:
/usr/bin/flume-ng


----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Cell: 010-7300-0150
Email: sharma.ashutosh@kt.com
----------------------------------------

From: Will McQueen [mailto:will@cloudera.com]
Sent: Thursday, June 21, 2012 6:07 PM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Hi Sharma,

Could you please describe how you installed flume? Also, I see you're getting this warning:

>> Warning: No configuration directory set! Use --conf <dir> to override.



The log4j.properties that flume provides is stored in the conf dir. If you specify the flume conf dir, flume can pick it up. So for troubleshooting you can try:

1) modifying the log4j.properties within flume's conf dir so that the top reads:
#flume.root.logger=DEBUG,console
flume.root.logger=DEBUG,LOGFILE
flume.log.dir=.
flume.log.file=flume.log

2) Run the flume agent while specifying the flume conf dir (--conf <dir>)

3) What's the output of 'which flume-ng'?

Cheers,
Will
On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀) <sh...@kt.com>> wrote:
Hi Hari,

I checked, agent is successfully tailing the file which I mentioned. Yes, you are right, agent has started properly without any error. Because there is no further movement, so it’s hard for me to identify the issue. I also used tail –F also, but no success.
Can you suggest me some technique to troubleshoot it, so I could identify the issue and resolve the same. Does flume record some log anywhere?

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Cell: 010-7300-0150
Email: sharma.ashutosh@kt.com<ht...@kt.com>
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com<ma...@cloudera.com>]
Sent: Thursday, June 21, 2012 5:25 PM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior

I am not sure if HBase changed their wire protocol between these versions. Looks like your agent has started properly. Are you sure data is being written into the file being tailed? I suggest using tail -F. The log being stuck here is ok, that is probably because nothing specific is required(or your log file rotated).

Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi Hari,



Thanks for your prompt reply. I already created the table in Hbase with a column family and hadoop/hbase library is available to hadoop. I noticed that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?

Please see the below lines captured while running the flume agent:



>>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

Warning: No configuration directory set! Use --conf <dir> to override.

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath

+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0..20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0..20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4..jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5..12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1

12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 10

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Creating channels

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: created channel ch1

12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1ed0af9b }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume..channel.MemoryChannel@49de17f4} }

12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt



Screen stuck here….no movement.



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:01 PM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi Ashutosh,



The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).





Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi,



I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.



Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members....



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



-----Original Message-----

From: Mohammad Tariq [mailto:dontariq@gmail.com]

Sent: Friday, June 15, 2012 9:02 AM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior



Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.



Regards,

Mohammad Tariq





On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com>> wrote:

Hi Mohammad,



My answers are inline.



--

Hari Shreedharan



On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:



Hello list,



I am trying to use hbase-sink to collect data from a local file and

dump it into an Hbase table..But there are a few things I am not able

to understand and need some guidance.



This is the content of my conf file :



hbase-agent.sources = tail

hbase-agent..sinks = sink1

hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =

org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = test3

hbase-agent.sinks.sink1.columnFamily = testing

hbase-agent.sinks.sink1.column = foo

hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.incrementColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory



Right now I am taking just some simple text from a file which has

following content -



value1

value2

value3

value4

value5

value6



And my Hbase table looks like -



hbase(main):217:0> scan 'test3'

ROW COLUMN+CELL

11339716704561 column=testing:col1,

timestamp=1339716707569, value=value1

11339716704562 column=testing:col1,

timestamp=1339716707571, value=value4

11339716846594 column=testing:col1,

timestamp=1339716849608, value=value2

11339716846595 column=testing:col1,

timestamp=1339716849610, value=value1

11339716846596 column=testing:col1,

timestamp=1339716849611, value=value6

11339716846597 column=testing:col1,

timestamp=1339716849614, value=value6

11339716846598 column=testing:col1,

timestamp=1339716849615, value=value5

11339716846599 column=testing:col1,

timestamp=1339716849615, value=value6

incRow column=testing:col1,

timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C

9 row(s) in 0.0580 seconds



Now I have following questions -



1- Why the timestamp value is different from the row key?(I was trying

to make "1+timestamp" as the rowkey)



The value shown by hbase shell as timestamp is the time at which the

value was inserted into Hbase, while the value inserted by Flume is

the timestamp at which the sink read the event from the channel.

Depending on how long the network and HBase takes, these timestamps

can vary. If you want 1+timestamp as row key then you should configure it:



hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is

appended as-is to the suffix you choose.



2- Although I am not using "incRow", it stills appear in the table

with some value. Why so and what is this value??



The SimpleHBaseEventSerializer is only an example class. For custom

use cases you can write your own serializer by implementing

HbaseEventSerializer. In this case, you have specified

incrementColumn, which causes an increment on the column specified.

Simply don't specify that config and that row will not appear.



3- How can avoid the last row??



See above.





I am still in the learning phase so please pardon my ignorance..Many thanks.



No problem. Much of this is documented

here:

https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>







Regards,

Mohammad Tariq





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.




이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Will McQueen <wi...@cloudera.com>.
Hi Sharma,

Could you please describe how you installed flume? Also, I see you're
getting this warning:

>> Warning: No configuration directory set! Use --conf <dir> to override.


The log4j.properties that flume provides is stored in the conf dir. If you
specify the flume conf dir, flume can pick it up. So for troubleshooting
you can try:

1) modifying the log4j.properties within flume's conf dir so that the top
reads:
#flume.root.logger=DEBUG,console
flume.root.logger=DEBUG,LOGFILE
flume.log.dir=.
flume.log.file=flume.log

2) Run the flume agent while specifying the flume conf dir (--conf <dir>)

3) What's the output of 'which flume-ng'?

Cheers,
Will

On Thu, Jun 21, 2012 at 1:34 AM, ashutosh(오픈플랫폼개발팀)
<sh...@kt.com>wrote:

>  Hi Hari,
>
>
>
> I checked, agent is successfully tailing the file which I mentioned. Yes,
> you are right, agent has started properly without any error. Because there
> is no further movement, so it's hard for me to identify the issue. I also
> used tail -F also, but no success.
>
> Can you suggest me some technique to troubleshoot it, so I could identify
> the issue and resolve the same. Does flume record some log anywhere?
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> Cell: 010-7300-0150
>
> Email: sharma.ashutosh@kt.com
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> *Sent:* Thursday, June 21, 2012 5:25 PM
>
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> I am not sure if HBase changed their wire protocol between these versions.
> Looks like your agent has started properly. Are you sure data is being
> written into the file being tailed? I suggest using tail -F. The log being
> stuck here is ok, that is probably because nothing specific is required(or
> your log file rotated).
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>   Hi Hari,
>
>
>
> Thanks for your prompt reply. I already created the table in Hbase with a
> column family and hadoop/hbase library is available to hadoop. I noticed
> that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
>
> Please see the below lines captured while running the flume agent:
>
>
>
> >>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
>
> Warning: No configuration directory set! Use --conf <dir> to override.
>
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS
> access
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
>
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from
> classpath
>
> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp
> '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar'
> -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> org.apache.flume.node.Application -n hbase-agent -f
> /home/hadoop/flumeng/hbaseagent.conf
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 1
>
> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager
> starting
>
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle
> supervisor 10
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Configuration provider starting
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent:
> hbase-agent
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume
> configuration contains configuration  for agents: [hbase-agent]
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> Creating channels
>
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider:
> created channel ch1
>
> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink
> sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node
> configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: {
> source:org.apache.flume.source.ExecSource@1ed0af9b }}
> sinkRunners:{sink1=SinkRunner: {
> policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{
> name:null counters:{} } }}
> channels:{ch1=org.apache.flume.channel.MemoryChannel@49de17f4} }
>
> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with
> command:tail -f /home/hadoop/demo.txt
>
>
>
> Screen stuck here....no movement.
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> *From:* Hari Shreedharan [mailto:hshreedharan@cloudera.com<hs...@cloudera.com>]
>
> *Sent:* Thursday, June 21, 2012 5:01 PM
> *To:* flume-user@incubator.apache.org
> *Subject:* Re: Hbase-sink behavior
>
>
>
> Hi Ashutosh,
>
>
>
> The sink will not create the table or column family. Make sure you have
> the table and column family. Also please make sure you have
> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your
> class path).
>
>
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
>   Hi,
>
>
>
> I have used and followed the same steps which is mentioned in below mails
> to get start with the hbasesink. But agent is not storing any data into
> hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase
> information. Even I am able to connect to the hbase server from that agent
> machine.
>
>
>
> Now, I am unable to understand and troubleshoot this problem. Seeking
> advice from the community members...
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> -----Original Message-----
>
> From: Mohammad Tariq [mailto:dontariq@gmail.com <do...@gmail.com>]
>
> Sent: Friday, June 15, 2012 9:02 AM
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
> Thank you so much Hari for the valuable response..I'll follow the
> guidelines provided by you.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <
> hshreedharan@cloudera.com> wrote:
>
>  Hi Mohammad,
>
>
>
> My answers are inline.
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
>
>
> Hello list,
>
>
>
> I am trying to use hbase-sink to collect data from a local file and
>
> dump it into an Hbase table..But there are a few things I am not able
>
> to understand and need some guidance.
>
>
>
> This is the content of my conf file :
>
>
>
> hbase-agent.sources = tail
>
> hbase-agent..sinks = sink1
>
> hbase-agent.channels = ch1
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>
> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
>
> org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = test3
>
> hbase-agent.sinks.sink1.columnFamily = testing
>
> hbase-agent.sinks.sink1.column = foo
>
> hbase-agent.sinks.sink1.serializer =
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
> hbase-agent.channels.ch1.type=memory
>
>
>
> Right now I am taking just some simple text from a file which has
>
> following content -
>
>
>
> value1
>
> value2
>
> value3
>
> value4
>
> value5
>
> value6
>
>
>
> And my Hbase table looks like -
>
>
>
> hbase(main):217:0> scan 'test3'
>
> ROW COLUMN+CELL
>
> 11339716704561 column=testing:col1,
>
> timestamp=1339716707569, value=value1
>
> 11339716704562 column=testing:col1,
>
> timestamp=1339716707571, value=value4
>
> 11339716846594 column=testing:col1,
>
> timestamp=1339716849608, value=value2
>
> 11339716846595 column=testing:col1,
>
> timestamp=1339716849610, value=value1
>
> 11339716846596 column=testing:col1,
>
> timestamp=1339716849611, value=value6
>
> 11339716846597 column=testing:col1,
>
> timestamp=1339716849614, value=value6
>
> 11339716846598 column=testing:col1,
>
> timestamp=1339716849615, value=value5
>
> 11339716846599 column=testing:col1,
>
> timestamp=1339716849615, value=value6
>
> incRow column=testing:col1,
>
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>
> 9 row(s) in 0.0580 seconds
>
>
>
> Now I have following questions -
>
>
>
> 1- Why the timestamp value is different from the row key?(I was trying
>
> to make "1+timestamp" as the rowkey)
>
>
>
> The value shown by hbase shell as timestamp is the time at which the
>
> value was inserted into Hbase, while the value inserted by Flume is
>
> the timestamp at which the sink read the event from the channel.
>
> Depending on how long the network and HBase takes, these timestamps
>
> can vary. If you want 1+timestamp as row key then you should configure it:
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>
> appended as-is to the suffix you choose.
>
>
>
> 2- Although I am not using "incRow", it stills appear in the table
>
> with some value. Why so and what is this value??
>
>
>
> The SimpleHBaseEventSerializer is only an example class. For custom
>
> use cases you can write your own serializer by implementing
>
> HbaseEventSerializer. In this case, you have specified
>
> incrementColumn, which causes an increment on the column specified.
>
> Simply don't specify that config and that row will not appear.
>
>
>
> 3- How can avoid the last row??
>
>
>
> See above.
>
>
>
>
>
> I am still in the learning phase so please pardon my ignorance..Many
> thanks.
>
>
>
> No problem. Much of this is documented
>
> here:
>
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>
>
>
>
>
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright
> material. This email is intended for the use of the addressee only. If you
> receive this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>

RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Hari,

I checked, agent is successfully tailing the file which I mentioned. Yes, you are right, agent has started properly without any error. Because there is no further movement, so it’s hard for me to identify the issue. I also used tail –F also, but no success.
Can you suggest me some technique to troubleshoot it, so I could identify the issue and resolve the same. Does flume record some log anywhere?

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Cell: 010-7300-0150
Email: sharma.ashutosh@kt.com
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:25 PM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

I am not sure if HBase changed their wire protocol between these versions. Looks like your agent has started properly. Are you sure data is being written into the file being tailed? I suggest using tail -F. The log being stuck here is ok, that is probably because nothing specific is required(or your log file rotated).

Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi Hari,



Thanks for your prompt reply. I already created the table in Hbase with a column family and hadoop/hbase library is available to hadoop. I noticed that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?

Please see the below lines captured while running the flume agent:



>>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

Warning: No configuration directory set! Use --conf <dir> to override.

Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath

Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath

+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1

12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting

12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 10

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf<file:///\\home\hadoop\flumeng\hbaseagent.conf>

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1

12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Creating channels

12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: created channel ch1

12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink

12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1ed0af9b }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@49de17f4} }

12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt



Screen stuck here….no movement.



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:01 PM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior



Hi Ashutosh,



The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).





Thanks

Hari



--

Hari Shreedharan



On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi,



I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.



Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...



----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------



-----Original Message-----

From: Mohammad Tariq [mailto:dontariq@gmail.com]

Sent: Friday, June 15, 2012 9:02 AM

To: flume-user@incubator.apache.org<ma...@incubator.apache.org>

Subject: Re: Hbase-sink behavior



Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.



Regards,

Mohammad Tariq





On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com>> wrote:

Hi Mohammad,



My answers are inline.



--

Hari Shreedharan



On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:



Hello list,



I am trying to use hbase-sink to collect data from a local file and

dump it into an Hbase table..But there are a few things I am not able

to understand and need some guidance.



This is the content of my conf file :



hbase-agent.sources = tail

hbase-agent..sinks = sink1

hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =

org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = test3

hbase-agent.sinks.sink1.columnFamily = testing

hbase-agent.sinks.sink1.column = foo

hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.incrementColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory



Right now I am taking just some simple text from a file which has

following content -



value1

value2

value3

value4

value5

value6



And my Hbase table looks like -



hbase(main):217:0> scan 'test3'

ROW COLUMN+CELL

11339716704561 column=testing:col1,

timestamp=1339716707569, value=value1

11339716704562 column=testing:col1,

timestamp=1339716707571, value=value4

11339716846594 column=testing:col1,

timestamp=1339716849608, value=value2

11339716846595 column=testing:col1,

timestamp=1339716849610, value=value1

11339716846596 column=testing:col1,

timestamp=1339716849611, value=value6

11339716846597 column=testing:col1,

timestamp=1339716849614, value=value6

11339716846598 column=testing:col1,

timestamp=1339716849615, value=value5

11339716846599 column=testing:col1,

timestamp=1339716849615, value=value6

incRow column=testing:col1,

timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C

9 row(s) in 0.0580 seconds



Now I have following questions -



1- Why the timestamp value is different from the row key?(I was trying

to make "1+timestamp" as the rowkey)



The value shown by hbase shell as timestamp is the time at which the

value was inserted into Hbase, while the value inserted by Flume is

the timestamp at which the sink read the event from the channel.

Depending on how long the network and HBase takes, these timestamps

can vary. If you want 1+timestamp as row key then you should configure it:



hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is

appended as-is to the suffix you choose.



2- Although I am not using "incRow", it stills appear in the table

with some value. Why so and what is this value??



The SimpleHBaseEventSerializer is only an example class. For custom

use cases you can write your own serializer by implementing

HbaseEventSerializer. In this case, you have specified

incrementColumn, which causes an increment on the column specified.

Simply don't specify that config and that row will not appear.



3- How can avoid the last row??



See above.





I am still in the learning phase so please pardon my ignorance..Many thanks.



No problem. Much of this is documented

here:

https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>







Regards,

Mohammad Tariq





이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.




이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
I am not sure if HBase changed their wire protocol between these versions. Looks like your agent has started properly. Are you sure data is being written into the file being tailed? I suggest using tail -F. The log being stuck here is ok, that is probably because nothing specific is required(or your log file rotated).  

Thanks
Hari

--  
Hari Shreedharan


On Thursday, June 21, 2012 at 1:19 AM, ashutosh(오픈플랫폼개발팀) wrote:

>  
> Hi Hari,
>  
>  
>   
>  
>  
> Thanks for your prompt reply. I already created the table in Hbase with a column family and hadoop/hbase library is available to hadoop. I noticed that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?  
>  
>  
> Please see the below lines captured while running the flume agent:
>  
>  
>   
>  
>  
> >>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
>  
>  
> Warning: No configuration directory set! Use --conf <dir> to override.
>  
>  
> Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
>  
>  
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
>  
>  
> Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath
>  
>  
> + exec /home/hadoop/jdk16/bin/java -Xmx20m -cp '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
>  
>  
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
>  
>  
> 12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
>  
>  
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
>  
>  
> 12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 10
>  
>  
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
>  
>  
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
>  
>  
> 12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]
>  
>  
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Creating channels
>  
>  
> 12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: created channel ch1
>  
>  
> 12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
>  
>  
> 12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1ed0af9b }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@49de17f4} }
>  
>  
> 12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt
>  
>  
>   
>  
>  
> Screen stuck here….no movement.
>  
>  
>   
>  
>  
> ----------------------------------------
>  
>  
> ----------------------------------------
>  
>  
> Thanks & Regards,
>  
>  
> Ashutosh Sharma
>  
>  
> ----------------------------------------
>  
>  
>   
>  
>  
> From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]  
> Sent: Thursday, June 21, 2012 5:01 PM
> To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> Subject: Re: Hbase-sink behavior
>  
>  
>  
>   
>  
>  
> Hi Ashutosh,
>  
>  
>  
>   
>  
>  
>  
> The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).   
>  
>  
>  
>   
>  
>  
>  
>   
>  
>  
>  
> Thanks
>  
>  
>  
> Hari
>  
>  
>  
>   
>  
>  
>  
> --  
>  
>  
>  
> Hari Shreedharan
>  
>  
>  
>   
>  
>  
>  
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >  
> > Hi,
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> > Thanks & Regards,
> >  
> >  
> >  
> > Ashutosh Sharma
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > -----Original Message-----
> >  
> >  
> >  
> > From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >  
> >  
> >  
> > Sent: Friday, June 15, 2012 9:02 AM
> >  
> >  
> >  
> > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> >  
> >  
> >  
> > Subject: Re: Hbase-sink behavior
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Regards,
> >  
> >  
> >  
> > Mohammad Tariq
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hshreedharan@cloudera.com (mailto:hshreedharan@cloudera.com)> wrote:
> >  
> >  
> > >  
> > > Hi Mohammad,
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > My answers are inline.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > --
> > >  
> > >  
> > >  
> > > Hari Shreedharan
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Hello list,
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > I am trying to use hbase-sink to collect data from a local file and
> > >  
> > >  
> > >  
> > > dump it into an Hbase table..But there are a few things I am not able
> > >  
> > >  
> > >  
> > > to understand and need some guidance.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > This is the content of my conf file :
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase-agent.sources = tail
> > >  
> > >  
> > >  
> > > hbase-agent..sinks = sink1
> > >  
> > >  
> > >  
> > > hbase-agent.channels = ch1
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.type = exec
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> > >  
> > >  
> > >  
> > > org.apache.flume.sink.hbase.HBaseSink
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.channel = ch1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.table = test3
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.columnFamily = testing
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.column = foo
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer =
> > >  
> > >  
> > >  
> > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > >  
> > >  
> > >  
> > > hbase-agent.channels.ch1.type=memory
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Right now I am taking just some simple text from a file which has
> > >  
> > >  
> > >  
> > > following content -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > value1
> > >  
> > >  
> > >  
> > > value2
> > >  
> > >  
> > >  
> > > value3
> > >  
> > >  
> > >  
> > > value4
> > >  
> > >  
> > >  
> > > value5
> > >  
> > >  
> > >  
> > > value6
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > And my Hbase table looks like -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase(main):217:0> scan 'test3'
> > >  
> > >  
> > >  
> > > ROW COLUMN+CELL
> > >  
> > >  
> > >  
> > > 11339716704561 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716707569, value=value1
> > >  
> > >  
> > >  
> > > 11339716704562 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716707571, value=value4
> > >  
> > >  
> > >  
> > > 11339716846594 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849608, value=value2
> > >  
> > >  
> > >  
> > > 11339716846595 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849610, value=value1
> > >  
> > >  
> > >  
> > > 11339716846596 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849611, value=value6
> > >  
> > >  
> > >  
> > > 11339716846597 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849614, value=value6
> > >  
> > >  
> > >  
> > > 11339716846598 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849615, value=value5
> > >  
> > >  
> > >  
> > > 11339716846599 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849615, value=value6
> > >  
> > >  
> > >  
> > > incRow column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > >  
> > >  
> > >  
> > > 9 row(s) in 0.0580 seconds
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Now I have following questions -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 1- Why the timestamp value is different from the row key?(I was trying
> > >  
> > >  
> > >  
> > > to make "1+timestamp" as the rowkey)
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > The value shown by hbase shell as timestamp is the time at which the
> > >  
> > >  
> > >  
> > > value was inserted into Hbase, while the value inserted by Flume is
> > >  
> > >  
> > >  
> > > the timestamp at which the sink read the event from the channel.
> > >  
> > >  
> > >  
> > > Depending on how long the network and HBase takes, these timestamps
> > >  
> > >  
> > >  
> > > can vary. If you want 1+timestamp as row key then you should configure it:
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> > >  
> > >  
> > >  
> > > appended as-is to the suffix you choose.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 2- Although I am not using "incRow", it stills appear in the table
> > >  
> > >  
> > >  
> > > with some value. Why so and what is this value??
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > The SimpleHBaseEventSerializer is only an example class. For custom
> > >  
> > >  
> > >  
> > > use cases you can write your own serializer by implementing
> > >  
> > >  
> > >  
> > > HbaseEventSerializer. In this case, you have specified
> > >  
> > >  
> > >  
> > > incrementColumn, which causes an increment on the column specified.
> > >  
> > >  
> > >  
> > > Simply don't specify that config and that row will not appear.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 3- How can avoid the last row??
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > See above.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > I am still in the learning phase so please pardon my ignorance..Many thanks.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > No problem. Much of this is documented
> > >  
> > >  
> > >  
> > > here:
> > >  
> > >  
> > >  
> > > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html (https://builds.apache..org/job/flume-trunk/site/apidocs/index.html)
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Regards,
> > >  
> > >  
> > >  
> > > Mohammad Tariq
> > >  
> > >  
> > >  
> >  
> >  
> >   
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >  
> >  
> >  
> > This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.
> >  
> >  
> >  
> >  
>  
>  
>   
>  
>  
>  
>   
>  
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.  
> This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.
>  
>  
>  
>  



RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi Hari,

Thanks for your prompt reply. I already created the table in Hbase with a column family and hadoop/hbase library is available to hadoop. I noticed that I am using Hbase 0.90.4. Do I need to upgrade it to 0.92?
Please see the below lines captured while running the flume agent:

>>> flume-ng  agent -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar from classpath
Info: Excluding /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar from classpath
+ exec /home/hadoop/jdk16/bin/java -Xmx20m -cp '/usr/lib/flume-ng/lib/*:/usr/lib/hadoop-0.20/conf:/home/hadoop/jdk16/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u4.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar' -Djava.library.path=:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hbase-agent -f /home/hadoop/flumeng/hbaseagent.conf
12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
12/06/21 16:40:42 INFO node.FlumeNode: Flume node starting - hbase-agent
12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
12/06/21 16:40:42 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 10
12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:/home/hadoop/flumeng/hbaseagent.conf
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Added sinks: sink1 Agent: hbase-agent
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Processing:sink1
12/06/21 16:40:42 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration  for agents: [hbase-agent]
12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: Creating channels
12/06/21 16:40:42 INFO properties.PropertiesFileConfigurationProvider: created channel ch1
12/06/21 16:40:42 INFO sink.DefaultSinkFactory: Creating instance of sink sink1 typeorg.apache.flume.sink.hbase.HBaseSink
12/06/21 16:40:42 INFO nodemanager.DefaultLogicalNodeManager: Node configuration change:{ sourceRunners:{tail=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource@1ed0af9b }} sinkRunners:{sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@16b8f8eb counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel@49de17f4} }
12/06/21 16:40:42 INFO source.ExecSource: Exec source starting with command:tail -f /home/hadoop/demo.txt

Screen stuck here….no movement.

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Thursday, June 21, 2012 5:01 PM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Hi Ashutosh,

The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).


Thanks
Hari

--
Hari Shreedharan


On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
Hi,

I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.

Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

-----Original Message-----
From: Mohammad Tariq [mailto:dontariq@gmail.com]
Sent: Friday, June 15, 2012 9:02 AM
To: flume-user@incubator.apache.org<ma...@incubator.apache.org>
Subject: Re: Hbase-sink behavior

Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.

Regards,
Mohammad Tariq


On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com>> wrote:
Hi Mohammad,

My answers are inline.

--
Hari Shreedharan

On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:

Hello list,

I am trying to use hbase-sink to collect data from a local file and
dump it into an Hbase table..But there are a few things I am not able
to understand and need some guidance.

This is the content of my conf file :

hbase-agent.sources = tail
hbase-agent..sinks = sink1
hbase-agent.channels = ch1
hbase-agent.sources.tail.type = exec
hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = test3
hbase-agent.sinks.sink1.columnFamily = testing
hbase-agent.sinks.sink1.column = foo
hbase-agent.sinks.sink1.serializer =
org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1
hbase-agent.sinks.sink1.serializer.incrementColumn = col1
hbase-agent.sinks.sink1.serializer.keyType = timestamp
hbase-agent.sinks.sink1.serializer.rowPrefix = 1
hbase-agent.sinks.sink1.serializer.suffix = timestamp
hbase-agent.channels.ch1.type=memory

Right now I am taking just some simple text from a file which has
following content -

value1
value2
value3
value4
value5
value6

And my Hbase table looks like -

hbase(main):217:0> scan 'test3'
ROW COLUMN+CELL
11339716704561 column=testing:col1,
timestamp=1339716707569, value=value1
11339716704562 column=testing:col1,
timestamp=1339716707571, value=value4
11339716846594 column=testing:col1,
timestamp=1339716849608, value=value2
11339716846595 column=testing:col1,
timestamp=1339716849610, value=value1
11339716846596 column=testing:col1,
timestamp=1339716849611, value=value6
11339716846597 column=testing:col1,
timestamp=1339716849614, value=value6
11339716846598 column=testing:col1,
timestamp=1339716849615, value=value5
11339716846599 column=testing:col1,
timestamp=1339716849615, value=value6
incRow column=testing:col1,
timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
9 row(s) in 0.0580 seconds

Now I have following questions -

1- Why the timestamp value is different from the row key?(I was trying
to make "1+timestamp" as the rowkey)

The value shown by hbase shell as timestamp is the time at which the
value was inserted into Hbase, while the value inserted by Flume is
the timestamp at which the sink read the event from the channel.
Depending on how long the network and HBase takes, these timestamps
can vary. If you want 1+timestamp as row key then you should configure it:

hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
appended as-is to the suffix you choose.

2- Although I am not using "incRow", it stills appear in the table
with some value. Why so and what is this value??

The SimpleHBaseEventSerializer is only an example class. For custom
use cases you can write your own serializer by implementing
HbaseEventSerializer. In this case, you have specified
incrementColumn, which causes an increment on the column specified.
Simply don't specify that config and that row will not appear.

3- How can avoid the last row??

See above.


I am still in the learning phase so please pardon my ignorance..Many thanks.

No problem. Much of this is documented
here:
https://builds.apache.org/job/flume-trunk/site/apidocs/index.html<https://builds.apache..org/job/flume-trunk/site/apidocs/index.html>



Regards,
Mohammad Tariq


이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.



이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: 答复: Hbase-sink behavior

Posted by Rahul Patodi <pa...@gmail.com>.
Hi Mohammad Tariq,
I have done all the configuration you have specified in 1st mail of this
mail threat
But console is getting stuck after:
*12/06/21 15:42:14 INFO source.ExecSource: Exec source starting with
command:tail -F /tmp/test05*

What extra configuration have you done ?

On Thu, Jun 21, 2012 at 1:54 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello all,
>
>        I am also using hbase-0.90.4 and luckily it worked for me.
> Regards,
>     Mohammad Tariq
>
>
> On Thu, Jun 21, 2012 at 1:51 PM, Hari Shreedharan
> <hs...@cloudera.com> wrote:
> > Some channels like the FileChannel, Recoverable Memory Channel too
> require
> > the Hadoop classes. If you are using these, please make sure they are
> > available.
> >
> > --
> > Hari Shreedharan
> >
> > On Thursday, June 21, 2012 at 1:07 AM, Shara Shi wrote:
> >
> > I got similar problem , when HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set
> > NOT correctly.
> >
> > The data is not getting to move to sink.
> >
> > But my sink is not HDFS/HBASE , I don’t think HADOOP_HOME/HADOOP_PREFIX
> and
> > HBASE_HOME are necessary…
> >
> >
> >
> > Regards
> >
> > Ruihong
> >
> >
> >
> >
> >
> >
> >
> > 发件人: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> > 发送时间: 2012年6月21日 16:01
> > 收件人: flume-user@incubator.apache.org
> > 主题: Re: Hbase-sink behavior
> >
> >
> >
> > Hi Ashutosh,
> >
> >
> >
> > The sink will not create the table or column family. Make sure you have
> the
> > table and column family. Also please make sure you have
> > HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in
> your
> > class path).
> >
> >
> >
> >
> >
> > Thanks
> >
> > Hari
> >
> >
> >
> > --
> >
> > Hari Shreedharan
> >
> >
> >
> > On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >
> > Hi,
> >
> >
> >
> > I have used and followed the same steps which is mentioned in below
> mails to
> > get start with the hbasesink. But agent is not storing any data into
> hbase.
> > I added the hbase-site.xml in $CLASSPATH variable to pick the hbase
> > information. Even I am able to connect to the hbase server from that
> agent
> > machine.
> >
> >
> >
> > Now, I am unable to understand and troubleshoot this problem. Seeking
> advice
> > from the community members...
> >
> >
> >
> > ----------------------------------------
> >
> > ----------------------------------------
> >
> > Thanks & Regards,
> >
> > Ashutosh Sharma
> >
> > ----------------------------------------
> >
> >
> >
> > -----Original Message-----
> >
> > From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >
> > Sent: Friday, June 15, 2012 9:02 AM
> >
> > To: flume-user@incubator.apache.org
> >
> > Subject: Re: Hbase-sink behavior
> >
> >
> >
> > Thank you so much Hari for the valuable response..I'll follow the
> guidelines
> > provided by you.
> >
> >
> >
> > Regards,
> >
> > Mohammad Tariq
> >
> >
> >
> >
> >
> > On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> > <hs...@cloudera.com> wrote:
> >
> > Hi Mohammad,
> >
> >
> >
> > My answers are inline.
> >
> >
> >
> > --
> >
> > Hari Shreedharan
> >
> >
> >
> > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> >
> >
> >
> > Hello list,
> >
> >
> >
> > I am trying to use hbase-sink to collect data from a local file and
> >
> > dump it into an Hbase table..But there are a few things I am not able
> >
> > to understand and need some guidance.
> >
> >
> >
> > This is the content of my conf file :
> >
> >
> >
> > hbase-agent.sources = tail
> >
> > hbase-agent.sinks = sink1
> >
> > hbase-agent.channels = ch1
> >
> > hbase-agent.sources.tail.type = exec
> >
> > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> >
> > hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> >
> > org.apache.flume.sink.hbase.HBaseSink
> >
> > hbase-agent.sinks.sink1.channel = ch1
> >
> > hbase-agent.sinks.sink1.table = test3
> >
> > hbase-agent.sinks.sink1.columnFamily = testing
> >
> > hbase-agent.sinks.sink1.column = foo
> >
> > hbase-agent.sinks.sink1.serializer =
> >
> > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> >
> > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> >
> > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> >
> > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> >
> > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> >
> > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> >
> > hbase-agent.channels.ch1.type=memory
> >
> >
> >
> > Right now I am taking just some simple text from a file which has
> >
> > following content -
> >
> >
> >
> > value1
> >
> > value2
> >
> > value3
> >
> > value4
> >
> > value5
> >
> > value6
> >
> >
> >
> > And my Hbase table looks like -
> >
> >
> >
> > hbase(main):217:0> scan 'test3'
> >
> > ROW COLUMN+CELL
> >
> > 11339716704561 column=testing:col1,
> >
> > timestamp=1339716707569, value=value1
> >
> > 11339716704562 column=testing:col1,
> >
> > timestamp=1339716707571, value=value4
> >
> > 11339716846594 column=testing:col1,
> >
> > timestamp=1339716849608, value=value2
> >
> > 11339716846595 column=testing:col1,
> >
> > timestamp=1339716849610, value=value1
> >
> > 11339716846596 column=testing:col1,
> >
> > timestamp=1339716849611, value=value6
> >
> > 11339716846597 column=testing:col1,
> >
> > timestamp=1339716849614, value=value6
> >
> > 11339716846598 column=testing:col1,
> >
> > timestamp=1339716849615, value=value5
> >
> > 11339716846599 column=testing:col1,
> >
> > timestamp=1339716849615, value=value6
> >
> > incRow column=testing:col1,
> >
> > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> >
> > 9 row(s) in 0.0580 seconds
> >
> >
> >
> > Now I have following questions -
> >
> >
> >
> > 1- Why the timestamp value is different from the row key?(I was trying
> >
> > to make "1+timestamp" as the rowkey)
> >
> >
> >
> > The value shown by hbase shell as timestamp is the time at which the
> >
> > value was inserted into Hbase, while the value inserted by Flume is
> >
> > the timestamp at which the sink read the event from the channel.
> >
> > Depending on how long the network and HBase takes, these timestamps
> >
> > can vary. If you want 1+timestamp as row key then you should configure
> it:
> >
> >
> >
> > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> >
> > appended as-is to the suffix you choose.
> >
> >
> >
> > 2- Although I am not using "incRow", it stills appear in the table
> >
> > with some value. Why so and what is this value??
> >
> >
> >
> > The SimpleHBaseEventSerializer is only an example class. For custom
> >
> > use cases you can write your own serializer by implementing
> >
> > HbaseEventSerializer. In this case, you have specified
> >
> > incrementColumn, which causes an increment on the column specified.
> >
> > Simply don't specify that config and that row will not appear.
> >
> >
> >
> > 3- How can avoid the last row??
> >
> >
> >
> > See above.
> >
> >
> >
> >
> >
> > I am still in the learning phase so please pardon my ignorance..Many
> thanks.
> >
> >
> >
> > No problem. Much of this is documented
> >
> > here:
> >
> > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> >
> >
> >
> >
> >
> >
> >
> > Regards,
> >
> > Mohammad Tariq
> >
> >
> >
> >
> >
> > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에
> 포함된
> > 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못
> 전송된
> > 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >
> > This E-mail may contain confidential information and/or copyright
> material.
> > This email is intended for the use of the addressee only. If you receive
> > this email by mistake, please either delete it without reproducing,
> > distributing or retaining copies thereof or notify the sender
> immediately.
> >
> >
> >
> >
>



-- 
*Regards*,
Rahul Patodi

Re: 答复: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Hello all,

        I am also using hbase-0.90.4 and luckily it worked for me.
Regards,
    Mohammad Tariq


On Thu, Jun 21, 2012 at 1:51 PM, Hari Shreedharan
<hs...@cloudera.com> wrote:
> Some channels like the FileChannel, Recoverable Memory Channel too require
> the Hadoop classes. If you are using these, please make sure they are
> available.
>
> --
> Hari Shreedharan
>
> On Thursday, June 21, 2012 at 1:07 AM, Shara Shi wrote:
>
> I got similar problem , when HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set
> NOT correctly.
>
> The data is not getting to move to sink.
>
> But my sink is not HDFS/HBASE , I don’t think HADOOP_HOME/HADOOP_PREFIX and
> HBASE_HOME are necessary…
>
>
>
> Regards
>
> Ruihong
>
>
>
>
>
>
>
> 发件人: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
> 发送时间: 2012年6月21日 16:01
> 收件人: flume-user@incubator.apache.org
> 主题: Re: Hbase-sink behavior
>
>
>
> Hi Ashutosh,
>
>
>
> The sink will not create the table or column family. Make sure you have the
> table and column family. Also please make sure you have
> HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your
> class path).
>
>
>
>
>
> Thanks
>
> Hari
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
>
> Hi,
>
>
>
> I have used and followed the same steps which is mentioned in below mails to
> get start with the hbasesink. But agent is not storing any data into hbase.
> I added the hbase-site.xml in $CLASSPATH variable to pick the hbase
> information. Even I am able to connect to the hbase server from that agent
> machine.
>
>
>
> Now, I am unable to understand and troubleshoot this problem. Seeking advice
> from the community members...
>
>
>
> ----------------------------------------
>
> ----------------------------------------
>
> Thanks & Regards,
>
> Ashutosh Sharma
>
> ----------------------------------------
>
>
>
> -----Original Message-----
>
> From: Mohammad Tariq [mailto:dontariq@gmail.com]
>
> Sent: Friday, June 15, 2012 9:02 AM
>
> To: flume-user@incubator.apache.org
>
> Subject: Re: Hbase-sink behavior
>
>
>
> Thank you so much Hari for the valuable response..I'll follow the guidelines
> provided by you.
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
> <hs...@cloudera.com> wrote:
>
> Hi Mohammad,
>
>
>
> My answers are inline.
>
>
>
> --
>
> Hari Shreedharan
>
>
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
>
>
> Hello list,
>
>
>
> I am trying to use hbase-sink to collect data from a local file and
>
> dump it into an Hbase table..But there are a few things I am not able
>
> to understand and need some guidance.
>
>
>
> This is the content of my conf file :
>
>
>
> hbase-agent.sources = tail
>
> hbase-agent.sinks = sink1
>
> hbase-agent.channels = ch1
>
> hbase-agent.sources.tail.type = exec
>
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
>
> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
>
> org.apache.flume.sink.hbase.HBaseSink
>
> hbase-agent.sinks.sink1.channel = ch1
>
> hbase-agent.sinks.sink1.table = test3
>
> hbase-agent.sinks.sink1.columnFamily = testing
>
> hbase-agent.sinks.sink1.column = foo
>
> hbase-agent.sinks.sink1.serializer =
>
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
>
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
>
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
>
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
>
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
>
> hbase-agent.channels.ch1.type=memory
>
>
>
> Right now I am taking just some simple text from a file which has
>
> following content -
>
>
>
> value1
>
> value2
>
> value3
>
> value4
>
> value5
>
> value6
>
>
>
> And my Hbase table looks like -
>
>
>
> hbase(main):217:0> scan 'test3'
>
> ROW COLUMN+CELL
>
> 11339716704561 column=testing:col1,
>
> timestamp=1339716707569, value=value1
>
> 11339716704562 column=testing:col1,
>
> timestamp=1339716707571, value=value4
>
> 11339716846594 column=testing:col1,
>
> timestamp=1339716849608, value=value2
>
> 11339716846595 column=testing:col1,
>
> timestamp=1339716849610, value=value1
>
> 11339716846596 column=testing:col1,
>
> timestamp=1339716849611, value=value6
>
> 11339716846597 column=testing:col1,
>
> timestamp=1339716849614, value=value6
>
> 11339716846598 column=testing:col1,
>
> timestamp=1339716849615, value=value5
>
> 11339716846599 column=testing:col1,
>
> timestamp=1339716849615, value=value6
>
> incRow column=testing:col1,
>
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
>
> 9 row(s) in 0.0580 seconds
>
>
>
> Now I have following questions -
>
>
>
> 1- Why the timestamp value is different from the row key?(I was trying
>
> to make "1+timestamp" as the rowkey)
>
>
>
> The value shown by hbase shell as timestamp is the time at which the
>
> value was inserted into Hbase, while the value inserted by Flume is
>
> the timestamp at which the sink read the event from the channel.
>
> Depending on how long the network and HBase takes, these timestamps
>
> can vary. If you want 1+timestamp as row key then you should configure it:
>
>
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
>
> appended as-is to the suffix you choose.
>
>
>
> 2- Although I am not using "incRow", it stills appear in the table
>
> with some value. Why so and what is this value??
>
>
>
> The SimpleHBaseEventSerializer is only an example class. For custom
>
> use cases you can write your own serializer by implementing
>
> HbaseEventSerializer. In this case, you have specified
>
> incrementColumn, which causes an increment on the column specified.
>
> Simply don't specify that config and that row will not appear.
>
>
>
> 3- How can avoid the last row??
>
>
>
> See above.
>
>
>
>
>
> I am still in the learning phase so please pardon my ignorance..Many thanks.
>
>
>
> No problem. Much of this is documented
>
> here:
>
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
>
>
>
>
>
>
> Regards,
>
> Mohammad Tariq
>
>
>
>
>
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된
> 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된
> 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
>
> This E-mail may contain confidential information and/or copyright material.
> This email is intended for the use of the addressee only. If you receive
> this email by mistake, please either delete it without reproducing,
> distributing or retaining copies thereof or notify the sender immediately.
>
>
>
>

Re: 答复: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
Some channels like the FileChannel, Recoverable Memory Channel too require the Hadoop classes. If you are using these, please make sure they are available.  

--  
Hari Shreedharan


On Thursday, June 21, 2012 at 1:07 AM, Shara Shi wrote:

>  
> I got similar problem , when HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set NOT correctly.
>  
>  
> The data is not getting to move to sink.  
>  
>  
> But my sink is not HDFS/HBASE , I don’t think HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME are necessary…
>  
>  
>   
>  
>  
> Regards
>  
>  
> Ruihong
>  
>  
>   
>  
>  
>   
>  
>  
>   
>  
>  
> 发件人: Hari Shreedharan [mailto:hshreedharan@cloudera.com]  
> 发送时间: 2012年6月21日 16:01
> 收件人: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> 主题: Re: Hbase-sink behavior
>  
>  
>  
>   
>  
>  
> Hi Ashutosh,
>  
>  
>  
>   
>  
>  
>  
> The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).   
>  
>  
>  
>   
>  
>  
>  
>   
>  
>  
>  
> Thanks
>  
>  
>  
> Hari
>  
>  
>  
>   
>  
>  
>  
> --  
>  
>  
>  
> Hari Shreedharan
>  
>  
>  
>   
>  
>  
>  
> On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:
> >  
> > Hi,
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> > Thanks & Regards,
> >  
> >  
> >  
> > Ashutosh Sharma
> >  
> >  
> >  
> > ----------------------------------------
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > -----Original Message-----
> >  
> >  
> >  
> > From: Mohammad Tariq [mailto:dontariq@gmail.com]
> >  
> >  
> >  
> > Sent: Friday, June 15, 2012 9:02 AM
> >  
> >  
> >  
> > To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> >  
> >  
> >  
> > Subject: Re: Hbase-sink behavior
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > Regards,
> >  
> >  
> >  
> > Mohammad Tariq
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hshreedharan@cloudera.com (mailto:hshreedharan@cloudera.com)> wrote:
> >  
> >  
> > >  
> > > Hi Mohammad,
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > My answers are inline.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > --
> > >  
> > >  
> > >  
> > > Hari Shreedharan
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Hello list,
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > I am trying to use hbase-sink to collect data from a local file and
> > >  
> > >  
> > >  
> > > dump it into an Hbase table..But there are a few things I am not able
> > >  
> > >  
> > >  
> > > to understand and need some guidance.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > This is the content of my conf file :
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase-agent.sources = tail
> > >  
> > >  
> > >  
> > > hbase-agent.sinks = sink1
> > >  
> > >  
> > >  
> > > hbase-agent.channels = ch1
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.type = exec
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> > >  
> > >  
> > >  
> > > hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> > >  
> > >  
> > >  
> > > org.apache.flume.sink.hbase.HBaseSink
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.channel = ch1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.table = test3
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.columnFamily = testing
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.column = foo
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer =
> > >  
> > >  
> > >  
> > > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > >  
> > >  
> > >  
> > > hbase-agent.channels.ch1.type=memory
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Right now I am taking just some simple text from a file which has
> > >  
> > >  
> > >  
> > > following content -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > value1
> > >  
> > >  
> > >  
> > > value2
> > >  
> > >  
> > >  
> > > value3
> > >  
> > >  
> > >  
> > > value4
> > >  
> > >  
> > >  
> > > value5
> > >  
> > >  
> > >  
> > > value6
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > And my Hbase table looks like -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase(main):217:0> scan 'test3'
> > >  
> > >  
> > >  
> > > ROW COLUMN+CELL
> > >  
> > >  
> > >  
> > > 11339716704561 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716707569, value=value1
> > >  
> > >  
> > >  
> > > 11339716704562 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716707571, value=value4
> > >  
> > >  
> > >  
> > > 11339716846594 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849608, value=value2
> > >  
> > >  
> > >  
> > > 11339716846595 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849610, value=value1
> > >  
> > >  
> > >  
> > > 11339716846596 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849611, value=value6
> > >  
> > >  
> > >  
> > > 11339716846597 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849614, value=value6
> > >  
> > >  
> > >  
> > > 11339716846598 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849615, value=value5
> > >  
> > >  
> > >  
> > > 11339716846599 column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849615, value=value6
> > >  
> > >  
> > >  
> > > incRow column=testing:col1,
> > >  
> > >  
> > >  
> > > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > >  
> > >  
> > >  
> > > 9 row(s) in 0.0580 seconds
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Now I have following questions -
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 1- Why the timestamp value is different from the row key?(I was trying
> > >  
> > >  
> > >  
> > > to make "1+timestamp" as the rowkey)
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > The value shown by hbase shell as timestamp is the time at which the
> > >  
> > >  
> > >  
> > > value was inserted into Hbase, while the value inserted by Flume is
> > >  
> > >  
> > >  
> > > the timestamp at which the sink read the event from the channel.
> > >  
> > >  
> > >  
> > > Depending on how long the network and HBase takes, these timestamps
> > >  
> > >  
> > >  
> > > can vary. If you want 1+timestamp as row key then you should configure it:
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> > >  
> > >  
> > >  
> > > appended as-is to the suffix you choose.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 2- Although I am not using "incRow", it stills appear in the table
> > >  
> > >  
> > >  
> > > with some value. Why so and what is this value??
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > The SimpleHBaseEventSerializer is only an example class. For custom
> > >  
> > >  
> > >  
> > > use cases you can write your own serializer by implementing
> > >  
> > >  
> > >  
> > > HbaseEventSerializer. In this case, you have specified
> > >  
> > >  
> > >  
> > > incrementColumn, which causes an increment on the column specified.
> > >  
> > >  
> > >  
> > > Simply don't specify that config and that row will not appear.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > 3- How can avoid the last row??
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > See above.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > I am still in the learning phase so please pardon my ignorance..Many thanks.
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > No problem. Much of this is documented
> > >  
> > >  
> > >  
> > > here:
> > >  
> > >  
> > >  
> > > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > >   
> > >  
> > >  
> > >  
> > > Regards,
> > >  
> > >  
> > >  
> > > Mohammad Tariq
> > >  
> > >  
> > >  
> >  
> >  
> >   
> >  
> >  
> >  
> >   
> >  
> >  
> >  
> > 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> >  
> >  
> >  
> > This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.
> >  
> >  
> >  
> >  
>  
>  
>   
>  
>  
>  
>  
>  



答复: Hbase-sink behavior

Posted by Shara Shi <sh...@dhgate.com>.
I got similar problem , when HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set NOT correctly.

The data is not getting to move to sink. 

But my sink is not HDFS/HBASE , I don’t think HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME are necessary…

 

Regards

Ruihong

 

 

 

发件人: Hari Shreedharan [mailto:hshreedharan@cloudera.com] 
发送时间: 2012年6月21日 16:01
收件人: flume-user@incubator.apache.org
主题: Re: Hbase-sink behavior

 

Hi Ashutosh,

 

The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).  

 

 

Thanks

Hari

 

-- 

Hari Shreedharan

 

On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

Hi,

 

I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.

 

Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...

 

----------------------------------------

----------------------------------------

Thanks & Regards,

Ashutosh Sharma

----------------------------------------

 

-----Original Message-----

From: Mohammad Tariq [mailto:dontariq@gmail.com]

Sent: Friday, June 15, 2012 9:02 AM

To: flume-user@incubator.apache.org

Subject: Re: Hbase-sink behavior

 

Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.

 

Regards,

Mohammad Tariq

 

 

On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com> wrote:

Hi Mohammad,

 

My answers are inline.

 

--

Hari Shreedharan

 

On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:

 

Hello list,

 

I am trying to use hbase-sink to collect data from a local file and

dump it into an Hbase table..But there are a few things I am not able

to understand and need some guidance.

 

This is the content of my conf file :

 

hbase-agent.sources = tail

hbase-agent.sinks = sink1

hbase-agent.channels = ch1

hbase-agent.sources.tail.type = exec

hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt

hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =

org.apache.flume.sink.hbase.HBaseSink

hbase-agent.sinks.sink1.channel = ch1

hbase-agent.sinks.sink1.table = test3

hbase-agent.sinks.sink1.columnFamily = testing

hbase-agent.sinks.sink1.column = foo

hbase-agent.sinks.sink1.serializer =

org.apache.flume.sink.hbase.SimpleHbaseEventSerializer

hbase-agent.sinks.sink1.serializer.payloadColumn = col1

hbase-agent.sinks.sink1.serializer.incrementColumn = col1

hbase-agent.sinks.sink1.serializer.keyType = timestamp

hbase-agent.sinks.sink1.serializer.rowPrefix = 1

hbase-agent.sinks.sink1.serializer.suffix = timestamp

hbase-agent.channels.ch1.type=memory

 

Right now I am taking just some simple text from a file which has

following content -

 

value1

value2

value3

value4

value5

value6

 

And my Hbase table looks like -

 

hbase(main):217:0> scan 'test3'

ROW COLUMN+CELL

11339716704561 column=testing:col1,

timestamp=1339716707569, value=value1

11339716704562 column=testing:col1,

timestamp=1339716707571, value=value4

11339716846594 column=testing:col1,

timestamp=1339716849608, value=value2

11339716846595 column=testing:col1,

timestamp=1339716849610, value=value1

11339716846596 column=testing:col1,

timestamp=1339716849611, value=value6

11339716846597 column=testing:col1,

timestamp=1339716849614, value=value6

11339716846598 column=testing:col1,

timestamp=1339716849615, value=value5

11339716846599 column=testing:col1,

timestamp=1339716849615, value=value6

incRow column=testing:col1,

timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C

9 row(s) in 0.0580 seconds

 

Now I have following questions -

 

1- Why the timestamp value is different from the row key?(I was trying

to make "1+timestamp" as the rowkey)

 

The value shown by hbase shell as timestamp is the time at which the

value was inserted into Hbase, while the value inserted by Flume is

the timestamp at which the sink read the event from the channel.

Depending on how long the network and HBase takes, these timestamps

can vary. If you want 1+timestamp as row key then you should configure it:

 

hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is

appended as-is to the suffix you choose.

 

2- Although I am not using "incRow", it stills appear in the table

with some value. Why so and what is this value??

 

The SimpleHBaseEventSerializer is only an example class. For custom

use cases you can write your own serializer by implementing

HbaseEventSerializer. In this case, you have specified

incrementColumn, which causes an increment on the column specified.

Simply don't specify that config and that row will not appear.

 

3- How can avoid the last row??

 

See above.

 

 

I am still in the learning phase so please pardon my ignorance..Many thanks.

 

No problem. Much of this is documented

here:

https://builds.apache.org/job/flume-trunk/site/apidocs/index.html

 

 

 

Regards,

Mohammad Tariq

 

 

이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.

This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

 


Re: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi Ashutosh,

The sink will not create the table or column family. Make sure you have the table and column family. Also please make sure you have HADOOP_HOME/HADOOP_PREFIX and HBASE_HOME set correctly(or they are in your class path).   


Thanks
Hari

--  
Hari Shreedharan


On Thursday, June 21, 2012 at 12:52 AM, ashutosh(오픈플랫폼개발팀) wrote:

> Hi,
>  
> I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.
>  
> Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...
>  
> ----------------------------------------
> ----------------------------------------
> Thanks & Regards,
> Ashutosh Sharma
> ----------------------------------------
>  
> -----Original Message-----
> From: Mohammad Tariq [mailto:dontariq@gmail.com]
> Sent: Friday, June 15, 2012 9:02 AM
> To: flume-user@incubator.apache.org (mailto:flume-user@incubator.apache.org)
> Subject: Re: Hbase-sink behavior
>  
> Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.
>  
> Regards,
> Mohammad Tariq
>  
>  
> On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hshreedharan@cloudera.com (mailto:hshreedharan@cloudera.com)> wrote:
> > Hi Mohammad,
> >  
> > My answers are inline.
> >  
> > --
> > Hari Shreedharan
> >  
> > On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
> >  
> > Hello list,
> >  
> > I am trying to use hbase-sink to collect data from a local file and
> > dump it into an Hbase table..But there are a few things I am not able
> > to understand and need some guidance.
> >  
> > This is the content of my conf file :
> >  
> > hbase-agent.sources = tail
> > hbase-agent.sinks = sink1
> > hbase-agent.channels = ch1
> > hbase-agent.sources.tail.type = exec
> > hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> > hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> > org.apache.flume.sink.hbase.HBaseSink
> > hbase-agent.sinks.sink1.channel = ch1
> > hbase-agent.sinks.sink1.table = test3
> > hbase-agent.sinks.sink1.columnFamily = testing
> > hbase-agent.sinks.sink1.column = foo
> > hbase-agent.sinks.sink1.serializer =
> > org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> > hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> > hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> > hbase-agent.sinks.sink1.serializer.keyType = timestamp
> > hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> > hbase-agent.sinks.sink1.serializer.suffix = timestamp
> > hbase-agent.channels.ch1.type=memory
> >  
> > Right now I am taking just some simple text from a file which has
> > following content -
> >  
> > value1
> > value2
> > value3
> > value4
> > value5
> > value6
> >  
> > And my Hbase table looks like -
> >  
> > hbase(main):217:0> scan 'test3'
> > ROW COLUMN+CELL
> > 11339716704561 column=testing:col1,
> > timestamp=1339716707569, value=value1
> > 11339716704562 column=testing:col1,
> > timestamp=1339716707571, value=value4
> > 11339716846594 column=testing:col1,
> > timestamp=1339716849608, value=value2
> > 11339716846595 column=testing:col1,
> > timestamp=1339716849610, value=value1
> > 11339716846596 column=testing:col1,
> > timestamp=1339716849611, value=value6
> > 11339716846597 column=testing:col1,
> > timestamp=1339716849614, value=value6
> > 11339716846598 column=testing:col1,
> > timestamp=1339716849615, value=value5
> > 11339716846599 column=testing:col1,
> > timestamp=1339716849615, value=value6
> > incRow column=testing:col1,
> > timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> > 9 row(s) in 0.0580 seconds
> >  
> > Now I have following questions -
> >  
> > 1- Why the timestamp value is different from the row key?(I was trying
> > to make "1+timestamp" as the rowkey)
> >  
> > The value shown by hbase shell as timestamp is the time at which the
> > value was inserted into Hbase, while the value inserted by Flume is
> > the timestamp at which the sink read the event from the channel.
> > Depending on how long the network and HBase takes, these timestamps
> > can vary. If you want 1+timestamp as row key then you should configure it:
> >  
> > hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> > appended as-is to the suffix you choose.
> >  
> > 2- Although I am not using "incRow", it stills appear in the table
> > with some value. Why so and what is this value??
> >  
> > The SimpleHBaseEventSerializer is only an example class. For custom
> > use cases you can write your own serializer by implementing
> > HbaseEventSerializer. In this case, you have specified
> > incrementColumn, which causes an increment on the column specified.
> > Simply don't specify that config and that row will not appear.
> >  
> > 3- How can avoid the last row??
> >  
> > See above.
> >  
> >  
> > I am still in the learning phase so please pardon my ignorance..Many thanks.
> >  
> > No problem. Much of this is documented
> > here:
> > https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
> >  
> >  
> >  
> > Regards,
> > Mohammad Tariq
> >  
>  
>  
>  
> 이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
> This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.
>  
>  



RE: Hbase-sink behavior

Posted by "ashutosh (오픈플랫폼개발팀)" <sh...@kt.com>.
Hi,

I have used and followed the same steps which is mentioned in below mails to get start with the hbasesink. But agent is not storing any data into hbase. I added the hbase-site.xml in $CLASSPATH variable to pick the hbase information. Even I am able to connect to the hbase server from that agent machine.

Now, I am unable to understand and troubleshoot this problem. Seeking advice from the community members...

----------------------------------------
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

-----Original Message-----
From: Mohammad Tariq [mailto:dontariq@gmail.com]
Sent: Friday, June 15, 2012 9:02 AM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Thank you so much Hari for the valuable response..I'll follow the guidelines provided by you.

Regards,
    Mohammad Tariq


On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hs...@cloudera.com> wrote:
> Hi Mohammad,
>
> My answers are inline.
>
> --
> Hari Shreedharan
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
> Hello list,
>
> I am trying to use hbase-sink to collect data from a local file and
> dump it into an Hbase table..But there are a few things I am not able
> to understand and need some guidance.
>
> This is the content of my conf file :
>
> hbase-agent.sources = tail
> hbase-agent.sinks = sink1
> hbase-agent.channels = ch1
> hbase-agent.sources.tail.type = exec
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> hbase-agent.sources.tail.channels = ch1 hbase-agent.sinks.sink1.type =
> org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel = ch1
> hbase-agent.sinks.sink1.table = test3
> hbase-agent.sinks.sink1.columnFamily = testing
> hbase-agent.sinks.sink1.column = foo
> hbase-agent.sinks.sink1.serializer =
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> hbase-agent.channels.ch1.type=memory
>
> Right now I am taking just some simple text from a file which has
> following content -
>
> value1
> value2
> value3
> value4
> value5
> value6
>
> And my Hbase table looks like -
>
> hbase(main):217:0> scan 'test3'
> ROW COLUMN+CELL
> 11339716704561 column=testing:col1,
> timestamp=1339716707569, value=value1
> 11339716704562 column=testing:col1,
> timestamp=1339716707571, value=value4
> 11339716846594 column=testing:col1,
> timestamp=1339716849608, value=value2
> 11339716846595 column=testing:col1,
> timestamp=1339716849610, value=value1
> 11339716846596 column=testing:col1,
> timestamp=1339716849611, value=value6
> 11339716846597 column=testing:col1,
> timestamp=1339716849614, value=value6
> 11339716846598 column=testing:col1,
> timestamp=1339716849615, value=value5
> 11339716846599 column=testing:col1,
> timestamp=1339716849615, value=value6
> incRow column=testing:col1,
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> 9 row(s) in 0.0580 seconds
>
> Now I have following questions -
>
> 1- Why the timestamp value is different from the row key?(I was trying
> to make "1+timestamp" as the rowkey)
>
> The value shown by hbase shell as timestamp is the time at which the
> value was inserted into Hbase, while the value inserted by Flume is
> the timestamp at which the sink read the event from the channel.
> Depending on how long the network and HBase takes, these timestamps
> can vary. If you want 1+timestamp as row key then you should configure it:
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+ This prefix is
> appended as-is to the suffix you choose.
>
> 2- Although I am not using "incRow", it stills appear in the table
> with some value. Why so and what is this value??
>
> The SimpleHBaseEventSerializer is only an example class. For custom
> use cases you can write your own serializer by implementing
> HbaseEventSerializer. In this case, you have specified
> incrementColumn, which causes an increment on the column specified.
> Simply don't specify that config and that row will not appear.
>
> 3- How can avoid the last row??
>
> See above.
>
>
> I am still in the learning phase so please pardon my ignorance..Many thanks.
>
> No problem.  Much of this is documented
> here:
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
>
>
> Regards,
>     Mohammad Tariq
>
>


이 메일은 지정된 수취인만을 위해 작성되었으며, 중요한 정보나 저작권을 포함하고 있을 수 있습니다. 어떠한 권한 없이, 본 문서에 포함된 정보의 전부 또는 일부를 무단으로 제3자에게 공개, 배포, 복사 또는 사용하는 것을 엄격히 금지합니다. 만약, 본 메일이 잘못 전송된 경우, 발신인 또는 당사에 알려주시고, 본 메일을 즉시 삭제하여 주시기 바랍니다.
This E-mail may contain confidential information and/or copyright material. This email is intended for the use of the addressee only. If you receive this email by mistake, please either delete it without reproducing, distributing or retaining copies thereof or notify the sender immediately.

Re: Hbase-sink behavior

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so much Hari for the valuable response..I'll follow the
guidelines provided by you.

Regards,
    Mohammad Tariq


On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan
<hs...@cloudera.com> wrote:
> Hi Mohammad,
>
> My answers are inline.
>
> --
> Hari Shreedharan
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
> Hello list,
>
> I am trying to use hbase-sink to collect data from a local file
> and dump it into an Hbase table..But there are a few things I am not
> able to understand and need some guidance.
>
> This is the content of my conf file :
>
> hbase-agent.sources = tail
> hbase-agent.sinks = sink1
> hbase-agent.channels = ch1
> hbase-agent.sources.tail.type = exec
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> hbase-agent.sources.tail.channels = ch1
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel = ch1
> hbase-agent.sinks.sink1.table = test3
> hbase-agent.sinks.sink1.columnFamily = testing
> hbase-agent.sinks.sink1.column = foo
> hbase-agent.sinks.sink1.serializer =
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> hbase-agent.channels.ch1.type=memory
>
> Right now I am taking just some simple text from a file which has
> following content -
>
> value1
> value2
> value3
> value4
> value5
> value6
>
> And my Hbase table looks like -
>
> hbase(main):217:0> scan 'test3'
> ROW COLUMN+CELL
> 11339716704561 column=testing:col1,
> timestamp=1339716707569, value=value1
> 11339716704562 column=testing:col1,
> timestamp=1339716707571, value=value4
> 11339716846594 column=testing:col1,
> timestamp=1339716849608, value=value2
> 11339716846595 column=testing:col1,
> timestamp=1339716849610, value=value1
> 11339716846596 column=testing:col1,
> timestamp=1339716849611, value=value6
> 11339716846597 column=testing:col1,
> timestamp=1339716849614, value=value6
> 11339716846598 column=testing:col1,
> timestamp=1339716849615, value=value5
> 11339716846599 column=testing:col1,
> timestamp=1339716849615, value=value6
> incRow column=testing:col1,
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> 9 row(s) in 0.0580 seconds
>
> Now I have following questions -
>
> 1- Why the timestamp value is different from the row key?(I was trying
> to make "1+timestamp" as the rowkey)
>
> The value shown by hbase shell as timestamp is the time at which the value
> was inserted into Hbase, while the value inserted by Flume is the timestamp
> at which the sink read the event from the channel. Depending on how long the
> network and HBase takes, these timestamps can vary. If you want 1+timestamp
> as row key then you should configure it:
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+
> This prefix is appended as-is to the suffix you choose.
>
> 2- Although I am not using "incRow", it stills appear in the table
> with some value. Why so and what is this value??
>
> The SimpleHBaseEventSerializer is only an example class. For custom use
> cases you can write your own serializer by implementing
> HbaseEventSerializer. In this case, you have specified incrementColumn,
> which causes an increment on the column specified. Simply don't specify that
> config and that row will not appear.
>
> 3- How can avoid the last row??
>
> See above.
>
>
> I am still in the learning phase so please pardon my ignorance..Many thanks.
>
> No problem.  Much of this is documented
> here: https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
>
>
> Regards,
>     Mohammad Tariq
>
>

Re: Hbase-sink behavior

Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi Mohammad, 

My answers are inline. 

-- 
Hari Shreedharan


On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:

> Hello list,
> 
> I am trying to use hbase-sink to collect data from a local file
> and dump it into an Hbase table..But there are a few things I am not
> able to understand and need some guidance.
> 
> This is the content of my conf file :
> 
> hbase-agent.sources = tail
> hbase-agent.sinks = sink1
> hbase-agent.channels = ch1
> hbase-agent.sources.tail.type = exec
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> hbase-agent.sources.tail.channels = ch1
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel = ch1
> hbase-agent.sinks.sink1.table = test3
> hbase-agent.sinks.sink1.columnFamily = testing
> hbase-agent.sinks.sink1.column = foo
> hbase-agent.sinks.sink1.serializer =
> org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> hbase-agent.channels.ch1.type=memory
> 
> Right now I am taking just some simple text from a file which has
> following content -
> 
> value1
> value2
> value3
> value4
> value5
> value6
> 
> And my Hbase table looks like -
> 
> hbase(main):217:0> scan 'test3'
> ROW COLUMN+CELL
> 11339716704561 column=testing:col1,
> timestamp=1339716707569, value=value1
> 11339716704562 column=testing:col1,
> timestamp=1339716707571, value=value4
> 11339716846594 column=testing:col1,
> timestamp=1339716849608, value=value2
> 11339716846595 column=testing:col1,
> timestamp=1339716849610, value=value1
> 11339716846596 column=testing:col1,
> timestamp=1339716849611, value=value6
> 11339716846597 column=testing:col1,
> timestamp=1339716849614, value=value6
> 11339716846598 column=testing:col1,
> timestamp=1339716849615, value=value5
> 11339716846599 column=testing:col1,
> timestamp=1339716849615, value=value6
> incRow column=testing:col1,
> timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> 9 row(s) in 0.0580 seconds
> 
> Now I have following questions -
> 
> 1- Why the timestamp value is different from the row key?(I was trying
> to make "1+timestamp" as the rowkey)
> 
> 

The value shown by hbase shell as timestamp is the time at which the value was inserted into Hbase, while the value inserted by Flume is the timestamp at which the sink read the event from the channel. Depending on how long the network and HBase takes, these timestamps can vary. If you want 1+timestamp as row key then you should configure it: 

hbase-agent.sinks.sink1.serializer.rowPrefix = 1+
This prefix is appended as-is to the suffix you choose.
> 2- Although I am not using "incRow", it stills appear in the table
> with some value. Why so and what is this value??
> 
> 

The SimpleHBaseEventSerializer is only an example class. For custom use cases you can write your own serializer by implementing HbaseEventSerializer. In this case, you have specified incrementColumn, which causes an increment on the column specified. Simply don't specify that config and that row will not appear.
> 3- How can avoid the last row??
> 
> 

See above. 
> 
> I am still in the learning phase so please pardon my ignorance..Many thanks.
No problem.  Much of this is documented here: https://builds.apache.org/job/flume-trunk/site/apidocs/index.html


> 
> Regards,
>     Mohammad Tariq
> 
>