Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/11 21:09:58 UTC
[Hadoop Wiki] Update of "Sending information to Chukwa" by Jerome Boulon
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by Jerome Boulon:
http://wiki.apache.org/hadoop/Sending_information_to_Chukwa
------------------------------------------------------------------------------
== Add a new dataSource (Source Input) ==
=== Using Log4J ===
- Chukwa comes with a Log4J Appender. Here the steps that you need to fallow in order to use it:
+ Chukwa comes with a Log4J Appender. Here are the steps you need to follow in order to use it:
- 1. Create a log4j.properties file that contains the fallowing information:
+ 1. Create a log4j.properties file that contains the following information:
log4j.rootLogger=INFO, chukwa
log4j.appender.chukwa=org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender
@@ -20, +20 @@
log4j.appender.chukwa.layout=org.apache.log4j.PatternLayout
log4j.appender.chukwa.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
- 1. Add those parameters to your java command line:
+ 1. Add these parameters to your java command line:
* -DCHUKWA_HOME=${CHUKWA_HOME} -DRECORD_TYPE=<YourRecordType_Here> -Dlog4j.configuration=log4j.properties
* -DRECORD_TYPE=<YourRecordType_Here> is the most important parameter.
- * You can only store one record type per file, so if you need to split your logs into different record types,just create one appender per data type (%T% see hadoop logs4j configuration file)
+ * You can only store one record type per file, so if you need to split your logs into different record types, just create one appender per data type (see the Hadoop log4j configuration file for an example)
- 1. Start your program, now all you log statements should be written in ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log
+ 1. Start your program. All your log statements should now be written to ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log (a minimal example follows).
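The driver below is hypothetical; with the log4j.properties above on the classpath and the -D flags from the previous step, its statements land in the Chukwa-managed file:

import org.apache.log4j.Logger;

// Hypothetical demo class: its log statements should end up in
// ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log via the chukwa appender.
public class ChukwaLoggingDemo {
  private static final Logger log = Logger.getLogger(ChukwaLoggingDemo.class);

  public static void main(String[] args) {
    log.info("application started");
    log.warn("this statement is collected by Chukwa as well");
  }
}

Launched, for example, as (MyApp stands in for your record type, with log4j and the Chukwa jars on the classpath):

java -DCHUKWA_HOME=${CHUKWA_HOME} -DRECORD_TYPE=MyApp -Dlog4j.configuration=log4j.properties ChukwaLoggingDemo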
=== Static file like /var/log/messages ===
@@ -39, +39 @@
1. Open a socket from your application to the ChukwaLocalAgent
1. Write this line to the socket
* add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLine <RecordType> <StartOffset> <fileName> <StartOffset>
* Where <RecordType> is the data type that will identify your data
* Where <StartOffset> is the offset at which to start reading (use 0 to read the file from the beginning)
* Where <fileName> is the local path on your machine
- 1. close the socket
+ 1. Close the socket (a minimal Java sketch of these steps follows)
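The sketch assumes the agent's control socket listens on localhost port 9093 (verify the port in your agent configuration); the record type and file path are example values:

import java.io.PrintWriter;
import java.net.Socket;

public class AddAdaptor {
  public static void main(String[] args) throws Exception {
    // Connect to the ChukwaLocalAgent control socket (port is an assumption).
    Socket socket = new Socket("localhost", 9093);
    PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
    // Register a file-tailing adaptor for /var/log/messages, starting at offset 0.
    out.println("add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
        + "CharFileTailingAdaptorUTF8NewLine MyRecordType 0 /var/log/messages 0");
    out.close();
    socket.close();
  }
}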
== Extract information from this new dataSource ==
@@ -58, +58 @@
Your log will automatically be available from the Web Log viewer under the <YourRecordTypeHere> directory.
=== Using a specific Parser ===
- If you want to extract some specific information and more processing you need to write your own parser.
+ If you want to extract specific information and perform more processing, you need to write your own parser.
Like any M/R program, you have to write at least the Map side of your parser. The Reduce side is the Identity reducer by default.
==== MAP side of the parser ====
- Your can either write your own from strach or extends the AbstractProcessor class that hide all the low level action on the chunk.
+ You can write your own parser from scratch or extend the AbstractProcessor class, which hides all the low-level operations on the chunk. A sketch follows.
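The sketch below assumes the AbstractProcessor contract of the Chukwa trunk of this period: a parse() method called once per log entry, emitting ChukwaRecordKey/ChukwaRecord pairs. Class, method, and field names may differ in your version, and MyRecordTypeProcessor is a made-up example:

import org.apache.hadoop.chukwa.extraction.demux.processor.mapper.AbstractProcessor;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MyRecordTypeProcessor extends AbstractProcessor {
  @Override
  protected void parse(String recordEntry,
      OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
      Reporter reporter) throws Throwable {
    // Build a record from the raw log entry; add fields as you extract them.
    ChukwaRecord record = new ChukwaRecord();
    record.add("body", recordEntry);

    // Key the record by its RecordType; the key value itself is a
    // placeholder here (see "Parser key field" below).
    ChukwaRecordKey key = new ChukwaRecordKey();
    key.setReduceType("MyRecordType");
    key.setKey(recordEntry);
    output.collect(key, record);
  }
}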
- then you have to register your parser to the demux (link between the RecordType and the parser)
+ Then you have to register your parser with the demux (the link between the RecordType and the parser).
==== Parser registration ====
- * Edit ${CHUKWA_HOME}/conf/chukwa-demux-conf.xml and add the fallowing lines
+ * Edit ${CHUKWA_HOME}/conf/chukwa-demux-conf.xml and add the following lines:
<property>
<name><YourRecordType_Here></name>
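Presumably the full property maps the record type name to the fully-qualified parser class, along these lines (the value and description below are an assumption; compare with the existing entries in your chukwa-demux-conf.xml):

<property>
  <name><YourRecordType_Here></name>
  <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.MyRecordTypeProcessor</value>
  <description>Parser class for <YourRecordType_Here></description>
</property>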
@@ -147, +147 @@
==== Parser key field ====
Your data is sorted by RecordType first, then by the key field.
- The default implementation use the fallowing grouping for all records:
+ The default implementation uses the following grouping for all records (a small worked example follows the list):
1. Time partition (the record time, truncated to the hour)
1. Machine name (physical input source)
1. Record timestamp
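A worked example, assuming a slash-separated key layout (the exact on-disk format may differ):

public class KeyGroupingDemo {
  public static void main(String[] args) {
    long timestamp = 1226437798000L;                         // record timestamp in ms
    long timePartition = timestamp - (timestamp % 3600000L); // truncated down to the hour
    String machine = "host01.example.com";                   // physical input source
    System.out.println(timePartition + "/" + machine + "/" + timestamp);
  }
}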