Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/11 21:09:58 UTC

[Hadoop Wiki] Update of "Sending information to Chukwa" by Jerome Boulon

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by Jerome Boulon:
http://wiki.apache.org/hadoop/Sending_information_to_Chukwa

------------------------------------------------------------------------------
  
  == Add a new dataSource (Source Input) ==
  === Using Log4J ===
- Chukwa comes with a Log4J Appender. Here the steps that you need to fallow in order to use it:
+ Chukwa comes with a Log4J appender. Here are the steps you need to follow in order to use it:
  
-   1.  Create a log4j.properties file that contains the fallowing information:
+   1.  Create a log4j.properties file that contains the following information:
  
      log4j.rootLogger=INFO, chukwa
      log4j.appender.chukwa=org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender
@@ -20, +20 @@

      log4j.appender.chukwa.layout=org.apache.log4j.PatternLayout
      log4j.appender.chukwa.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
  
-   1.  Add those parameters to your java command line:
+   1.  Add these parameters to your Java command line:
      * -DCHUKWA_HOME=${CHUKWA_HOME} -DRECORD_TYPE=<YourRecordType_Here> -Dlog4j.configuration=log4j.properties
      * -DRECORD_TYPE=<YourRecordType_Here> is the most important parameter.
-       * You can only store one record type per file, so if you need to split your logs into different record types,just create one appender per data type     (%T% see hadoop logs4j configuration file)
+       * You can only store one record type per file, so if you need to split your logs into different record types, just create one appender per data type (see the hadoop log4j configuration file)
  
-   1.  Start your program, now all you log statements should be written in ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log
+   1.  Start your program. Now all your log statements should be written to ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log (a minimal usage sketch follows below)
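  
  For illustration, a minimal program that exercises this setup could look like the sketch below. The class name and the log statements are placeholders; only the log4j calls and the -D flags come from the steps above.
  
      import org.apache.log4j.Logger;
  
      public class MyApp {                              // hypothetical class name
          private static final Logger LOG = Logger.getLogger(MyApp.class);
  
          public static void main(String[] args) {
              // With the chukwa appender configured above, these statements
              // end up in ${CHUKWA_HOME}/logs/<YourRecordType_Here>.log
              LOG.info("application started");
              LOG.info("some event worth collecting");
          }
      }
  
  It would be launched, for example, as:
  
      java -DCHUKWA_HOME=${CHUKWA_HOME} -DRECORD_TYPE=MyRecordType -Dlog4j.configuration=log4j.properties MyApp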
  
  === Static file like /var/log/messages ===
  
@@ -39, +39 @@

  
     1. Open a socket from your application to the ChukwaLocalAgent
   1. Write this line to the socket (see the sketch after this list):
-       * add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLine <RecordType> <StartOffset> <fileName> <StartOffset>
+       * add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLine <RecordType> <StartOffset> <fileName> <StartOffset> (the leading add is part of the command and is written verbatim)
      * Where <RecordType> is the data type that identifies your data
      * Where <StartOffset> is the offset in the file at which the adaptor starts reading
      * Where <fileName> is the local path to the file on your machine
-    1. close the socket
+    1. Close the socket
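  
  A minimal sketch of these three steps in Java (the agent host and control port below are assumptions; use whatever your ChukwaLocalAgent is actually configured with):
  
      import java.io.OutputStreamWriter;
      import java.io.Writer;
      import java.net.Socket;
  
      public class RegisterFileTail {                   // hypothetical helper class
          public static void main(String[] args) throws Exception {
              // 1. Open a socket to the local agent (host and port are assumptions)
              Socket socket = new Socket("localhost", 9093);
              Writer out = new OutputStreamWriter(socket.getOutputStream());
              // 2. Write the add command: adaptor class, record type, offset, file name, offset
              out.write("add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer."
                  + "CharFileTailingAdaptorUTF8NewLine MyRecordType 0 /var/log/myapp.log 0\n");
              out.flush();
              // 3. Close the socket
              socket.close();
          }
      }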
  
  
  == Extract information from this new dataSource ==
@@ -58, +58 @@

  Your log will be automatically available from the Web Log viewer under the <YourRecordType_Here> directory.
   
  === Using a specific Parser ===
- If you want to extract some specific information and more processing you need to write your own parser.
+ If you want to extract some specific information and perform more processing, you need to write your own parser.
  Like any M/R program, you have to write at least the Map side of your parser. The Reduce side is the Identity reducer by default.
  
  ==== MAP side of the parser ====
- Your can either write your own from strach or extends the AbstractProcessor class that hide all the low level action on the chunk.
+ You can write your own parser from scratch or extend the AbstractProcessor class, which hides all the low-level actions on the chunk.
- then you have to register your parser to the demux (link between the RecordType and the parser)
+ Then you have to register your parser with the demux (the link between the RecordType and the parser).
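  
  A rough shape of the Map side of such a parser is sketched below. The package names, the parse(...) signature, and the key/value types are taken from Chukwa trunk at the time of writing and may differ in the version you run; treat all of them as assumptions to verify, not as a fixed API.
  
      import org.apache.hadoop.chukwa.extraction.demux.processor.mapper.AbstractProcessor;
      import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
      import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
      import org.apache.hadoop.mapred.OutputCollector;
      import org.apache.hadoop.mapred.Reporter;
  
      // Hypothetical parser for <YourRecordType_Here>; names are placeholders
      public class MyRecordTypeProcessor extends AbstractProcessor {
          @Override
          protected void parse(String recordEntry,
                               OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
                               Reporter reporter) throws Throwable {
              // Extract whatever fields you need from the raw entry
              ChukwaRecord record = new ChukwaRecord();
              record.setTime(System.currentTimeMillis());   // better: parse the timestamp out of recordEntry
              record.add("body", recordEntry);              // keep the raw line as one field
  
              ChukwaRecordKey key = new ChukwaRecordKey();
              key.setReduceType("MyRecordType");            // assumption: ties the output to your RecordType
              output.collect(key, record);
          }
      }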
  
  ==== Parser registration ====
-    * Edit ${CHUKWA_HOME}/conf/chukwa-demux-conf.xml and add the fallowing lines
+    * Edit ${CHUKWA_HOME}/conf/chukwa-demux-conf.xml and add the following lines
  
     <property>
      <name><YourRecordType_Here></name>
@@ -147, +147 @@

  ==== Parser key field ====
  
  Your data is going to be sorted by RecordType, then by the key field.
- The default implementation use the fallowing grouping for all records:
+ The default implementation uses the following grouping for all records:
     1. Time partition (Time up to the hour)
     1. Machine name (physical input source)
     1. Record timestamp
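  
  For example, the time partition above can be pictured as the record timestamp truncated to the hour (plain Java arithmetic, no Chukwa API involved):
  
      long timestamp = System.currentTimeMillis();                        // record timestamp in ms
      long timePartition = timestamp - (timestamp % (60L * 60L * 1000L)); // truncated to the hour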