Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/02/02 21:36:49 UTC
[Hadoop Wiki] Trivial Update of "Chukwa_Processes_and_Data_Flow" by BillGraham
The "Chukwa_Processes_and_Data_Flow" page has been changed by BillGraham.
http://wiki.apache.org/hadoop/Chukwa_Processes_and_Data_Flow?action=diff&rev1=1&rev2=2
--------------------------------------------------
+ <<TableOfContents>>
+
+ == Overview ==
This document describes how Chukwa data is stored in HDFS and the processes that act on it.
- '''HDFS File System Structure'''
+ == HDFS File System Structure ==
The general layout of the Chukwa filesystem is as follows.
@@ -19, +22 @@
temp/
}}}
- '''Raw Log Collection and Aggregation Workflow'''
+ == Raw Log Collection and Aggregation Workflow ==
The clearest way to see which data is stored where is to step through the Chukwa workflow.
@@ -52, +55 @@
* to: {{{archivesProcessing/mrOutput}}}
* to: {{{finalArchives/[yyyyMMdd]/*/chukwaArchive-part-*}}}
- '''Log Directories Requiring Cleanup'''
+ == Log Directories Requiring Cleanup ==
The following directories will grow over time and will need to be periodically pruned: