You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@chukwa.apache.org by "Saisai Shao (JIRA)" <ji...@apache.org> on 2012/09/13 07:24:08 UTC

[jira] [Updated] (CHUKWA-663) SystemMetrics parse error

     [ https://issues.apache.org/jira/browse/CHUKWA-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated CHUKWA-663:
-------------------------------

    Description: '  (was: In some OS environment, Chukwa Agent cannot fetch all the system data, but in the Chukwa Collect, the format is hard coded, so this will lead to a NullPointException, and this NullPointException will lead to other Exception.

For example:
In my environment, swap information will no be fetched, so in this parse class (org.apache.hadoop.chukwa.extraction.demux.processor.mapper.SystemMetrics.java)
a exception will be thrown:
java.lang.NullPointerException
	at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.SystemMetrics.parse(SystemMetrics.java:156)
	at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.AbstractProcessor.process(AbstractProcessor.java:82)
	at org.apache.hadoop.chukwa.datacollection.writer.hbase.HBaseWriter.add(HBaseWriter.java:194)
	at org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter.add(SocketTeeWriter.java:252)
	at org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter.add(PipelineStageWriter.java:40)
	at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.accept(ServletCollector.java:159)
	at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.doPost(ServletCollector.java:208)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:451)
 In this line: JSONObject swap = (JSONObject) json.get("swap");
the swap object in null, which will lead to next line's exception. And AbstractProcessor class will catch this error in this line: saveChunkInError(e), this error will also be packaged in a KeyValue format, and insert in HBase. But HBase table schema has no column family named "SystemMetricsInError", so another exception will be introduced:

WARN btpool0-54 HBaseWriter - org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: DoNotRetryIOException: 1 time, servers with issues: sr116:60020,
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1591)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:784)
        at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:402)
        at org.apache.hadoop.chukwa.datacollection.writer.hbase.HBaseWriter.add(HBaseWriter.java:195)
        at org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter.add(SocketTeeWriter.java:252)
        at org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter.add(PipelineStageWriter.java:40)
        at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.accept(ServletCollector.java:159)
        at org.apache.hadoop.chukwa.datacollection.collector.servlet.ServletCollector.doPost(ServletCollector.java:208)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:451)

so there exists two problems:
1. until you meet parse error, you have already collect a part of metric data. But after you throw exception, you do not continue parse left parts of data, this will lead to half-integrity data in this timestamp saved in HBase, if your half-integrity data do not include "system:ctags", you will not retrieve cluster name, this really is the problem.
2. org.apache.hadoop.chukwa.extraction.demux.processor.mapper.SystemMetrics    class is not fine designed, some json keys are hard coded in the file, but there is no any assurance that these keys is existed, miss protection code.  
)
    
> SystemMetrics parse error
> -------------------------
>
>                 Key: CHUKWA-663
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-663
>             Project: Chukwa
>          Issue Type: Bug
>          Components: Data Collection
>    Affects Versions: 0.6.0
>         Environment: RHEL6.1 Hadoop 1.0.3 HBase 0.94.0
>            Reporter: Saisai Shao
>
> '

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira