Posted to user@spark.apache.org by Spark Enthusiast <sp...@yahoo.in> on 2015/08/12 06:53:53 UTC

Not seeing Log messages

I wrote a small Python program:
def parseLogs(self):
    """ Read and parse log file """
    self._logger.debug("Parselogs() start")
    self.parsed_logs = (self._sc
                        .textFile(self._logFile)
                        .map(self._parseApacheLogLine)
                        .cache())

    self.access_logs = (self.parsed_logs
                        .filter(lambda s: s[1] == 1)
                        .map(lambda s: s[0])
                        .cache())

    self.failed_logs = (self.parsed_logs
                        .filter(lambda s: s[1] == 0)
                        .map(lambda s: s[0]))
    failed_logs_count = self.failed_logs.count()
    if failed_logs_count > 0:
        self._logger.debug('Number of invalid logline: %d' % self.failed_logs.count())

        for line in self.failed_logs.take(20):
            self._logger.debug('Invalid logline: %s' % line)

    
    self._logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' %
                       (self.parsed_logs.count(), self.access_logs.count(), self.failed_logs.count()))

    return (self.parsed_logs, self.access_logs, self.failed_logs)


def main(argv):
    try:
        logger = createLogger("pyspark", logging.DEBUG, "LogAnalyzer.log", "./")
        logger.debug("Starting LogAnalyzer")
        myLogAnalyzer =  ApacheLogAnalyzer(logger)
        (parsed_logs, access_logs, failed_logs) = myLogAnalyzer.parseLogs()
    except Exception as e:
        print "Encountered Exception %s" %str(e)

    logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' % 
                   (parsed_logs.count(), access_logs.count(), failed_logs.count()))
    logger.info("DONE. ALL TESTS PASSED")

I see some of the log messages:
"Starting LogAnalyzer"
"Parselogs() start"
"DONE. ALL TESTS PASSED"

But I do not see others, such as:
'Read %d lines, successfully parsed %d lines, failed to parse %d lines'

And for this line:
logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' %
                   (parsed_logs.count(), access_logs.count(), failed_logs.count()))
I get the following error:
Encountered Exception Cannot pickle files that are not opened for reading

I do not have a clue as to what's happening. Any help will be appreciated.


Re: Not seeing Log messages

Posted by Spark Enthusiast <sp...@yahoo.in>.
I forgot to mention how I run the program:
./bin/spark-submit --conf "spark.app.master"="local[1]" ~/workspace/spark-python/ApacheLogWebServerAnalysis.py
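
The thread itself contains no diagnosis. One plausible explanation, offered here only as an assumption: .map(self._parseApacheLogLine) passes a bound method, so PySpark has to serialize self along with it, including self._logger, whose file handler holds LogAnalyzer.log opened for writing. That would be consistent with the message "Cannot pickle files that are not opened for reading". The sketch below is not the poster's code. parse_apache_log_line, LOG_PATTERN and stream_is_picklable are hypothetical names, and the regular expression is a guess at the Apache Common Log Format, since the original _parseApacheLogLine is not shown. It keeps the parser at module level so that nothing holding a logger needs to be shipped, and it checks that an open log file stream cannot be pickled with the standard pickle module.

```python
import logging
import os
import pickle
import re
import tempfile

# Hypothetical pattern for the Apache Common Log Format; the parser used in
# the original post is not shown in the thread.
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)'
)


def parse_apache_log_line(line):
    """Return (fields, 1) if the line matches, otherwise (line, 0).

    This mirrors the s[1] == 1 / s[1] == 0 convention used by the filters in
    the original post. It is a plain function, not a bound method, so it
    carries no reference to an object that holds a logger.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return (line, 0)
    return (match.groups(), 1)


def stream_is_picklable(handler):
    """Return True if the handler's open file stream can be pickled."""
    try:
        pickle.dumps(handler.stream)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    # A FileHandler opens its log file for appending, much like the logger
    # in the post presumably does for LogAnalyzer.log.
    log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
    handler = logging.FileHandler(log_path)
    print("stream picklable:", stream_is_picklable(handler))
    handler.close()

    sample = ('127.0.0.1 - - [01/Aug/1995:00:00:01 -0400] '
              '"GET /index.html HTTP/1.0" 200 1839')
    print(parse_apache_log_line(sample))
```

If this assumption holds, the Spark pipeline could call self._sc.textFile(self._logFile).map(parse_apache_log_line) and keep all self._logger calls on the driver side, outside any function passed to map or filter.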

