Posted to user@spark.apache.org by Spark Enthusiast <sp...@yahoo.in> on 2015/08/12 06:53:53 UTC
Not seeing Log messages
I wrote a small Python program:
    def parseLogs(self):
        """Read and parse the log file."""
        self._logger.debug("Parselogs() start")
        self.parsed_logs = (self._sc
                            .textFile(self._logFile)
                            .map(self._parseApacheLogLine)
                            .cache())
        self.access_logs = (self.parsed_logs
                            .filter(lambda s: s[1] == 1)
                            .map(lambda s: s[0])
                            .cache())
        self.failed_logs = (self.parsed_logs
                            .filter(lambda s: s[1] == 0)
                            .map(lambda s: s[0]))
        failed_logs_count = self.failed_logs.count()
        if failed_logs_count > 0:
            self._logger.debug('Number of invalid loglines: %d' % failed_logs_count)
            for line in self.failed_logs.take(20):
                self._logger.debug('Invalid logline: %s' % line)
        self._logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' %
                           (self.parsed_logs.count(), self.access_logs.count(), self.failed_logs.count()))
        return (self.parsed_logs, self.access_logs, self.failed_logs)
    def main(argv):
        try:
            logger = createLogger("pyspark", logging.DEBUG, "LogAnalyzer.log", "./")
            logger.debug("Starting LogAnalyzer")
            myLogAnalyzer = ApacheLogAnalyzer(logger)
            (parsed_logs, access_logs, failed_logs) = myLogAnalyzer.parseLogs()
        except Exception as e:
            print "Encountered Exception %s" % str(e)
        logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' %
                     (parsed_logs.count(), access_logs.count(), failed_logs.count()))
        logger.info("DONE. ALL TESTS PASSED")
I see some of the log messages:

    "Starting LogAnalyzer"
    "Parselogs() start"
    "DONE. ALL TESTS PASSED"

But I do not see this one:

    'Read %d lines, successfully parsed %d lines, failed to parse %d lines'

Instead, when this line runs:

    logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' %
                 (parsed_logs.count(), access_logs.count(), failed_logs.count()))

I get the following error:

    Encountered Exception Cannot pickle files that are not opened for reading

I do not have a clue as to what's happening. Any help will be appreciated.
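[Editor's note: a likely cause, offered as a diagnosis rather than a confirmed answer from the thread. PySpark serializes each function passed to map() and ships it to the workers. Because _parseApacheLogLine is a bound method, serializing it drags in self, including self._logger and its open log-file handle, and open file objects generally cannot be pickled. The sketch below demonstrates that mechanism with plain stdlib pickle and hypothetical names (Analyzer, parse_line); the exact error text differs from Spark's, but the failure mode is the same:]

```python
import os
import pickle
import tempfile

class Analyzer:
    """Toy stand-in for ApacheLogAnalyzer: holds an open file, as a
    logging.FileHandler-backed logger would."""
    def __init__(self, log_path):
        self._log = open(log_path, "w")  # open file handle -> not picklable

    def parse_line(self, line):
        return (line, 1)

def parse_line_free(line):
    """Module-level function: pickled by reference, carries no instance state."""
    return (line, 1)

a = Analyzer(os.path.join(tempfile.mkdtemp(), "demo.log"))

# Pickling the bound method forces pickling of `a` itself, which fails
# on the open file object it holds.
try:
    pickle.dumps(a.parse_line)
    bound_ok = True
except TypeError:
    bound_ok = False

# The free function pickles without trouble.
free_ok = bool(pickle.dumps(parse_line_free))

print(bound_ok, free_ok)
```

If this is indeed the cause, the usual workarounds are to make the parsing function a module-level function instead of a method, or to avoid storing the logger (or any open file handle) on the object whose methods are shipped to the workers.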
Re: Not seeing Log messages
Posted by Spark Enthusiast <sp...@yahoo.in>.
Forgot to mention: here is how I run the program:
./bin/spark-submit --conf "spark.app.master"="local[1]" ~/workspace/spark-python/ApacheLogWebServerAnalysis.py
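[Editor's note: spark.app.master is not a standard Spark configuration key; the master is normally set with the --master flag (or, via --conf, the key spark.master). A sketch of the equivalent invocation, keeping the path from the original post:]

```shell
# Run locally with one core; --master is the documented way to set the master
./bin/spark-submit --master "local[1]" \
    ~/workspace/spark-python/ApacheLogWebServerAnalysis.py
```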