Posted to user@spark.apache.org by Amjad ALSHABANI <as...@gmail.com> on 2016/02/22 09:32:04 UTC
Loading file into executor classpath
Hello everybody,
I've implemented a Loganalyzer program in Spark, which takes the logs from
an Apache log file and translates each line into a given object.
The log format is described by a GROK pattern, so I'm using the GROK library
to extract the desired fields.
When running the application locally, it succeeds without any problem, but
when deploying it to YARN (with multiple nodes) I'm having an issue: the
pattern file cannot be found at
file:/hadoop-disk1/yarn/local/usercache/hadoop/appcache/application_1454418114641_7429/container_1454418114641_7429_01_000002/./myFatJat-jar-with-dependencies.jar!/haproxy_pattern.txt
where haproxy_pattern.txt is the GROK pattern file.
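A note on the path above: a location of the form `...jar!/haproxy_pattern.txt` points to an entry inside the jar, which cannot be opened as a regular file on disk, although it can still be read as a classpath resource. The following is only a minimal sketch of one possible workaround, assuming the grok library is handed a file system path to load its patterns from. The class name `PatternResource` and the method `extractToTempFile` are hypothetical and not part of Spark or grok; the sketch copies a classpath resource into a temporary file whose path can then be passed on.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Spark or the grok library.
public final class PatternResource {

    private PatternResource() {
    }

    // Copies a classpath resource (for example "haproxy_pattern.txt") to a
    // temporary file and returns the path of that file.
    public static Path extractToTempFile(ClassLoader loader, String resourceName)
            throws IOException {
        // getResourceAsStream also works for entries packed inside a jar,
        // where opening the "jar!/..." location as a plain file would fail.
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourceName);
            }
            Path tmp = Files.createTempFile("grok-pattern-", ".txt");
            tmp.toFile().deleteOnExit();
            // createTempFile has already created the file, so overwrite it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        }
    }
}
```

On an executor this could be called as `PatternResource.extractToTempFile(Thread.currentThread().getContextClassLoader(), "haproxy_pattern.txt")`, and the returned path given to whatever code currently expects the pattern file location. Another option is Spark's `--files` option of spark-submit, which ships a local file to the working directory of each executor.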
I submit my jar as follows:
$ spark-submit --master yarn-client \
    --class com.vsct.dt.bigdata.cdn.app.MainRunner \
    --conf spark.driver.extraClassPath=conf/ \
    --conf spark.executor.extraClassPath=conf/ \
    myFatJat-jar-with-dependencies.jar
My haproxy_pattern.txt file exists in the sub-directory conf/.
More details:
The grok API I'm using is:
<groupId>io.thekraken</groupId>
<artifactId>grok</artifactId>
<version>0.1.1</version>
My code looks like:
the map code:
JavaRDD<String> rawLog = sc.textFile(configuration.getInput());
JavaRDD<LogEntry> logEntryRDD = rawLog.map(new Function<String, LogEntry>() {
    private static final long serialVersionUID = 1L;

    @Override
    public LogEntry call(String raw_line) throws Exception {
        grokReader = new SparkGrokReader(configuration);
        LogEntry logEntry = grokReader.read(raw_line);
        return logEntry;
    }
}).cache();
The method that extracts the fields with grok:
public LogEntry read(String raw_line) {
    LogEntry logEntry = null;
    try {
        Match gm = grok.match(raw_line);
        gm.captures();
        logEntry = buildLogentry(gm.toJson());
    } catch (NullPointerException npe) {
        logger.warn("Line could not be parsed by GROK: {}", raw_line);
    }
    return logEntry;
}