Posted to hdfs-user@hadoop.apache.org by ni...@free.fr on 2015/10/10 18:16:00 UTC

Applicative logs in YARN

Hello,
I have some Spark Streaming jobs listening to Kafka, deployed on YARN.
I use the variable ${spark.yarn.app.container.log.dir} in my log4j configuration in order to write my logs.
It works fine: the logs are written correctly and aggregated into HDFS.
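For reference, this is roughly the kind of log4j.properties I mean — a minimal sketch, assuming a plain FileAppender; the file name "spark.log" and the pattern are my own choices, YARN substitutes the container's local log directory for the variable:

```properties
# Sketch of a log4j.properties for a Spark-on-YARN container.
# ${spark.yarn.app.container.log.dir} is resolved by YARN to the
# container's local log directory; "spark.log" is an assumed file name.
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```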
But I have two issues with this approach:
1/ I need to retrieve those logs in real time to load them into ELK and build Kibana dashboards.
Usually I use syslog and Logstash for that, but since the directories of my logs change every time, that is not possible here.
2/ The logs aggregated in HDFS are not easily readable; I have to use "yarn logs".
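That is, to read the aggregated logs I have to go through the CLI; the application ID below is just a placeholder:

```shell
# Read the aggregated logs of a finished application from HDFS.
# The application ID is a placeholder for illustration.
yarn logs -applicationId application_1444480000000_0001
```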

So, what is the best practice for my requirement:
write the logs from each datanode and load them into ELK?
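One thing I considered is pointing Logstash at the NodeManager's local log directories with a glob, so the changing per-container directory names would not matter — a rough sketch, assuming a default NodeManager layout (the path below is an assumption, it depends on yarn.nodemanager.log-dirs):

```conf
# Hypothetical Logstash pipeline reading YARN container logs in place.
# The glob covers the application_*/container_* directories YARN creates;
# the base path and Elasticsearch host are assumptions.
input {
  file {
    path => "/var/log/hadoop-yarn/userlogs/application_*/container_*/*.log"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```

But I am not sure this is the recommended approach, hence my question.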

Thanks a lot for your support
Nicolas