Posted to user@livy.apache.org by Dimosthenis Masouros <de...@gmail.com> on 2017/07/04 20:09:39 UTC

Livy Automatically Deletes Log Files

Good evening,

Before I state my problem, for reference, my livy.conf file looks like
this:

livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.server.recovery.mode = recovery
livy.server.recovery.state-store = filesystem
livy.server.recovery.state-store.url = file:///root/hdfs/livy
livy.server.request-log-retain.days = 5

I have set up a Livy server on a Hadoop namenode. When I submit a new job,
everything operates normally and the job is submitted successfully.

curl -X POST --data '{"file": "/pi.py", "args": ["10"], "name": "test
livy"}' -H "Content-Type: application/json" namenode:8998/batches

After the job is submitted, I am able to see the status of the application
by running (this is a GET, with id being the batch id returned above):

curl namenode:8998/batches/id

which returns a JSON document with the id of the job, its state, etc. If I
list the files in the /root/hdfs/livy folder on the namenode while the job
is executing, I can see a file whose content matches the curl response.
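
For illustration, a shortened status response while the job is running
looks roughly like this (the appId value here is made up):

{"id": 0, "state": "running", "appId": "application_1499199999999_0001", "log": []}
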
However, the problem I am experiencing is that when the job finishes its
execution, after some period of time (~10-15 mins) the log file of the job
is automatically deleted from the host. Is there any way I can keep this
log file? Or, alternatively, is there any way to find the status of an
application submitted with Livy, using a curl request (or anything similar)?


Thanks for your support.

Re: Livy Automatically Deletes Log Files

Posted by Saisai Shao <sa...@gmail.com>.
IIUC, you want to get the batch session state even after the application
finished long ago, so you're trying to read the state from the recovery
log. I don't think that is the right approach; in any case, session
metadata and the session recovery log will be deleted after a timeout. If
you want to extend this expiry, you can set
"livy.server.session.state-retain.sec" to a larger value; by default it is
600 seconds.
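
For example, to retain session state for a day instead of ten minutes,
something like this in livy.conf should work (the one-day value is just an
illustration):

livy.server.session.state-retain.sec = 86400s

With that set, the state of a finished batch should still be queryable
through the REST API for a day:

curl namenode:8998/batches/id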
