Posted to dev@lens.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2015/07/13 08:13:04 UTC

[jira] [Commented] (LENS-602) Easy access to per-query lens server logs

    [ https://issues.apache.org/jira/browse/LENS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624253#comment-14624253 ] 

Amareshwari Sriramadasu commented on LENS-602:
----------------------------------------------

Now that we have log segregation ids for each request, the API can be generic enough to get logs for a <logsegregationid>. The logsegregationid is a unique id for each REST request; it is the same as the query handle for queries and the prepare handle for prepared queries.

*API* :
I'm thinking about adding the API as:
GET on <lens-url>/logs/{logsegregationid} : would return the matching logs as an output stream.

*Implementation approaches* :

Here are the approaches I'm thinking about :

# The API call will actually do a grep <logsegregationid> ${lens.log.dir}/<lens-log-file>.* and return the result as an output stream.
## I still have to explore the options for streaming shell output as the output stream of an HTTP response.
## This approach will allow users to give other parameters as well instead of the logsegregationid, such as a timestamp, threadID/threadName or any other grep pattern.
# Logs corresponding to each <logsegregationid> are logged into a separate file, and that file is served as an attachment on the API call.
## We would need a purging policy for these files. And if the number of files is huge, we might hit the underlying OS limit on the number of files in a directory. We will have to think of putting them in separate directories based on time or number of files.
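
For the second approach, the directory layout and the purging policy could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name PerRequestLogFiles, the hourly yyyy-MM-dd/HH bucketing and the age-based purge are all hypothetical choices, not something that exists in Lens today.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lens: one log file per logsegregationid,
// bucketed by hour so that no single directory grows without bound.
public class PerRequestLogFiles {
    private static final DateTimeFormatter HOUR_BUCKET =
        DateTimeFormatter.ofPattern("yyyy-MM-dd/HH").withZone(ZoneOffset.UTC);

    // Resolves <root>/<yyyy-MM-dd>/<HH>/<logSegregationId>.log (time in UTC).
    public static Path logFileFor(Path root, String logSegregationId, Instant requestTime) {
        return root.resolve(HOUR_BUCKET.format(requestTime))
                   .resolve(logSegregationId + ".log");
    }

    // Deletes every regular file under root that was last modified before
    // the cutoff, and returns the paths that were deleted.
    public static List<Path> purgeOlderThan(Path root, Instant cutoff) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            // Collect first, so nothing is deleted while the tree is being walked.
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        List<Path> deleted = new ArrayList<>();
        for (Path file : files) {
            if (Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                Files.delete(file);
                deleted.add(file);
            }
        }
        return deleted;
    }
}
```

A scheduled task on the server could call purgeOlderThan periodically. Bucketing by number of files instead of by time would need a counter per directory, which this sketch does not attempt; it also leaves empty bucket directories behind.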

*Feature turnoff*:

We should have a way to turn off the feature "Serving logs over REST" at server deployment. Serving logs ties up request-serving threads and, since the logs can be huge, can use a lot of network bandwidth, so some deployments might want to turn the feature off.

*Get underlying driver's execution logs*:

To get the underlying driver's execution logs, we should be able to serve them whenever the driver provides them. HiveServer2 provides a thrift API to fetch the logs corresponding to an operation, before the operation is closed. So HiveDriver can fetch the operation logs and log them in the lens server, with the logsegregationid (the query handle) added to the fetched logs. Again, all this would be configurable, because it can cause HiveServer2 to get overloaded.

Thoughts?

I'm inclined towards the option of running grep on lens.log.dir and serving the result as an output stream over REST, and will try that out first.
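
A minimal sketch of the grep-and-stream option, done in plain Java rather than by shelling out to grep, could look like the following. The class name LogGrep, the method name and the file-glob parameter are assumptions made for illustration only; the real implementation may well differ.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lens: filters log lines by logsegregationid
// and writes them to an output stream as they are found.
public class LogGrep {
    // Writes every line containing logSegregationId, from all files in logDir
    // whose names match fileGlob, to out. Returns the number of lines written.
    public static long streamMatchingLines(Path logDir, String fileGlob,
                                           String logSegregationId, OutputStream out)
            throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(logDir, fileGlob)) {
            for (Path file : matches) {
                files.add(file);
            }
        }
        // Sorted by name for a deterministic order; not necessarily chronological.
        Collections.sort(files);

        long matched = 0;
        BufferedWriter writer =
            new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        for (Path file : files) {
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.contains(logSegregationId)) {
                        writer.write(line);
                        writer.write('\n');
                        matched++;
                    }
                }
            }
        }
        // Flush but do not close: the caller owns the output stream.
        writer.flush();
        return matched;
    }
}
```

In a JAX-RS resource this method could be called from inside a StreamingOutput, so that lines are written to the HTTP response as they are read instead of being buffered in memory. Like grep, it only returns lines that contain the id, so continuation lines of a stack trace that do not carry the id would be dropped.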

Everyone's thoughts, comments and suggestions are welcome.

> Easy access to per-query lens server logs
> -----------------------------------------
>
>                 Key: LENS-602
>                 URL: https://issues.apache.org/jira/browse/LENS-602
>             Project: Apache Lens
>          Issue Type: New Feature
>            Reporter: Angad Singh
>            Assignee: Amareshwari Sriramadasu
>              Labels: Hackathon-July
>
> Right now one has to have access to the lens server machine and find a lens query's job logs manually in lensserver.log file. This is neither scalable nor user-friendly for a shared multi-tenanted lens server.
> Just throwing server-exceptions to the client or showing the job ID is also often not enough. Even when the query succeeds, one needs to see how candidate fact tables and their columns, etc. were pruned and how join chains were resolved, for example. That is only possible by seeing the lens server logs.
> Instead of that, this ticket is to propose that lens store logs for each lens query in a different log file on the server and that there be a REST end point to access a query's log (by query ID). That URL can be pasted on the client shell when the query is launched and the user can see a tailed log of all that lens is doing behind the scenes.


