Posted to dev@eagle.apache.org by Hao Chen <ha...@apache.org> on 2016/02/20 06:43:15 UTC

Re: Convert MapR's audit log to the format that Eagle support

Hi Daniel,

I think the 2nd approach is better. Eagle currently supports the community
version's audit log format, but it can easily be extended to other formats
such as MapR's. We also should not assume that MapR's Hadoop audit log
format depends on any other Hadoop distribution's audit log format, so the
best path is to directly provide an additional audit log parser for MapR's
audit logs and enhance Eagle to evaluate the new type of message.
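A parser along these lines might be sketched as below. It maps a MapR JSON
audit record (after name expansion) onto the community-style HDFS audit log
fields that Eagle already understands. The field layout of the community
audit line and the GETATTR-to-getfileinfo mapping are assumptions to verify
against Eagle's actual parser; only the MapR JSON keys come from the
examples in this thread.

```python
import json

# Hypothetical mapping from MapR operation names to HDFS audit "cmd" values;
# only GETATTR appears in this thread, the rest would need to be filled in.
OPERATION_TO_CMD = {"GETATTR": "getfileinfo"}

def mapr_to_hdfs_audit(line):
    """Convert one MapR JSON audit record (post-expansion, so it carries
    'user' and 'srcPath') into a community-style HDFS audit log line."""
    rec = json.loads(line)
    cmd = OPERATION_TO_CMD.get(rec["operation"], rec["operation"].lower())
    return ("{ts} INFO FSNamesystem.audit: allowed={allowed}\t"
            "ugi={ugi}\tip=/{ip}\tcmd={cmd}\tsrc={src}\tdst=null\tperm=null"
            ).format(
        ts=rec["timestamp"]["$date"],  # e.g. "2015-06-06T13:02:23.746Z"
        allowed="true" if rec.get("status", 0) == 0 else "false",
        ugi=rec.get("user", rec.get("uid")),   # fall back to raw uid
        ip=rec["ipAddress"],
        cmd=cmd,
        src=rec.get("srcPath", rec.get("srcFid")),
    )
```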

Regards,
Hao

On Sat, Feb 20, 2016 at 7:58 AM, Daniel Zhou <Da...@dataguise.com>
wrote:

> Hi, all,
>
> I'm trying to test Eagle on MapR; the problem is that MapR's HDFS audit
> log format is quite different from what Eagle supports.
> What I'm trying to do now is to get the audit log messages from MapR and
> convert them, in real time, to the format that Eagle currently supports. I
> have come up with two approaches but need some help or advice.
>
> Here is the problem:
> In MapR's audit log, many parameters are stored as IDs, for example:
>
> {"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR","uid":"1","ipAddress":"10.10.104.53","srcFid":"2049.652.263696","volumeId":68048396,"status":0}
>
> To convert those IDs to human-readable names, MapR provides a utility
> called "expandaudit" to process these audit log files, which requires
> cluster resources such as memory and CPU. However, it is not a real-time
> processing tool.
> After the conversion, the above JSON record becomes:
>
> {"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR","user":"userA","uid":"1","ipAddress":"10.10.104.53","srcPath":"/customers/US_Western_Region.json","srcFid":"2049.3296.268968","volumeName":"data_analysis","volumeId":68048396,"status":0}
>
>
> My two possible approaches:
>
> 1. Get the audit logs and process them in real time, converting them to
> the format that Eagle supports.
>
> 2. Or write a converter so that when a user sets policy parameters, the
> human-readable names are translated to IDs, and Eagle then evaluates the
> HDFS audit log messages based on IDs.
>
> Any advice will be appreciated.
>
>
>
> Thanks and regards,
>
> Daniel
>
>
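The second approach could be sketched as below: a converter that rewrites a
policy's human-readable user name into the numeric uid found in raw MapR
audit records. Only the uid mapping is shown, via the standard passwd
database; MapR volume and file IDs live in MapR's own metadata (what
expandaudit consults), so this is a partial illustration and the policy
dictionary shape is a hypothetical example, not Eagle's actual policy model.

```python
import pwd

def policy_names_to_ids(policy):
    """Rewrite a policy's human-readable 'user' field into a numeric 'uid'
    string, so the policy can be matched against raw (unexpanded) MapR
    audit records. Other ID kinds (volumeId, srcFid) are MapR-specific
    and are left untouched here."""
    converted = dict(policy)
    if "user" in converted:
        # Look up the uid in the local passwd database; assumes the
        # cluster nodes resolve the same accounts as this host.
        converted["uid"] = str(pwd.getpwnam(converted.pop("user")).pw_uid)
    return converted
```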

Re: Convert MapR's audit log to the format that Eagle support

Posted by Daniel Zhou <Da...@dataguise.com>.
Thank you for your advice.
Could you tell me where to find the files that store the relationships between the IDs and their human-readable names?

Sent from my iPhone
