Posted to dev@ranger.apache.org by Ramesh Mani <rm...@hortonworks.com> on 2018/01/27 02:05:19 UTC
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63552/
-----------------------------------------------------------
(Updated Jan. 27, 2018, 2:05 a.m.)
Review request for ranger, Don Bosco Durai and Madhan Neethiraj.
Changes
-------
Address review comments.
Repository: ranger
Description
-------
RANGER-1837:Enhance Ranger Audit to HDFS to support ORC file format
Diffs (updated)
-----
agents-audit/README.txt PRE-CREATION
agents-audit/pom.xml c8bd1d8
agents-audit/src/main/java/org/apache/ranger/audit/destination/AuditDestination.java 41d0e82
agents-audit/src/main/java/org/apache/ranger/audit/destination/HDFSAuditDestination.java 66d8504
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditFileCacheProvider.java 314b130
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java 4ce31dd
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditProviderFactory.java 43107ba
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditWriterFactory.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java b095000
agents-audit/src/main/java/org/apache/ranger/audit/provider/DummyAuditProvider.java 05f882f
agents-audit/src/main/java/org/apache/ranger/audit/provider/MiscUtil.java eff3824
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileCacheProviderSpool.java 41513ba
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueue.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueueSpool.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/AbstractAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/JSONWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/Writer.java PRE-CREATION
pom.xml 589cd6a
src/main/assembly/hbase-agent.xml 3ebc334
src/main/assembly/hdfs-agent.xml 5279a9a
src/main/assembly/hive-agent.xml ca65c80
src/main/assembly/knox-agent.xml 8357d49
src/main/assembly/plugin-atlas.xml fd98811
src/main/assembly/plugin-kafka.xml 95855d9
src/main/assembly/plugin-kms.xml 6d15f2a
src/main/assembly/plugin-solr.xml de30bfb
src/main/assembly/plugin-sqoop.xml d2bd69a
src/main/assembly/plugin-yarn.xml c6a48e8
src/main/assembly/storm-agent.xml 64224ec
Diff: https://reviews.apache.org/r/63552/diff/5/
Changes: https://reviews.apache.org/r/63552/diff/4-5/
Testing
-------
Testing done locally.
Thanks,
Ramesh Mani
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
Posted by Kevin Risden <kr...@apache.org>.
> On Nov. 6, 2018, 9:31 a.m., Kevin Risden wrote:
> >
Changes look reasonable. Only 2 comments, and they are more optimizations than anything else.
- Kevin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63552/#review210350
-----------------------------------------------------------
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
Posted by Kevin Risden <kr...@apache.org>.
agents-audit/README.txt
Lines 80 (patched)
<https://reviews.apache.org/r/63552/#comment295017>
It might be worth declaring the table as PARTITIONED BY year/month/day. Otherwise, every query will need to scan ALL ORC files. I assume nothing changes here about Ranger audits being written to HDFS in per-day folders, right?
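To make the partitioning suggestion concrete, here is a minimal Python sketch that generates one Hive ALTER TABLE ... ADD PARTITION statement per day folder. It assumes the ranger_audit_event table was declared with a PARTITIONED BY clause on a single string column, and that audits land in <base>/<yyyyMMdd> folders. The column name "dt", the folder layout, and the helper name are illustrative assumptions, not something taken from the patch.

```python
from datetime import date, timedelta


def add_partition_statements(table, base_location, start, days, column="dt"):
    """Build one ALTER TABLE ... ADD PARTITION statement per day folder.

    Assumes (hypothetically) that each day's audits live in
    <base_location>/<yyyyMMdd> and that the table is partitioned by `column`.
    """
    base = base_location.rstrip("/")
    statements = []
    for offset in range(days):
        day = (start + timedelta(days=offset)).strftime("%Y%m%d")
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({column}='{day}') "
            f"LOCATION '{base}/{day}';"
        )
    return statements


# Print statements for two consecutive days; run them in Hive after creating the table.
for stmt in add_partition_statements(
    "ranger_audit_event", "/ranger/audit/hdfs", date(2018, 1, 26), 2
):
    print(stmt)
```

With partitions registered this way, a query that filters on the partition column only has to read the matching day folders instead of every ORC file.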
agents-audit/pom.xml
Lines 60 (patched)
<https://reviews.apache.org/r/63552/#comment295018>
Is there a chance this causes a dependency conflict when it is put on the classpath of other projects? I thought hive-exec pulled in a bunch of extra dependencies.
- Kevin Risden
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
Posted by Velmurugan Periasamy <vp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63552/#review222758
-----------------------------------------------------------
Ship it!
- Velmurugan Periasamy
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
Posted by Ramesh Mani <rm...@hortonworks.com>.
(Updated March 31, 2021, 6:01 p.m.)
Review request for ranger, Don Bosco Durai, Abhay Kulkarni, Madhan Neethiraj, Mehul Parikh, Selvamohan Neethiraj, Sailaja Polavarapu, and Velmurugan Periasamy.
Changes
-------
Rebased to include HFlushCapableStream check
Bugs: RANGER-1837
https://issues.apache.org/jira/browse/RANGER-1837
Repository: ranger
Description
-------
RANGER-1837:Enhance Ranger Audit to HDFS to support ORC file format
Diffs (updated)
-----
agents-audit/pom.xml b9f6af27c
agents-audit/src/main/java/org/apache/ranger/audit/destination/HDFSAuditDestination.java 5e6f40226
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java 4ce31dd09
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditProviderFactory.java 6b7f4b00b
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditWriterFactory.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java 54f37644b
agents-audit/src/main/java/org/apache/ranger/audit/provider/DummyAuditProvider.java 05f882ff3
agents-audit/src/main/java/org/apache/ranger/audit/provider/MiscUtil.java e2b74489b
agents-audit/src/main/java/org/apache/ranger/audit/provider/MultiDestAuditProvider.java 282f5abfa
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileCacheProviderSpool.java 41513ba40
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueue.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueueSpool.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/AbstractRangerAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerJSONAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerORCAuditWriter.java PRE-CREATION
Diff: https://reviews.apache.org/r/63552/diff/7/
Changes: https://reviews.apache.org/r/63552/diff/6-7/
Testing
-------
Testing done locally.
ORC FILE FORMAT in HDFS Ranger Audit log with local audit file store as source for HDFS audit:
NOTE: When this is done, each record in the local file is read to create the ORC file.
1. Enable Ranger Audit to HDFS in ORC file format using AuditFileQueue
- To enable Ranger Audit to HDFS with ORC format, we first need to enable AuditFileQueue to spool the audits to the local disk.
* On the NameNode host, create the spool directory and make sure the owner of the service for which the Ranger plugin is enabled has read/write/execute access to the path (e.g. hdfs:hadoop for the HDFS service, hive:hadoop for the Hive service, etc.)
$ mkdir -p /var/log/hadoop/audit/staging/spool
$ cd /var/log/hadoop/audit/staging
$ chown hdfs:hadoop spool
* Enable AuditFileQueue via the following params in ranger-<component>-audit.xml
xasecure.audit.destination.hdfs.batch.queuetype=filequeue (NOTE: default = memqueue, where an in-memory queue/buffer is used instead of a local file buffer)
xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300 (This determines the batch size of the ORC file that is created)
xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool (This is the local staging directory for audits)
xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10000 (This determines the batch size for ORC file creation, along with the rollover.sec parameter)
2. Enable ORC file format for Ranger HDFS Audit.
- This is done by setting the following param in ranger-<component>-audit.xml. The default value is "json".
xasecure.audit.destination.hdfs.filetype=orc ( default = json )
3. Provision to control the compression technique for the ORC format. The default is 'snappy'.
xasecure.audit.destination.hdfs.orc.compression=snappy|lzo|zlip|none
4. Buffer size and stripe size of the ORC file batch. The defaults are '10000' bytes and '100000' bytes respectively. These determine the batch size of the ORC file in HDFS.
xasecure.audit.destination.hdfs.orc.buffersize= (value in bytes)
xasecure.audit.destination.hdfs.orc.stripesize= (value in bytes)
5. Hive query to create an ORC table with the default 'snappy' compression.
CREATE EXTERNAL TABLE ranger_audit_event (
repositoryType int,
repositoryName string,
reqUser string,
evtTime string,
accessType string,
resourcePath string,
resourceType string,
action string,
accessResult string,
agentId string,
policyId bigint,
resultReason string,
aclEnforcer string,
sessionId string,
clientType string,
clientIP string,
requestData string,
clusterName string
)
STORED AS ORC
LOCATION '/ranger/audit/hdfs'
TBLPROPERTIES ("orc.compress"="SNAPPY");
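The parameters from steps 1 to 3 above are listed as key=value pairs, while ranger-<component>-audit.xml is normally a Hadoop-style configuration file. As a rough sketch, assuming the usual <configuration>/<property>/<name>/<value> layout, the following Python snippet renders those settings as XML. The helper name to_hadoop_xml and the PREFIX constant are illustrative only; the values are the ones quoted above and should be adjusted per cluster.

```python
import xml.etree.ElementTree as ET

PREFIX = "xasecure.audit.destination.hdfs."

# Values copied from the steps above; adjust them for your cluster.
ORC_AUDIT_PROPS = {
    PREFIX + "batch.queuetype": "filequeue",
    PREFIX + "batch.filequeue.filespool.file.rollover.sec": "300",
    PREFIX + "batch.filequeue.filespool.dir": "/var/log/hadoop/audit/staging/spool",
    PREFIX + "batch.filequeue.filespool.buffer.size": "10000",
    PREFIX + "filetype": "orc",
    PREFIX + "orc.compression": "snappy",
}


def to_hadoop_xml(props):
    """Render a dict as Hadoop-style <configuration><property> XML text."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(to_hadoop_xml(ORC_AUDIT_PROPS))
```

The generated <property> elements would be merged into the existing audit configuration of the component rather than replacing the whole file.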
-------------------------
JSON FILE FORMAT in HDFS Ranger Audit log with local audit file store as source for HDFS audit:
NOTE: When this is done, each local file is copied in its entirety to the HDFS destination. This lets us generate larger Ranger audit files in HDFS, which is preferred.
1. Enable Ranger Audit to HDFS in JSON file format using AuditFileQueue
- To enable Ranger Audit to HDFS with JSON format and a local file cache, we first need to enable AuditFileQueue to spool the audits locally.
* On the NameNode host, create the spool directory and make sure the owner of the service for which the Ranger plugin is enabled has read/write/execute access to the path (e.g. hdfs:hadoop for the HDFS service, hive:hadoop for the Hive service, etc.)
$ mkdir -p /var/log/hadoop/audit/staging/spool
$ cd /var/log/hadoop/audit/staging
$ chown hdfs:hadoop spool
* Enable AuditFileQueue via the following params in ranger-<component>-audit.xml
xasecure.audit.destination.hdfs.batch.queuetype=filequeue (NOTE: default = memqueue, where an in-memory queue/buffer is used instead of a local file buffer)
xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300 (This determines the size of the JSON file that is copied to HDFS)
xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool (This is the local staging directory for audits)
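Since both the ORC and JSON setups depend on the spool directory being readable, writable and executable by the service user, a quick check before restarting the service may help. The following is a minimal Python sketch, not part of the patch: spool_dir_problems is a hypothetical helper, and it only reports on the user running it, so it would need to be run as the service user (e.g. hdfs). The demo uses a temporary directory rather than the real /var/log/hadoop path.

```python
import os
import tempfile


def spool_dir_problems(path):
    """Return a list of problems; an empty list means the directory looks usable."""
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    problems = []
    # The spool owner needs read, write and execute access, as noted above.
    for flag, label in ((os.R_OK, "read"), (os.W_OK, "write"), (os.X_OK, "execute")):
        if not os.access(path, flag):
            problems.append(f"{path} lacks {label} access for the current user")
    return problems


# Demo against a throwaway directory tree instead of the real staging path.
with tempfile.TemporaryDirectory() as tmp:
    spool = os.path.join(tmp, "audit", "staging", "spool")
    print(spool_dir_problems(spool))  # directory not created yet
    os.makedirs(spool)
    print(spool_dir_problems(spool))  # freshly created, owned by the current user
```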
Thanks,
Ramesh Mani
Re: Review Request 63552: RANGER-1837:Enhance Ranger Audit to HDFS to
support ORC file format
Posted by Ramesh Mani <rm...@hortonworks.com>.
(Updated Feb. 4, 2021, 12:49 a.m.)
Review request for ranger, Don Bosco Durai, Abhay Kulkarni, Madhan Neethiraj, Mehul Parikh, Selvamohan Neethiraj, Sailaja Polavarapu, and Velmurugan Periasamy.
Changes
-------
Revised patch based on comments and testing done.
Bugs: RANGER-1837
https://issues.apache.org/jira/browse/RANGER-1837
Repository: ranger
Description
-------
RANGER-1837:Enhance Ranger Audit to HDFS to support ORC file format
Diffs (updated)
-----
agents-audit/pom.xml 85dd550ad
agents-audit/src/main/java/org/apache/ranger/audit/destination/HDFSAuditDestination.java 906ff341f
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java 4ce31dd09
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditProviderFactory.java f971a76f0
agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditWriterFactory.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java 6138ca0eb
agents-audit/src/main/java/org/apache/ranger/audit/provider/DummyAuditProvider.java 05f882ff3
agents-audit/src/main/java/org/apache/ranger/audit/provider/MiscUtil.java e2b74489b
agents-audit/src/main/java/org/apache/ranger/audit/provider/MultiDestAuditProvider.java 282f5abfa
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileCacheProviderSpool.java 41513ba40
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueue.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueueSpool.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/AbstractRangerAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerJSONAuditWriter.java PRE-CREATION
agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerORCAuditWriter.java PRE-CREATION
Diff: https://reviews.apache.org/r/63552/diff/6/
Changes: https://reviews.apache.org/r/63552/diff/5-6/
Testing (updated)
-------
Testing done locally.
Thanks,
Ramesh Mani