Posted to dev@ranger.apache.org by Velmurugan Periasamy <vp...@hortonworks.com> on 2016/04/15 03:04:08 UTC
Re: Need Help to choose Apache Ranger

Thanks Bosco for the explanation.

Adding Ranger Dev group.

From: Don Bosco Durai <bo...@apache.org>
Reply-To: "user@ranger.incubator.apache.org" <us...@ranger.incubator.apache.org>
Date: Wednesday, April 13, 2016 at 2:30 PM
To: Rehan Ahmed Ch <ch...@gmail.com>
Cc: "user@ranger.incubator.apache.org" <us...@ranger.incubator.apache.org>
Subject: Re: Need Help to choose Apache Ranger

Copying Ranger user group...

I assume you are asking about storing audit records. Ranger currently supports four destinations (Solr, HDFS, DB, and Log4j appender). The framework is extensible, so you can write your own custom destination. We also had a Kafka destination in the previous release; in this release, however, we are asking users to use the Log4j appender instead. For example, you can use the Kafka log4j appender to send audits to Kafka. Similarly, I know of users who use the log4j TCP appender to send audits to their custom application.

Regardless of which destination you use, the following features are available:

  1.  Ranger plugins have a built-in mechanism to send audits reliably to the destination. If the destination is down, the plugin writes to a local file and resumes sending when the destination is available again.
  2.  If the destination is slower than the rate at which audits are generated, the plugin spools to a local file and throttles the writing. It will eventually send all the audits (the local spool size is configurable and depends on available disk space).
  3.  If you are using components like HBase, Kafka, or Solr, which generate a very large number of audit records, the plugin summarizes the audits at the source based on unique user+request and sends the summarized audits.
  4.  It uses a separate queue and spool file for each destination. So if you have destinations that run at different speeds (e.g. Solr vs. HDFS), you will not lose audits, and the faster destinations will get audit records sooner.
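
To make items 1, 3, and 4 above concrete, here is a minimal sketch in Python. It is not Ranger's actual code (the plugins are written in Java); the names summarize, SpoolingDestination, send, and spool_path are all hypothetical, and the event fields "user" and "request" are assumed only for illustration. It shows audits being collapsed per unique user+request, and a per-destination wrapper that spools to a local file while the destination is down and drains the spool when it comes back.

```python
import json
import os
from collections import OrderedDict


def summarize(events):
    """Collapse events sharing the same user and request into one record with a count."""
    summary = OrderedDict()
    for event in events:
        key = (event["user"], event["request"])
        if key not in summary:
            summary[key] = {"user": event["user"], "request": event["request"], "count": 0}
        summary[key]["count"] += 1
    return list(summary.values())


class SpoolingDestination:
    """Hypothetical wrapper: create one instance (and one spool file) per destination."""

    def __init__(self, send, spool_path):
        self.send = send              # assumed: callable that raises OSError when the destination is down
        self.spool_path = spool_path  # local file used while the destination is unavailable

    def write(self, event):
        # Drain older spooled events first so delivery order is preserved.
        if self.flush_spool():
            try:
                self.send(event)
                return
            except OSError:
                pass  # destination is down; fall through and spool the event
        with open(self.spool_path, "a", encoding="utf-8") as spool:
            spool.write(json.dumps(event) + "\n")

    def flush_spool(self):
        """Resend spooled events; return True if the spool is empty afterwards."""
        if not os.path.exists(self.spool_path):
            return True
        with open(self.spool_path, encoding="utf-8") as spool:
            pending = [json.loads(line) for line in spool if line.strip()]
        while pending:
            try:
                self.send(pending[0])
            except OSError:
                break  # still down; keep the remaining events spooled
            pending.pop(0)
        if pending:
            with open(self.spool_path, "w", encoding="utf-8") as spool:
                for event in pending:
                    spool.write(json.dumps(event) + "\n")
            return False
        os.remove(self.spool_path)
        return True
```

Because each destination gets its own instance and spool file, a slow or unavailable destination only delays its own audits. The sketch leaves out throttling and the spool size limit mentioned in item 2.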

That said, you need to decide what you want to do with the audits and pick the appropriate destinations. The Ranger Admin UI supports only Solr and DB for viewing audits, and we will drop support for DB in the next release. So if you are not going to use Ranger Admin to view the audit records, you don't have to send them to Solr either.

We chose Solr for the following reasons:

  1.  Can scale to billions of documents
  2.  Transparent, native support for sharding and replication
  3.  Easy to add columns; it also auto-creates missing columns (like NoSQL). In an RDBMS, ALTER TABLE on a large amount of data just doesn't work
  4.  Great searching capabilities
  5.  Native dashboard features like faceting, etc.
  6.  Easy to write your own custom application on top of it
  7.  Apache open source

In other words, Solr is a great product on its own :-)

Thanks

Bosco



From: Rehan Ahmed Ch <ch...@gmail.com>
Date: Monday, April 11, 2016 at 2:10 AM
To: Don Bosco Durai <bo...@apache.org>
Subject: Re: Need Help to choose Apache Ranger

Hi Don,

Can you please suggest an alternative to "Solr" in case we opt to implement Ranger?

On Sun, Apr 10, 2016 at 12:09 AM, Rehan Ahmed Ch <ch...@gmail.com> wrote:

Thank you very much dear Don Bosco for your outright response. Much obliged.

I will let you know if I need any further guidance. Thank you so much.

--
Truly,
Rehan Ahmed


