Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2015/10/15 18:09:58 UTC

How to format paths in select: MapR Audit Logs

Hey all -

I am trying to demonstrate a neat use case. Using audit logs in MapR, I'd
like to be able to point Drill at the directory, and just go, no loading of
data, just go.

The problem I am having is how to describe the path. First, here is how the
logs are stored.

From the base of MapRFS:

/var/mapr/local/node1/audit/*.json
/var/mapr/local/node2/audit/*.json
/var/mapr/local/node3/audit/*.json
/var/mapr/local/node4/audit/*.json
/var/mapr/local/node5/audit/*.json

So as you can see, they could be in multiple directories. I'd like to be
able to query all the logs at once without moving the files. (Not sure if
this is possible)

Anywho, here is what I've tried

use dfs.`default`;

select * from `var/mapr/local/*/audit/*.json` limit 10;
and
select * from `var/mapr/local/*/audit/*` limit 10;

This gave me an odd "Range must not be empty, but was [0‥0)" message. (I
do have empty json files... is this an issue?)

Then I tried

select * from `var/mapr/local` where dir1 = 'audit' limit 10;

and that gave me a Validation Error of "Relative path in absolute URI:
clustermetrics.2015-10.07.01:00:00"

This is interesting, as that file is in /var/mapr/local/nodeX/metrics/ and,
based on the dir1 clause, shouldn't even be checked. (I wonder if this is
related to https://issues.apache.org/jira/browse/DRILL-3759?)

Note I've tried all the queries with and without a leading / (can't tell if
that is needed or not).

Any other thoughts on how I can query these files? MapR folks, this would
be an OUTSTANDING use case for showing off auditing and Drill.  And I see
there is a blog at
https://www.mapr.com/blog/changing-game-when-it-comes-auditing-big-data-part-1

which teases at showing off the power of Drill + auditing; however, I
see no part 2 and I am chomping at the bit to show this off as part of the
PoC :)