Posted to issues@impala.apache.org by "bharath v (JIRA)" <ji...@apache.org> on 2018/01/09 01:29:01 UTC

[jira] [Resolved] (IMPALA-6348) Redact only sensitive fields in runtime profile

     [ https://issues.apache.org/jira/browse/IMPALA-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bharath v resolved IMPALA-6348.
-------------------------------
       Resolution: Fixed
         Assignee: bharath v
    Fix Version/s: Impala 2.12.0

https://github.com/apache/impala/commit/6a87eb20a5d55b0a2f6f9102375ff8da4b98ccba

> Redact only sensitive fields in runtime profile
> -----------------------------------------------
>
>                 Key: IMPALA-6348
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6348
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.10.0, Impala 2.11.0
>            Reporter: bharath v
>            Assignee: bharath v
>              Labels: newbie, ramp-up
>             Fix For: Impala 2.12.0
>
>
> Currently, the redactor is run on every info string in the run-time profile.
> {noformat}
> void RuntimeProfile::AddInfoStringInternal(
>     const string& key, const string& value, bool append) {
>   // Values may contain sensitive data, such as a query.
>   const string& info = RedactCopy(value);  <-----
>   lock_guard<SpinLock> l(info_strings_lock_);
>   InfoStrings::iterator it = info_strings_.find(key);
>   if (it == info_strings_.end()) {
>     info_strings_.insert(make_pair(key, info));
>     info_strings_display_order_.push_back(key);
>   } else {
>     if (append) {
>       it->second += ", " + value;
>     } else {
>       it->second = info;
>     }
>   }
> }
> {noformat}
> For example, suppose a user configures the following redaction rule, intending to redact every email address that appears in the query string. As a side effect of this bug, the rule also redacts the "User" and "Connected User" fields of the query profile when the username looks like an email address.
> {noformat}
> {
>   "version": 1,
>   "rules": [
>     {
>       "description": "Email addresses",
>       "search": "\\b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\\-\\._]*[A-Za-z0-9])@([A-Za-z0-9\\.]|[A-Za-z\\.][A-Za-z0-9\\-\\.]*[A-Za-z0-9\\.])+\\b",
>       "caseSensitive": true,
>       "replace": "email@redacted.host"
>     }
>   ]
> }
> {noformat}
> {noformat}
> Query (id=e24f32fa563e2c5d:9ddefb2300000000)
>   Summary
>     Session ID: 634deaf67308fdd0:781af1fe76464ca9
>     Session Type: BEESWAX
>     Start Time: 2017-12-13 13:34:31.984911000
>     End Time: 2017-12-13 13:34:37.781489000
>     Query Type: QUERY
>     Query State: FINISHED
>     Query Status: OK
>     Impala Version: impalad version 2.10.0 RELEASE (build 871adff6d6e56b57de33059dec2d7fe38e2366bd)
>     User: email@redacted.host <================ not expected
>     Connected User: email@redacted.host <====== not expected
> {noformat}
> Expected fix: Redact only the sensitive fields. Do not redact anything else in the run-time profiles.
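
The expected behaviour described above could be sketched roughly as follows. This is only an illustration, not the code from the linked commit: the `Profile` class, its `Get` accessor, and the simplified email pattern are hypothetical, and `RedactCopy` here is a stand-in for Impala's real redaction function. The sketch also leaves out locking and the append path. The idea is that callers opt in to redaction for sensitive values (such as the query text), while fields such as "User" are stored verbatim.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical stand-in for Impala's RedactCopy(): replaces anything that
// looks like an email address. The pattern is deliberately simplified and is
// NOT the rule from the issue description.
std::string RedactCopy(const std::string& value) {
  static const std::regex kEmail(R"([A-Za-z0-9._]+@[A-Za-z0-9.]+)");
  return std::regex_replace(value, kEmail, "email@redacted.host");
}

// Hypothetical, single-threaded profile that redacts only on request.
class Profile {
 public:
  // For non-sensitive fields such as "User": the value is stored verbatim.
  void AddInfoString(const std::string& key, const std::string& value) {
    info_strings_[key] = value;
  }

  // For sensitive fields such as the query text: redacted before storing.
  void AddInfoStringRedacted(const std::string& key, const std::string& value) {
    info_strings_[key] = RedactCopy(value);
  }

  // Returns the stored value; throws std::out_of_range if the key is absent.
  const std::string& Get(const std::string& key) const {
    return info_strings_.at(key);
  }

 private:
  std::map<std::string, std::string> info_strings_;
};
```

With this split, a call such as `AddInfoString("User", ...)` would leave an email-style username untouched, while `AddInfoStringRedacted(...)` would still mask email addresses embedded in a query string.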



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)