Posted to issues@impala.apache.org by "bharath v (JIRA)" <ji...@apache.org> on 2017/12/20 01:47:00 UTC

[jira] [Created] (IMPALA-6348) Redact only the query string in runtime profile

bharath v created IMPALA-6348:
---------------------------------

             Summary: Redact only the query string in runtime profile
                 Key: IMPALA-6348
                 URL: https://issues.apache.org/jira/browse/IMPALA-6348
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 2.10.0, Impala 2.8.0, Impala 2.7.0, Impala 2.11.0
            Reporter: bharath v


Currently, the redactor is run on every info string in the run-time profile.

{noformat}
void RuntimeProfile::AddInfoStringInternal(
    const string& key, const string& value, bool append) {
  // Values may contain sensitive data, such as a query.
  const string& info = RedactCopy(value);  // <----- redaction is applied to every value
  lock_guard<SpinLock> l(info_strings_lock_);
  InfoStrings::iterator it = info_strings_.find(key);
  if (it == info_strings_.end()) {
    info_strings_.insert(make_pair(key, info));
    info_strings_display_order_.push_back(key);
  } else {
    if (append) {
      it->second += ", " + value;
    } else {
      it->second = info;
    }
  }
}
{noformat}

For example, suppose a user configures the following redaction rule with the intention of redacting all email addresses in the query string. As a side effect of this bug, the rule also redacts the "User" and "Connected User" fields of the query profile.

{noformat}
{
  "version": 1,
  "rules": [
    {
      "description": "Email addresses",
      "search": "\\b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\\-\\._]*[A-Za-z0-9])@([A-Za-z0-9\\.]|[A-Za-z\\.][A-Za-z0-9\\-\\.]*[A-Za-z0-9\\.])+\\b",
      "caseSensitive": true,
      "replace": "email@redacted.host"
    }
  ]
}
{noformat}

{noformat}
Query (id=e24f32fa563e2c5d:9ddefb2300000000)
  Summary
    Session ID: 634deaf67308fdd0:781af1fe76464ca9
    Session Type: BEESWAX
    Start Time: 2017-12-13 13:34:31.984911000
    End Time: 2017-12-13 13:34:37.781489000
    Query Type: QUERY
    Query State: FINISHED
    Query Status: OK
    Impala Version: impalad version 2.10.0 RELEASE (build 871adff6d6e56b57de33059dec2d7fe38e2366bd)
    User: email@redacted.host <================ not expected
    Connected User: email@redacted.host <====== not expected
{noformat}
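
The over-redaction can be reproduced outside Impala. The following is a minimal sketch, assuming a simplified email pattern in place of the rule above and std::regex in place of Impala's redactor. RedactRule, MakeEmailRule and ApplyRule are hypothetical names used only for illustration. Because a rule sees only the text and not the profile field it came from, an email-shaped user name is rewritten exactly like an email address inside a query.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for one entry of the "rules" array.
struct RedactRule {
  std::regex search;
  std::string replace;
};

// Simplified email pattern (an assumption, not the exact rule from the report).
inline RedactRule MakeEmailRule() {
  return {std::regex(R"([A-Za-z0-9._-]+@[A-Za-z0-9.-]+)"), "email@redacted.host"};
}

// Applies the rule to arbitrary text, with no knowledge of which
// profile field the text belongs to.
inline std::string ApplyRule(const RedactRule& rule, const std::string& text) {
  return std::regex_replace(text, rule.search, rule.replace);
}
```

Calling ApplyRule on a value such as "alice@example.com" gives the same replacement whether that value was a user name or part of a SQL statement, which matches the behaviour shown in the profile above.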

Expected fix: Redact only the query string. Do not redact anything else in the run-time profile.
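
One possible shape for such a fix, sketched under stated assumptions: redaction becomes opt-in per call, so only the caller that stores the query text asks for it. ProfileInfo, AddInfoStringRedacted, Get and RedactEmails are hypothetical names in a simplified model, not Impala's RuntimeProfile, and the "Sql Statement" key used in the usage note is likewise illustrative. The email pattern is a simplified stand-in for the configured rules.

```cpp
#include <functional>
#include <map>
#include <regex>
#include <string>
#include <utility>

// Hypothetical, simplified model of a profile's info strings.
class ProfileInfo {
 public:
  using Redactor = std::function<std::string(const std::string&)>;

  explicit ProfileInfo(Redactor redactor) : redactor_(std::move(redactor)) {}

  // Stores the value verbatim; used for fields such as "User".
  void AddInfoString(const std::string& key, const std::string& value) {
    info_strings_[key] = value;
  }

  // Redacts before storing; intended only for the query text.
  void AddInfoStringRedacted(const std::string& key, const std::string& value) {
    info_strings_[key] = redactor_(value);
  }

  // Returns the stored value; throws std::out_of_range for an unknown key.
  const std::string& Get(const std::string& key) const {
    return info_strings_.at(key);
  }

 private:
  Redactor redactor_;
  std::map<std::string, std::string> info_strings_;
};

// Simplified email redactor (an assumption standing in for the configured rules).
inline std::string RedactEmails(const std::string& text) {
  static const std::regex kEmail(R"([A-Za-z0-9._-]+@[A-Za-z0-9.-]+)");
  return std::regex_replace(text, kEmail, "email@redacted.host");
}
```

With this split, a caller would store "User" through AddInfoString and the query text (for example under a "Sql Statement" key) through AddInfoStringRedacted, so an email-shaped user name is left intact while emails in the query are still replaced.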



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)