You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@superset.apache.org by vi...@apache.org on 2020/02/22 09:21:58 UTC

[incubator-superset] branch master updated: [SIP-29] Add support for row-level security (#8699)

This is an automated email from the ASF dual-hosted git repository.

villebro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-superset.git


The following commit(s) were added to refs/heads/master by this push:
     new dee16de  [SIP-29] Add support for row-level security (#8699)
dee16de is described below

commit dee16de03ea6ed2a08c4d525c89676f6aa5ff3d4
Author: altef <br...@gmail.com>
AuthorDate: Sat Feb 22 01:21:31 2020 -0800

    [SIP-29] Add support for row-level security (#8699)
    
    * Support and apply filters.
    
    * Added the UI for row level security, and moved it all under SQLA in order to access the Table model more easily.
    
    * Added a row level security filter documentation entry.
    
    * Accidentally added two new lines to this file.
    
    * Blacked and iSorted, hopefully.  Also, sometimes g.user may not be set.
    
    * Another isort, and handling g not having a user attribute another way.
    
    * Let's try this again #CI tests.
    
    * Adjusted import order for isort; I was sure I'd already done this..
    
    * Row level filters should be wrapped in parentheses in case one contains an OR.
    
    * Oops, did not think that would change Black's formatting.
    
    * Changes as per @mistercrunch.
    
    * RLS filters are now many-to-many with Roles.
    
    * Updated documentation to reflect RLS filters supporting multiple rows.
    
    * Let's see what happens when I set it to the previous revision ID
    
    * Updated from upstream.
    
    * There was a pylint error.
    
    * Added RLS ids to the cache keys; modified documentation; added template processing to RLS filters.
    
    * A new migration was merged in.
    
    * Removed RLS cache key from query_object.
    
    * RLS added to the cache_key from query_context.
    
    * Changes as per @etr2460.
    
    * Updating entry for RLS pull request.
    
    * Another migration to skip.
    
    * Changes as per @serenajiang.
    
    * Blacked.
    
    * Blacked and added some attributes to check for.
    
    * Changed to a manual query as per @mistercrunch.
    
    * Blacked.
    
    * Another migration in the meantime.
    
    * Black wanted some whitespace changes.
    
    * AttributeError: 'AnonymousUserMixin' object has no attribute 'id'.
    
    * Oops, did hasattr backwards.
    
    * Changes as per @mistercrunch.
    
    * Doesn't look like text us required here anymore.
    
    * Changes as per @dpgaspar
    
    * Two RLS tests.
    
    * Row level security is now disabled by default via the feature flag ENABLE_ROW_LEVEL_SECURITY.
    
    * New head to revise.
    
    * Changed the comment.
---
 UPDATING.md                                        |  5 ++
 docs/security.rst                                  | 23 +++++++
 superset/app.py                                    | 10 +++
 superset/common/query_context.py                   |  3 +-
 superset/config.py                                 |  9 +++
 superset/connectors/sqla/models.py                 | 45 +++++++++++++-
 superset/connectors/sqla/views.py                  | 38 +++++++++++-
 .../0a6f12f60c73_add_role_level_security.py        | 61 +++++++++++++++++++
 superset/security/manager.py                       | 46 ++++++++++++++
 superset/viz.py                                    |  3 +-
 tests/security_tests.py                            | 71 +++++++++++++++++++++-
 11 files changed, 308 insertions(+), 6 deletions(-)

diff --git a/UPDATING.md b/UPDATING.md
index 49f817e..d524f65 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -42,6 +42,11 @@ timestamp has been added to the query object's cache key to ensure updates to
 datasources are always reflected in associated query results. As a consequence all
 previously cached results will be invalidated when updating to the next version.
 
+* [8699](https://github.com/apache/incubator-superset/pull/8699): A `row_level_security_filters` 
+table has been added, which is many-to-many with `tables` and `ab_roles`.  The applicable filters 
+are added to the sqla query, and the RLS ids are added to the query cache keys. If RLS is enabled in config.py (`ENABLE_ROW_LEVEL_SECURITY = True`; by default, it is disabled), they can be 
+accessed through the `Security` menu, or when editting a table.
+
 * [8732](https://github.com/apache/incubator-superset/pull/8732): Swagger user interface is now enabled by default.
 A new permission `show on SwaggerView` is created by `superset init` and given to the `Admin` Role. To disable the UI,
 set `FAB_API_SWAGGER_UI = False` on config.
diff --git a/docs/security.rst b/docs/security.rst
index 3569999..911aabe 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -153,3 +153,26 @@ a set of data sources that power dashboards only made available to executives.
 When looking at its dashboard list, this user will only see the
 list of dashboards it has access to, based on the roles and
 permissions that were attributed.
+
+
+Restricting access to a subset of a particular table
+""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Using ``Row level security filters`` (under the ``Security`` menu) you can create 
+filters that are assigned to a particular table, as well as a set of roles. 
+Say people in your finance department should only have access to rows where 
+``department = "finance"``.  You could create a ``Row level security filter`` 
+with that clause, and assign it to your ``Finance`` role, as well as the 
+applicable table.
+
+The ``clause`` field can contain arbitrary text which is then added to the generated 
+SQL statement's ``WHERE`` clause.  So you could even do something like create a 
+filter for the last 30 days and apply it to a specific role, with a clause like 
+``date_field > DATE_SUB(NOW(), INTERVAL 30 DAY)``.  It can also support multiple 
+conditions: ``client_id = 6 AND advertiser="foo"``, etc. 
+
+All relevant ``Row level security filters`` will be ANDed together, so it's 
+possible to create a situation where two roles conflict in such a way as to 
+limit a table subset to empty.  For example, the filters ``client_id=4`` and 
+and ``client_id=5``, applied to a role, will result in users of that role having 
+``client_id=4 AND client_id=5`` added to their query, which can never be true.
\ No newline at end of file
diff --git a/superset/app.py b/superset/app.py
index e9d085c..29141c2 100644
--- a/superset/app.py
+++ b/superset/app.py
@@ -134,6 +134,7 @@ class SupersetAppInitializer:
             TableColumnInlineView,
             SqlMetricInlineView,
             TableModelView,
+            RowLevelSecurityFiltersModelView,
         )
         from superset.views.annotations import (
             AnnotationLayerModelView,
@@ -255,6 +256,15 @@ class SupersetAppInitializer:
             category_label=__("Manage"),
             icon="fa-search",
         )
+        if self.config["ENABLE_ROW_LEVEL_SECURITY"]:
+            appbuilder.add_view(
+                RowLevelSecurityFiltersModelView,
+                "Row Level Security Filters",
+                label=__("Row level security filters"),
+                category="Security",
+                category_label=__("Security"),
+                icon="fa-lock",
+            )
 
         #
         # Setup views with no menu
diff --git a/superset/common/query_context.py b/superset/common/query_context.py
index 7c982f0..edfcfaa 100644
--- a/superset/common/query_context.py
+++ b/superset/common/query_context.py
@@ -22,7 +22,7 @@ from typing import Any, ClassVar, Dict, List, Optional
 import numpy as np
 import pandas as pd
 
-from superset import app, cache, db
+from superset import app, cache, db, security_manager
 from superset.connectors.base.models import BaseDatasource
 from superset.connectors.connector_registry import ConnectorRegistry
 from superset.stats_logger import BaseStatsLogger
@@ -163,6 +163,7 @@ class QueryContext:
             query_obj.cache_key(
                 datasource=self.datasource.uid,
                 extra_cache_keys=extra_cache_keys,
+                rls=security_manager.get_rls_ids(self.datasource),
                 changed_on=self.datasource.changed_on,
                 **kwargs
             )
diff --git a/superset/config.py b/superset/config.py
index 45237b1..ef473ed 100644
--- a/superset/config.py
+++ b/superset/config.py
@@ -731,6 +731,15 @@ TALISMAN_CONFIG = {
     "force_https_permanent": False,
 }
 
+# Note that: RowLevelSecurityFilter is only given by default to the Admin role
+# and the Admin Role does have the all_datasources security permission.
+# But, if users create a specific role with access to RowLevelSecurityFilter MVC
+# and a custom datasource access, the table dropdown will not be correctly filtered
+# by that custom datasource access. So we are assuming a default security config,
+# a custom security config could potentially give access to setting filters on
+# tables that users do not have access to.
+ENABLE_ROW_LEVEL_SECURITY = False
+
 #
 # Flask session cookie options
 #
diff --git a/superset/connectors/sqla/models.py b/superset/connectors/sqla/models.py
index b25d09b..9236bc6 100644
--- a/superset/connectors/sqla/models.py
+++ b/superset/connectors/sqla/models.py
@@ -57,7 +57,7 @@ from superset.exceptions import DatabaseNotFound
 from superset.jinja_context import get_template_processor
 from superset.models.annotations import Annotation
 from superset.models.core import Database
-from superset.models.helpers import QueryResult
+from superset.models.helpers import AuditMixinNullable, QueryResult
 from superset.utils import core as utils, import_datasource
 
 config = app.config
@@ -646,6 +646,19 @@ class SqlaTable(Model, BaseDatasource):
 
         return self.make_sqla_column_compatible(sqla_metric, label)
 
+    def _get_sqla_row_level_filters(self, template_processor) -> List[str]:
+        """
+        Return the appropriate row level security filters for this table and the current user.
+
+        :param BaseTemplateProcessor template_processor: The template processor to apply to the filters.
+        :returns: A list of SQL clauses to be ANDed together.
+        :rtype: List[str]
+        """
+        return [
+            text("({})".format(template_processor.process_template(f.clause)))
+            for f in security_manager.get_rls_filters(self)
+        ]
+
     def get_sqla_query(  # sqla
         self,
         groupby,
@@ -827,6 +840,8 @@ class SqlaTable(Model, BaseDatasource):
                         where_clause_and.append(col_obj.get_sqla_col() == None)
                     elif op == "IS NOT NULL":
                         where_clause_and.append(col_obj.get_sqla_col() != None)
+
+        where_clause_and += self._get_sqla_row_level_filters(template_processor)
         if extras:
             where = extras.get("where")
             if where:
@@ -944,7 +959,6 @@ class SqlaTable(Model, BaseDatasource):
                     result.df, dimensions, groupby_exprs_sans_timestamp
                 )
                 qry = qry.where(top_groups)
-
         return SqlaQuery(
             extra_cache_keys=extra_cache_keys,
             labels_expected=labels_expected,
@@ -1188,3 +1202,30 @@ class SqlaTable(Model, BaseDatasource):
 
 sa.event.listen(SqlaTable, "after_insert", security_manager.set_perm)
 sa.event.listen(SqlaTable, "after_update", security_manager.set_perm)
+
+
+RLSFilterRoles = Table(
+    "rls_filter_roles",
+    metadata,
+    Column("id", Integer, primary_key=True),
+    Column("role_id", Integer, ForeignKey("ab_role.id"), nullable=False),
+    Column("rls_filter_id", Integer, ForeignKey("row_level_security_filters.id")),
+)
+
+
+class RowLevelSecurityFilter(Model, AuditMixinNullable):
+    """
+    Custom where clauses attached to Tables and Roles.
+    """
+
+    __tablename__ = "row_level_security_filters"
+    id = Column(Integer, primary_key=True)  # pylint: disable=invalid-name
+    roles = relationship(
+        security_manager.role_model,
+        secondary=RLSFilterRoles,
+        backref="row_level_security_filters",
+    )
+
+    table_id = Column(Integer, ForeignKey("tables.id"), nullable=False)
+    table = relationship(SqlaTable, backref="row_level_security_filters")
+    clause = Column(Text, nullable=False)
diff --git a/superset/connectors/sqla/views.py b/superset/connectors/sqla/views.py
index 390012d..1d7d466 100644
--- a/superset/connectors/sqla/views.py
+++ b/superset/connectors/sqla/views.py
@@ -225,6 +225,38 @@ class SqlMetricInlineView(CompactCRUDMixin, SupersetModelView):
     edit_form_extra_fields = add_form_extra_fields
 
 
+class RowLevelSecurityFiltersModelView(SupersetModelView, DeleteMixin):
+    datamodel = SQLAInterface(models.RowLevelSecurityFilter)
+
+    list_title = _("Row level security filter")
+    show_title = _("Show Row level security filter")
+    add_title = _("Add Row level security filter")
+    edit_title = _("Edit Row level security filter")
+
+    list_columns = ["table.table_name", "roles", "clause", "creator", "modified"]
+    order_columns = ["table.table_name", "clause", "modified"]
+    edit_columns = ["table", "roles", "clause"]
+    show_columns = edit_columns
+    search_columns = ("table", "roles", "clause")
+    add_columns = edit_columns
+    base_order = ("changed_on", "desc")
+    description_columns = {
+        "table": _("This is the table this filter will be applied to."),
+        "roles": _("These are the roles this filter will be applied to."),
+        "clause": _(
+            "This is the condition that will be added to the WHERE clause. "
+            "For example, to only return rows for a particular client, you might put in: client_id = 9"
+        ),
+    }
+    label_columns = {
+        "table": _("Table"),
+        "roles": _("Roles"),
+        "clause": _("Clause"),
+        "creator": _("Creator"),
+        "modified": _("Modified"),
+    }
+
+
 class TableModelView(DatasourceModelView, DeleteMixin, YamlExportMixin):
     datamodel = SQLAInterface(models.SqlaTable)
     include_route_methods = RouteMethod.CRUD_SET
@@ -255,7 +287,11 @@ class TableModelView(DatasourceModelView, DeleteMixin, YamlExportMixin):
     ]
     base_filters = [["id", DatasourceFilter, lambda: []]]
     show_columns = edit_columns + ["perm", "slices"]
-    related_views = [TableColumnInlineView, SqlMetricInlineView]
+    related_views = [
+        TableColumnInlineView,
+        SqlMetricInlineView,
+        RowLevelSecurityFiltersModelView,
+    ]
     base_order = ("changed_on", "desc")
     search_columns = ("database", "schema", "table_name", "owners", "is_sqllab_view")
     description_columns = {
diff --git a/superset/migrations/versions/0a6f12f60c73_add_role_level_security.py b/superset/migrations/versions/0a6f12f60c73_add_role_level_security.py
new file mode 100644
index 0000000..b202b87
--- /dev/null
+++ b/superset/migrations/versions/0a6f12f60c73_add_role_level_security.py
@@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""add_role_level_security
+
+Revision ID: 0a6f12f60c73
+Revises: 3325d4caccc8
+Create Date: 2019-12-04 17:07:54.390805
+
+"""
+
+# revision identifiers, used by Alembic.
+revision = "0a6f12f60c73"
+down_revision = "3325d4caccc8"
+
+import sqlalchemy as sa
+from alembic import op
+
+
+def upgrade():
+    op.create_table(
+        "row_level_security_filters",
+        sa.Column("created_on", sa.DateTime(), nullable=True),
+        sa.Column("changed_on", sa.DateTime(), nullable=True),
+        sa.Column("id", sa.Integer(), nullable=False),
+        sa.Column("table_id", sa.Integer(), nullable=False),
+        sa.Column("clause", sa.Text(), nullable=False),
+        sa.Column("created_by_fk", sa.Integer(), nullable=True),
+        sa.Column("changed_by_fk", sa.Integer(), nullable=True),
+        sa.ForeignKeyConstraint(["changed_by_fk"], ["ab_user.id"]),
+        sa.ForeignKeyConstraint(["created_by_fk"], ["ab_user.id"]),
+        sa.ForeignKeyConstraint(["table_id"], ["tables.id"]),
+        sa.PrimaryKeyConstraint("id"),
+    )
+    op.create_table(
+        "rls_filter_roles",
+        sa.Column("id", sa.Integer(), nullable=False),
+        sa.Column("role_id", sa.Integer(), nullable=False),
+        sa.Column("rls_filter_id", sa.Integer(), nullable=True),
+        sa.ForeignKeyConstraint(["rls_filter_id"], ["row_level_security_filters.id"]),
+        sa.ForeignKeyConstraint(["role_id"], ["ab_role.id"]),
+        sa.PrimaryKeyConstraint("id"),
+    )
+
+
+def downgrade():
+    op.drop_table("rls_filter_roles")
+    op.drop_table("row_level_security_filters")
diff --git a/superset/security/manager.py b/superset/security/manager.py
index f0afae1..7a0bd5d 100644
--- a/superset/security/manager.py
+++ b/superset/security/manager.py
@@ -120,6 +120,7 @@ class SupersetSecurityManager(SecurityManager):
         "RoleModelView",
         "LogModelView",
         "Security",
+        "RowLevelSecurityFiltersModelView",
     } | USER_MODEL_VIEWS
 
     ALPHA_ONLY_VIEW_MENUS = {"Upload a CSV"}
@@ -891,3 +892,48 @@ class SupersetSecurityManager(SecurityManager):
         """
 
         self.assert_datasource_permission(viz.datasource)
+
+    def get_rls_filters(self, table: "BaseDatasource"):
+        """
+        Retrieves the appropriate row level security filters for the current user and the passed table.
+
+        :param table: The table to check against
+        :returns: A list of filters.
+        """
+        if hasattr(g, "user") and hasattr(g.user, "id"):
+            from superset import db
+            from superset.connectors.sqla.models import (
+                RLSFilterRoles,
+                RowLevelSecurityFilter,
+            )
+
+            user_roles = (
+                db.session.query(assoc_user_role.c.role_id)
+                .filter(assoc_user_role.c.user_id == g.user.id)
+                .subquery()
+            )
+            filter_roles = (
+                db.session.query(RLSFilterRoles.c.id)
+                .filter(RLSFilterRoles.c.role_id.in_(user_roles))
+                .subquery()
+            )
+            query = (
+                db.session.query(
+                    RowLevelSecurityFilter.id, RowLevelSecurityFilter.clause
+                )
+                .filter(RowLevelSecurityFilter.table_id == table.id)
+                .filter(RowLevelSecurityFilter.id.in_(filter_roles))
+            )
+            return query.all()
+        return []
+
+    def get_rls_ids(self, table: "BaseDatasource") -> List[int]:
+        """
+        Retrieves the appropriate row level security filters IDs for the current user and the passed table.
+
+        :param table: The table to check against
+        :returns: A list of IDs.
+        """
+        ids = [f.id for f in self.get_rls_filters(table)]
+        ids.sort()  # Combinations rather than permutations
+        return ids
diff --git a/superset/viz.py b/superset/viz.py
index fee3fbd..8391f18 100644
--- a/superset/viz.py
+++ b/superset/viz.py
@@ -45,7 +45,7 @@ from geopy.point import Point
 from markdown import markdown
 from pandas.tseries.frequencies import to_offset
 
-from superset import app, cache, get_css_manifest_files
+from superset import app, cache, get_css_manifest_files, security_manager
 from superset.constants import NULL_STRING
 from superset.exceptions import NullValueException, SpatialException
 from superset.models.helpers import QueryResult
@@ -371,6 +371,7 @@ class BaseViz:
         cache_dict["time_range"] = self.form_data.get("time_range")
         cache_dict["datasource"] = self.datasource.uid
         cache_dict["extra_cache_keys"] = self.datasource.get_extra_cache_keys(query_obj)
+        cache_dict["rls"] = security_manager.get_rls_ids(self.datasource)
         cache_dict["changed_on"] = self.datasource.changed_on
         json_data = self.json_dumps(cache_dict, sort_keys=True)
         return hashlib.md5(json_data.encode("utf-8")).hexdigest()
diff --git a/tests/security_tests.py b/tests/security_tests.py
index 12b1f2e..7b8df26 100644
--- a/tests/security_tests.py
+++ b/tests/security_tests.py
@@ -20,11 +20,12 @@ import unittest
 from unittest.mock import Mock, patch
 
 import prison
+from flask import g
 
 import tests.test_app
 from superset import app, appbuilder, db, security_manager, viz
 from superset.connectors.druid.models import DruidCluster, DruidDatasource
-from superset.connectors.sqla.models import SqlaTable
+from superset.connectors.sqla.models import RowLevelSecurityFilter, SqlaTable
 from superset.exceptions import SupersetSecurityException
 from superset.models.core import Database
 from superset.models.slice import Slice
@@ -815,3 +816,71 @@ class SecurityManagerTests(SupersetTestCase):
 
         with self.assertRaises(SupersetSecurityException):
             security_manager.assert_viz_permission(test_viz)
+
+
+class RowLevelSecurityTests(SupersetTestCase):
+    """
+    Testing Row Level Security
+    """
+
+    rls_entry = None
+
+    def setUp(self):
+        session = db.session
+
+        # Create the RowLevelSecurityFilter
+        self.rls_entry = RowLevelSecurityFilter()
+        self.rls_entry.table = (
+            session.query(SqlaTable).filter_by(table_name="birth_names").first()
+        )
+        self.rls_entry.clause = "gender = 'male'"
+        self.rls_entry.roles.append(
+            security_manager.find_role("Gamma")
+        )  # db.session.query(Role).filter_by(name="Gamma").first())
+        db.session.add(self.rls_entry)
+
+        db.session.commit()
+
+    def tearDown(self):
+        session = db.session
+        session.delete(self.rls_entry)
+        session.commit()
+
+    # Do another test to make sure it doesn't alter another query
+    def test_rls_filter_alters_query(self):
+        g.user = self.get_user(
+            username="gamma"
+        )  # self.login() doesn't actually set the user
+        tbl = self.get_table_by_name("birth_names")
+        query_obj = dict(
+            groupby=[],
+            metrics=[],
+            filter=[],
+            is_timeseries=False,
+            columns=["name"],
+            granularity=None,
+            from_dttm=None,
+            to_dttm=None,
+            extras={},
+        )
+        sql = tbl.get_query_str(query_obj)
+        self.assertIn("gender = 'male'", sql)
+
+    def test_rls_filter_doesnt_alter_query(self):
+        g.user = self.get_user(
+            username="admin"
+        )  # self.login() doesn't actually set the user
+        tbl = self.get_table_by_name("birth_names")
+        query_obj = dict(
+            groupby=[],
+            metrics=[],
+            filter=[],
+            is_timeseries=False,
+            columns=["name"],
+            granularity=None,
+            from_dttm=None,
+            to_dttm=None,
+            extras={},
+        )
+        sql = tbl.get_query_str(query_obj)
+        self.assertNotIn("gender = 'male'", sql)