You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/07/20 15:43:59 UTC

[GitHub] [superset] reesercollins opened a new pull request, #20793: feat(api): Fetch datasets with a given advanced data type

reesercollins opened a new pull request, #20793:
URL: https://github.com/apache/superset/pull/20793

   <!---
   Please write the PR title following the conventions at https://www.conventionalcommits.org/en/v1.0.0/
   Example:
   fix(dashboard): load charts correctly
   -->
   
   ### SUMMARY
   For future drill-to actions, it would be useful to have an api to fetch all datasets with a given advanced data type.
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   <!--- Skip this if not applicable -->
   
   ### TESTING INSTRUCTIONS
   Set an advanced type on a dataset column. Calling the `advanced_data_type/datasets` api with that advanced type should return the dataset id and column id.
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [x] Required feature flags: `ENABLE_ADVANCED_DATA_TYPES: True`
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [x] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Re: [PR] feat(api): Fetch datasets with a given advanced data type [superset]

Posted by "mistercrunch (via GitHub)" <gi...@apache.org>.
mistercrunch closed pull request #20793: feat(api): Fetch datasets with a given advanced data type
URL: https://github.com/apache/superset/pull/20793


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] codecov[bot] commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1190488464

   # [Codecov](https://codecov.io/gh/apache/superset/pull/20793?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#20793](https://codecov.io/gh/apache/superset/pull/20793?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5cec554) into [master](https://codecov.io/gh/apache/superset/commit/115ab700df0f3bf4c8ce0321be7b439c82afc97f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (115ab70) will **decrease** coverage by `11.57%`.
   > The diff coverage is `55.00%`.
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #20793       +/-   ##
   ===========================================
   - Coverage   66.35%   54.78%   -11.58%     
   ===========================================
     Files        1754     1756        +2     
     Lines       66693    66785       +92     
     Branches     7049     7049               
   ===========================================
   - Hits        44257    36586     -7671     
   - Misses      20639    28402     +7763     
     Partials     1797     1797               
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hive | `53.24% <55.00%> (+0.01%)` | :arrow_up: |
   | mysql | `?` | |
   | postgres | `?` | |
   | presto | `53.10% <55.00%> (+<0.01%)` | :arrow_up: |
   | python | `57.80% <55.00%> (-23.89%)` | :arrow_down: |
   | sqlite | `?` | |
   | unit | `50.25% <55.00%> (-0.33%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/20793?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [superset/advanced\_data\_type/api.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvYWR2YW5jZWRfZGF0YV90eXBlL2FwaS5weQ==) | `68.42% <52.63%> (-31.58%)` | :arrow_down: |
   | [superset/advanced\_data\_type/schemas.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvYWR2YW5jZWRfZGF0YV90eXBlL3NjaGVtYXMucHk=) | `100.00% <100.00%> (ø)` | |
   | [superset/utils/dashboard\_import\_export.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvdXRpbHMvZGFzaGJvYXJkX2ltcG9ydF9leHBvcnQucHk=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [superset/key\_value/commands/update.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL3VwZGF0ZS5weQ==) | `0.00% <0.00%> (-88.89%)` | :arrow_down: |
   | [superset/key\_value/commands/delete.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL2RlbGV0ZS5weQ==) | `0.00% <0.00%> (-85.30%)` | :arrow_down: |
   | [superset/key\_value/commands/delete\_expired.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQva2V5X3ZhbHVlL2NvbW1hbmRzL2RlbGV0ZV9leHBpcmVkLnB5) | `0.00% <0.00%> (-80.77%)` | :arrow_down: |
   | [superset/dashboards/commands/importers/v0.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGFzaGJvYXJkcy9jb21tYW5kcy9pbXBvcnRlcnMvdjAucHk=) | `15.62% <0.00%> (-76.25%)` | :arrow_down: |
   | [superset/datasets/commands/update.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGF0YXNldHMvY29tbWFuZHMvdXBkYXRlLnB5) | `25.30% <0.00%> (-68.68%)` | :arrow_down: |
   | [superset/datasets/commands/create.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGF0YXNldHMvY29tbWFuZHMvY3JlYXRlLnB5) | `29.41% <0.00%> (-68.63%)` | :arrow_down: |
   | [superset/datasets/commands/importers/v0.py](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQvZGF0YXNldHMvY29tbWFuZHMvaW1wb3J0ZXJzL3YwLnB5) | `24.03% <0.00%> (-67.45%)` | :arrow_down: |
   | ... and [283 more](https://codecov.io/gh/apache/superset/pull/20793/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] eschutho commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
eschutho commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1217310071

   > The issue is the `/dataset/related/...` endpoint gives no control over what data is returned from it. It is hard-coded to return the primary key of the data model (In this case, the TableColumn) and the name. That means the endpoint is returning the column id, not the dataset id.
   
   I see what you mean. @dpgaspar would it weird for an api class to overwrite `_get_result_from_rows` to make this response more flexible to each model?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] eschutho commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
eschutho commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r946153674


##########
superset/advanced_data_type/api.py:
##########
@@ -146,3 +150,72 @@ def get_types(self) -> Response:
         """
 
         return self.response(200, result=list(ADVANCED_DATA_TYPES.keys()))
+
+    @protect()
+    @safe
+    @expose("/datasets", methods=["GET"])
+    @permission_name("read")
+    @event_logger.log_this_with_context(
+        action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.get",
+        log_to_statsd=False,  # pylint: disable-arguments-renamed
+    )
+    @rison(advanced_data_type_datasets_schema)
+    def get_datasets(self, **kwargs: Any) -> Response:
+        """Get all datasets with a column of the specified advanced type
+        ---
+        get:
+          description:
+            Get all datasets with a column of the specified advanced type
+          parameters:
+          - in: query
+            name: q
+            content:
+              application/json:
+                schema:
+                  $ref: '#/components/schemas/advanced_data_type_datasets_schema'
+          responses:
+            200:
+              description: Query result
+              content:
+                application/json:
+                  schema:
+                    type: object
+                    properties:
+                      result:
+                        type: object
+                        properties:
+                          table_id:
+                            type: array
+                            items:
+                              type: integer
+            401:
+              $ref: '#/components/responses/401'
+            404:
+              $ref: '#/components/responses/404'
+            500:
+              $ref: '#/components/responses/500'
+        """
+        item = kwargs["rison"]
+        advanced_data_type = item["type"]
+        columns = (
+            db.session.query(TableColumn.table_id, TableColumn.id)
+            .filter(TableColumn.advanced_data_type == advanced_data_type)
+            .all()
+        )
+        forbidden_datasets = []
+        result = {}
+        for column in columns:
+            if column[0] in forbidden_datasets:
+                continue
+            if column[0] not in result:
+                if DatasetDAO.find_by_id(column[0]) is None:
+                    forbidden_datasets.append(column[0])

Review Comment:
   I know that we've used this pattern in the past to check permissions by running `find_by_id` but it seems that we could improve performance by running the query one time rather than multiple times. 
   
   Here's an example where we create a query and then apply the base filters to it and just hit the db once: https://github.com/apache/superset/blob/1.4/superset/datasets/api.py#L482
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] hughhhh commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
hughhhh commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r946349601


##########
superset/advanced_data_type/api.py:
##########
@@ -146,3 +150,72 @@ def get_types(self) -> Response:
         """
 
         return self.response(200, result=list(ADVANCED_DATA_TYPES.keys()))
+
+    @protect()
+    @safe
+    @expose("/datasets", methods=["GET"])
+    @permission_name("read")
+    @event_logger.log_this_with_context(
+        action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.get",
+        log_to_statsd=False,  # pylint: disable-arguments-renamed
+    )
+    @rison(advanced_data_type_datasets_schema)
+    def get_datasets(self, **kwargs: Any) -> Response:
+        """Get all datasets with a column of the specified advanced type
+        ---
+        get:
+          description:
+            Get all datasets with a column of the specified advanced type
+          parameters:
+          - in: query
+            name: q
+            content:
+              application/json:
+                schema:
+                  $ref: '#/components/schemas/advanced_data_type_datasets_schema'
+          responses:
+            200:
+              description: Query result
+              content:
+                application/json:
+                  schema:
+                    type: object
+                    properties:
+                      result:
+                        type: object
+                        properties:
+                          table_id:
+                            type: array
+                            items:
+                              type: integer
+            401:
+              $ref: '#/components/responses/401'
+            404:
+              $ref: '#/components/responses/404'
+            500:
+              $ref: '#/components/responses/500'
+        """
+        item = kwargs["rison"]
+        advanced_data_type = item["type"]
+        columns = (
+            db.session.query(TableColumn.table_id, TableColumn.id)

Review Comment:
   couldn't we just do a join here
   ```
   db.session.query(SqlaTable).join(TableColumn, TableColumn.table_id == SqlaTable.id).filter(TableColumn.advanced_data_type == advanced_data_type).distinct()
   ```
    This should return the list of all the `SqlaTable`s with the filter `advanced_data_type`, also we should move this function into the `DatasetDAO.get_advanced_type_datasets`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] eschutho commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
eschutho commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1217061775

   > Hey @eschutho quick question about that. The data we need back is the columns that have the in question advanced data type as well as the dataset Id that they belong to. As far as I was aware from the related endpoint doesn't return the data in this way, but I could be completely wrong. would I be able to retrieve that data in a form mentioned above from `/datasets/related/columns`? Because it would be nice not to add a new endpoint.
   
   Would it work for you for the endpoint to return the entire column data? If so, you can add a filter to the query for the advanced data type and then you'll have the dataset id on the column data. Does that work?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] cccs-RyanS commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
cccs-RyanS commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1216611557

   > @reesercollins I wonder if you would be able to get what you needed from the /datasets/related/columns rest api by adding the advanced data type filter. In theory, that would save you from having to create a new api, and would keep it in the new rest v1 api. Then you can also use the `RelatedFieldFilter` for permissions.
   
   Hey question about that. The data we need back is the columns that have the in question advanced data type  as well as the dataset Id that they belong to. As far as I was aware from the related endpoint doesn't return the data in this way, it would just return the columns but I could be completely wrong. would I be able to retrieve that data in a form mentioned above from `/datasets/related/columns`? Because it would be nice not to add a new endpoint.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] eschutho commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
eschutho commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1227506796

   @reesercollins I spoke with @dpgaspar and I would recommend using the /related endpoint, but overwriting `_get_result_from_rows` to include more information (key/values) from the columns, including the dataset_id that you need. But to keep it backward compatible, I would leave the "value" and "text" keys as is for anyone who may be using this api. But let me know if you run into any complications with this. 🙏 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Re: [PR] feat(api): Fetch datasets with a given advanced data type [superset]

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas closed pull request #20793: feat(api): Fetch datasets with a given advanced data type
URL: https://github.com/apache/superset/pull/20793


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] hughhhh commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
hughhhh commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r951579666


##########
superset/datasets/dao.py:
##########
@@ -367,6 +371,59 @@ def bulk_delete(models: Optional[List[SqlaTable]], commit: bool = True) -> None:
                 db.session.rollback()
             raise ex
 
+    @classmethod
+    def find_with_advanced_data_type(
+        cls, advanced_data_type: str
+    ) -> Dict[str, Dict[str, str]]:
+        query = None
+        if get_main_database().backend == "mysql":
+            query = (
+                db.session.query(
+                    SqlaTable.id,
+                    func.json_arrayagg(TableColumn.id),
+                    func.json_arrayagg(TableColumn.verbose_name),
+                    func.json_arrayagg(TableColumn.column_name),
+                )
+                .join(TableColumn, TableColumn.table_id == SqlaTable.id)
+                .filter(TableColumn.advanced_data_type == advanced_data_type)
+                .group_by(SqlaTable.id)
+            )
+        else:
+            query = (
+                db.session.query(
+                    SqlaTable.id,
+                    func.array_agg(TableColumn.id),
+                    func.array_agg(TableColumn.verbose_name),
+                    func.array_agg(TableColumn.column_name),
+                )
+                .join(TableColumn, TableColumn.table_id == SqlaTable.id)
+                .filter(TableColumn.advanced_data_type == advanced_data_type)
+                .group_by(SqlaTable.id)
+            )
+        query = cls.base_filter(
+            cls.id_column_name, SQLAInterface(cls.model_cls, db.session)
+        ).apply(query, None)
+        datasets = query.all()
+        print(datasets)

Review Comment:
   remove this 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] reesercollins commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
reesercollins commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r946809408


##########
superset/advanced_data_type/api.py:
##########
@@ -146,3 +150,72 @@ def get_types(self) -> Response:
         """
 
         return self.response(200, result=list(ADVANCED_DATA_TYPES.keys()))
+
+    @protect()
+    @safe
+    @expose("/datasets", methods=["GET"])
+    @permission_name("read")
+    @event_logger.log_this_with_context(
+        action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.get",
+        log_to_statsd=False,  # pylint: disable-arguments-renamed
+    )
+    @rison(advanced_data_type_datasets_schema)
+    def get_datasets(self, **kwargs: Any) -> Response:
+        """Get all datasets with a column of the specified advanced type
+        ---
+        get:
+          description:
+            Get all datasets with a column of the specified advanced type
+          parameters:
+          - in: query
+            name: q
+            content:
+              application/json:
+                schema:
+                  $ref: '#/components/schemas/advanced_data_type_datasets_schema'
+          responses:
+            200:
+              description: Query result
+              content:
+                application/json:
+                  schema:
+                    type: object
+                    properties:
+                      result:
+                        type: object
+                        properties:
+                          table_id:
+                            type: array
+                            items:
+                              type: integer
+            401:
+              $ref: '#/components/responses/401'
+            404:
+              $ref: '#/components/responses/404'
+            500:
+              $ref: '#/components/responses/500'
+        """
+        item = kwargs["rison"]
+        advanced_data_type = item["type"]
+        columns = (
+            db.session.query(TableColumn.table_id, TableColumn.id)

Review Comment:
   I am interested in the individual columns within the SqlaTable which have the given advanced_data_type, however, I have modified the query in the most recent commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] hughhhh commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
hughhhh commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r951586521


##########
superset/datasets/dao.py:
##########
@@ -367,6 +371,59 @@ def bulk_delete(models: Optional[List[SqlaTable]], commit: bool = True) -> None:
                 db.session.rollback()
             raise ex
 
+    @classmethod
+    def find_with_advanced_data_type(
+        cls, advanced_data_type: str
+    ) -> Dict[str, Dict[str, str]]:
+        query = None
+        if get_main_database().backend == "mysql":
+            query = (
+                db.session.query(
+                    SqlaTable.id,
+                    func.json_arrayagg(TableColumn.id),
+                    func.json_arrayagg(TableColumn.verbose_name),
+                    func.json_arrayagg(TableColumn.column_name),
+                )
+                .join(TableColumn, TableColumn.table_id == SqlaTable.id)
+                .filter(TableColumn.advanced_data_type == advanced_data_type)
+                .group_by(SqlaTable.id)
+            )
+        else:
+            query = (
+                db.session.query(
+                    SqlaTable.id,
+                    func.array_agg(TableColumn.id),
+                    func.array_agg(TableColumn.verbose_name),
+                    func.array_agg(TableColumn.column_name),
+                )
+                .join(TableColumn, TableColumn.table_id == SqlaTable.id)
+                .filter(TableColumn.advanced_data_type == advanced_data_type)
+                .group_by(SqlaTable.id)
+            )
+        query = cls.base_filter(
+            cls.id_column_name, SQLAInterface(cls.model_cls, db.session)
+        ).apply(query, None)
+        datasets = query.all()
+        print(datasets)
+        result = {}
+        for dataset in datasets:
+            column_id = dataset[1]

Review Comment:
   instead of using idx to reference the values in the object can you annotate with that actually columns values?
   `dataset.column_id`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Re: [PR] feat(api): Fetch datasets with a given advanced data type [superset]

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1930638779

   Seems like the conversation here tapered off... is there any intent to get this across the finish line? I'll close/reopen to jumpstart CI for good measure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] reesercollins commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
reesercollins commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1217094686

   > Would it work for you for the endpoint to return the entire column data? If so, you can add a filter to the query for the advanced data type and then you'll have the dataset id on the column data. Does that work?
   
   The issue is the `/dataset/related/columns` endpoint gives no control over what data is returned from it as far as I am aware. The endpoint is returning the column id, not the dataset id.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] eschutho commented on pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
eschutho commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1215864284

   @reesercollins I wonder if you would be able to get what you needed from the /datasets/related/columns rest api by adding the advanced data type filter. In theory, that would save you from having to create a new api, and would keep it in the new rest v1 api. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] hughhhh commented on a diff in pull request #20793: feat(api): Fetch datasets with a given advanced data type

Posted by GitBox <gi...@apache.org>.
hughhhh commented on code in PR #20793:
URL: https://github.com/apache/superset/pull/20793#discussion_r946349601


##########
superset/advanced_data_type/api.py:
##########
@@ -146,3 +150,72 @@ def get_types(self) -> Response:
         """
 
         return self.response(200, result=list(ADVANCED_DATA_TYPES.keys()))
+
+    @protect()
+    @safe
+    @expose("/datasets", methods=["GET"])
+    @permission_name("read")
+    @event_logger.log_this_with_context(
+        action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.get",
+        log_to_statsd=False,  # pylint: disable-arguments-renamed
+    )
+    @rison(advanced_data_type_datasets_schema)
+    def get_datasets(self, **kwargs: Any) -> Response:
+        """Get all datasets with a column of the specified advanced type
+        ---
+        get:
+          description:
+            Get all datasets with a column of the specified advanced type
+          parameters:
+          - in: query
+            name: q
+            content:
+              application/json:
+                schema:
+                  $ref: '#/components/schemas/advanced_data_type_datasets_schema'
+          responses:
+            200:
+              description: Query result
+              content:
+                application/json:
+                  schema:
+                    type: object
+                    properties:
+                      result:
+                        type: object
+                        properties:
+                          table_id:
+                            type: array
+                            items:
+                              type: integer
+            401:
+              $ref: '#/components/responses/401'
+            404:
+              $ref: '#/components/responses/404'
+            500:
+              $ref: '#/components/responses/500'
+        """
+        item = kwargs["rison"]
+        advanced_data_type = item["type"]
+        columns = (
+            db.session.query(TableColumn.table_id, TableColumn.id)

Review Comment:
   couldn't we just do a join here
   ```
   db.session.query(SqlaTable).join(TableColumn, TableColumn.table_id == SqlaTable.id).filter(TableColumn.advanced_data_type == advanced_data_type).distinct()
   ```
    This should return the list of all the `SqlaTable`s with the filter `advanced_data_type` 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Re: [PR] feat(api): Fetch datasets with a given advanced data type [superset]

Posted by "mistercrunch (via GitHub)" <gi...@apache.org>.
mistercrunch commented on PR #20793:
URL: https://github.com/apache/superset/pull/20793#issuecomment-1984876782

   PR looks busted with 5000+ files. Closing for now, feel free to reopen once the branch is rebased properly


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org