Posted to dev@atlas.apache.org by GitBox <gi...@apache.org> on 2020/03/16 18:39:48 UTC

[GitHub] [atlas] vladhlinsky commented on issue #94: ATLAS-3665 Add 'queryText' attribute to the 'spark_process' type

vladhlinsky commented on issue #94: ATLAS-3665 Add 'queryText' attribute to the 'spark_process' type
URL: https://github.com/apache/atlas/pull/94#issuecomment-599699206
 
 
   @HeartSaVioR thank you for the review!
   Once #91 is adopted, Spark Atlas Connector will create a `spark_process` for each currently supported `QueryExecution` event except `CreateTableCommand` and `CreateDataSourceTableCommand`. Most of these commands can be mapped to a single query. Since `CreateTableCommand`, `CreateDataSourceTableCommand` and `ExternalCatalogEvent`s do not result in a `spark_process` being created, it would be possible to keep the `recentQueries` attribute as a list that aggregates the corresponding SQL queries until a `spark_process` is created:
   ![Screenshot from 2020-03-16 19-51-09](https://user-images.githubusercontent.com/61428392/76786227-8837e380-67bf-11ea-9e4d-268429a52e5d.png)
   
   However, at this point it seems that keeping only the queries that are mapped to `QueryExecution` events is the more suitable choice, because it makes the `spark_process` easier for the user to read.
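
   To make the difference between the two options concrete, here is a rough sketch, not the connector's actual code. The `Event` and `SparkProcess` classes, the `build_processes` helper and the sample event kinds are all hypothetical and exist only for illustration; only the event names in `NON_PROCESS_EVENTS` and the attribute names come from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List

# Event kinds that, per the discussion, do not produce a spark_process.
NON_PROCESS_EVENTS = {
    "CreateTableCommand",
    "CreateDataSourceTableCommand",
    "ExternalCatalogEvent",
}


@dataclass
class Event:
    """Hypothetical stand-in for an event seen by the connector."""
    kind: str
    sql: str


@dataclass
class SparkProcess:
    """Hypothetical stand-in for a spark_process entity."""
    query_text: str  # single-query option
    recent_queries: List[str] = field(default_factory=list)  # list option


def build_processes(events: List[Event]) -> List[SparkProcess]:
    """Create one process per process-producing event.

    Queries from events that do not produce a process are buffered and
    attached to the next process via recent_queries.
    """
    pending: List[str] = []
    processes: List[SparkProcess] = []
    for event in events:
        if event.kind in NON_PROCESS_EVENTS:
            pending.append(event.sql)
            continue
        processes.append(
            SparkProcess(
                query_text=event.sql,
                recent_queries=pending + [event.sql],
            )
        )
        pending = []  # the buffer is consumed once a process is created
    return processes


if __name__ == "__main__":
    sample = [
        Event("CreateTableCommand", "CREATE TABLE t (id INT)"),
        Event("SomeSupportedCommand", "INSERT INTO t VALUES (1)"),
    ]
    for process in build_processes(sample):
        print(process.query_text, process.recent_queries)
```

   With the list option, the first process also carries the preceding `CREATE TABLE` statement, whereas `query_text` always holds exactly the one query that triggered the process.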
   
   Previously, I referred to the `recentQueries` attribute of the `hive_process` type, but as it turned out, there is no case where its value is a list of multiple queries:
   https://github.com/apache/atlas/blob/master/addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java#L377
   https://github.com/apache/atlas/blob/master/addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/BaseHiveEvent.java#L663
   
   The PR has been updated to add a single `queryText` attribute to the `spark_process` type.
