You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/02/03 21:15:27 UTC

[GitHub] [incubator-superset] willbarrett opened a new issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

willbarrett opened a new issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077
 
 
   ## [SIP] Proposal for Improving Superset’s Python Code Organization
   
   ### Motivation
   
   As I was in the weeds fixing a bunch of Pylint failures, Max and I started going back and forth on this PR: https://github.com/apache/incubator-superset/pull/8777, which we ultimately closed. The root cause of that was a lack of shared understanding on the best code structure for Superset going forward. Talking with others at Preset, I realized that the issue was larger than just a new contributor not understanding project practices. Without a shared understanding, we are lacking a cohesive approach towards refactoring Superset - we need a technical North Star for project structure.
   
   Preset’s developers met and identified a number of pain points in the existing code base. Here they are, with a bit of color to make the meaning clear:
   
   * Lack of separation of concerns
       * There are a number of modules that have grown organically over time to truly massive size. Often these modules contain utility functions, classes of multiple types, and a broad array of other functionality. These modules are very hard to work with.
       * Business logic is frequently intermixed with display logic, making code reuse more difficult.
       * The model layer contains a large amount of business logic which has nothing to do with data storage.
   * Dependency Confusion
       * Circular dependency issues crop up regularly with the existing module design
       * Many modules have implicit dependencies on other modules which requires a specific load order
    * Lack of Composability
       * There’s a substantial amount of app-specific code in shared modules
       * Views are currently loaded in all contexts, which makes light-weight command line usage (for example) impossible
   * Deep Class Inheritance
       * In some instances, particularly where there is pre-existing inheritance in Flask-AppBuilder, the inheritance chain can be over 10 classes deep. This can make understanding some classes very challenging.
   
   
   ### Proposed Change
   
   In order to address these concerns, we’d like to propose a few guiding principles for Python code structure going forward:
   
   * Prefer small, functionally-homogeneous modules with only classes and pure functions in module scope
   * Separate business logic from display logic
       * Keep the view layer lightweight and delegate to business logic objects
       * To support this work, we’d like to recommend a Command pattern - more on that below
   * Prefer thin ORM objects
       * To support this work, we’d like to recommend introducing Data Access Objects or DAOs to Superset to hold complicated query logic. More on that below.
   *  Keep `__init__.py` files empty
       * This will help cut down on implicit dependencies
   * Prefer composition over inheritance
       * Collaborator objects can be simpler to understand than a deeply-nested class hierarchy.
   * Organize modules function and back-end objects rather than products
       * Prefer:
           * Dashboard
           * Chart
           * Query
           * Database
           * Models
           * api
       * Avoid:
           * SQL Lab
           * Explore
   
   #### New patterns to introduce:
   
   ##### Command:
   
   There are multiple patterns named Command, and the one we reference here is most similar to an Alembic migration. Commands perform a single cohesive business operation, have a very small public API, and either entirely succeed or entirely fail. In practice, this requires that database transaction behavior be controlled at the Command level. When possible, commands should be modeled such that they perform a single database transaction. In the case of failure, they raise an exception or return an error code, and it is the responsibility of the caller to translate that into a correct response for a user.
   
   Example command:
   ```python
   # This is obviously not a full finished command. The point
   # is to illustrate the structure of a command as
   # encapsulating a business operation, and illustrate
   # a likely public API.
   
   import DashboardDAO from daos.dashboard
   
   class MissingDashboardError(CommandError):
     pass
   
   class UpdateFailedError(CommandError):
     pass
   
   class InvalidDashboardError(CommandError):
     pass
   
   class UpdateDashboardCommand < Command:
     def __init__(current_user, dashboard_id, dashboard_properties):
       self.actor = current_user
       self.dashboard_id = dashboard_id
       self.dashboard_properties = dashboard_properties
       
     def run():
       self.dashboard = DashboardDAO.find_dashboard(actor, dashboard_id)
       if self.dashboard IS NONE:
         raise MissingDashboardError("Dashboard could not be found")
         
       if !self._is_dashboard_valid():
         raise InvalidDashboardError("Dashboard parameters are invalid.")
       
       update_result = DashboardDAO.update(dashboard, dashboard_properties)
       
       if update_result != True:
         raise UpdateFailedError("Dashboard could not be updated.")
         
     
     def _is_dashboard_valid():
       # Check validity of dashboard data before save
   ```
   
   ##### DAO (Data Access Object):
   
   A DAO in the context of Superset would be an object that would manage the lifecycle of SQLAlchemy models. Custom queries would be implemented in DAOs and then called from Command objects. In comparison to Command objects, DAOs have relatively broad public interfaces. DAOs should not depend on other DAOs to avoid circular dependencies. If results are needed cross-DAO, that should be orchestrated in the Command. Here’s a sample simplified DAO for illustrative purposes:
   
   ```python
   # This is not a fully finished DAO. The intention
   # is to illustrate the public interface of a 
   # Data Access Object, and how it might interface
   # with a SQLAlchemy model.
   
   from superset import db
   import DataAccessObject from daos.base
   import Dashboard from models.dashboard
   
   class DashboardDAO(DataAccessObject):
     
     @staticmethod
     def find_for_update(actor, dashboard_id):
       dashboard = db.session.query(Dashboard).filter(id=dashboard_id).one_or_none()
       if dashboard
         if actor in dashboard.owners:
           return dashboard
         else
           return None
       else
         return None
     
     @staticmethod
     def update(dashboard, dashboard_params):
       for key, value in dashboard_params.iteritems():
         setattr(dashboard, key, value)
       try:
         db.session.merge(dashboard)
         db.session.commit()
       except SQLAlchemyError as e:
         logger.error('Failed to update dashboard: ?', e.message)
         db.session.rollback()
         return False
         
       return True
   ```
   
   #### Proposed example package structure that follows the above principles:
   
   * superset/dashboards/api.py
       * Contains FAB and Flask endpoints related to the public API
   * superset/dashboards/schemas.py
       * Contains marshmallow schemas used by both the api and view endpoints
   * superset/dashboards/boot.py
       * Contains initialization and configuration code for API endpoints
       * Invoked by `create_app()`
   * superset/dashboards/views.py
        * Contains FAB and Flask endpoints related to page-refresh views.
   * superset/dashboards/dao.py
   * superset/dashboards/commands/add_user.py
   * superset/dashboards/commands/create.py
   * superset/dashboards/helpers.py
   * superset/dashboards/models/dashboard.py
       * Models are thin SQLAlchemy objects - they do not implement custom queries or filters. That functionality will be found in the DAO, which manipulates the underlying SQLAlchemy model.
   * superset/charts/api.py
   * superset/charts/daos.py
   * superset/charts/boot.py
   * superset/charts/schemas.py
   * superset/charts/commands/add_to_dashboard.py
   * superset/charts/commands/create.py
   * superset/charts/helpers.py
   * superset/charts/models/chart.py
   
   In this design, all systems related to a specific back-end resource have been grouped under a top-level folder. `__init__.py` files should be left empty to enable only pulling in the portions of the system necessary for a specific entrypoint (Celery shouldn’t need `api.py` or `views.py` for instance)
   
   ### New or Changed Public Interfaces
   
   Over time, the internals of Superset will evolve towards the new structure. Public HTTP interfaces will not be likely to change as a result of the above proposal, but code will move and alter to conform. This will impact organizations that apply their own customizations to Superset.
   
   ### New dependencies
   
   None
   
   ### Migration Plan and Compatibility
   
   Introduce refactors to existing code at a manageable pace to allow organizations relying on Superset internals time to adapt.
   
   ### Rejected Alternatives
   
   Preset discussed Service Objects as an alternative to both Commands and DAOs. We felt that Commands provided easier entrypoints for the ports of our application (API endpoints, views, command line invocations, Celery tasks) than Service Objects, and that introducing DAOs as well helped further break down concerns.
   
   We also considered structuring top-level folders by function (api, models, etc.) but found this resulted in drastically more Python modules overall without substantially simplifying the question of where code should live.
   
   
   ### Individuals consulted in creating this SIP
   @mistercrunch @craig-rueda @dpgaspar @robdiciuccio @suddjian @nytai 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] metaperl edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
metaperl edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586375971
 
 
   > The root cause of that was a lack of shared understanding on the best code structure for Superset going forward
   
   1. What software architecture will you choose?
   1. What is your opinion of [the Clean architecture](https://pusher.com/tutorials/clean-architecture-introduction)
   
   > There are a number of modules that have grown organically over time to truly massive size. Often these modules contain utility functions, classes of multiple types, and a broad array of other functionality. These modules are very hard to work with.
   
   1. How do you plan to analyze this problem? Will you create a single mother ticket and a number of child tickets for each issue?
   1. How will you refactor all the tests in the current code base?
   
   > Business logic is frequently intermixed with display logic, making code reuse more difficult.
   
   1. Do you have specific examples of this problem?
   1. Isn't there a business reason for everything you display? how easy is it to separate the two?
   1. do you have a code reviewer? if so, who let these problems get into the codebase in the first place?
   
   > Circular dependency issues crop up regularly with the existing module design
   
   Would [Inversion of control or dependency injection](https://github.com/metaperl/python-oop/wiki/Dependency-Injection) help with this? If not, what will help with it?
   
   > Many modules have implicit dependencies on other modules which requires a specific load order
   
   This does not seem like much of a problem to me. Except that PyCharm (and other tools) run a module-reorder as part of their automatic code cleanup and this might prohibit something from loading.
   
   > There’s a substantial amount of app-specific code in shared modules
   
   I do not understand the problem: Are you saying there is **code duplication**? If the purpose of a module is to do something for the app, then it stands to reason it would have app-specific code, wouldnt it? I'm confused.
   
   > In some instances, particularly where there is pre-existing inheritance in Flask-AppBuilder, the inheritance chain can be over 10 classes deep. This can make understanding some classes very challenging.
   
   It can be challenging, but IMHO the code is well-written. I had to trace throgh 7 levels of OO inheritance here: https://github.com/apache/incubator-superset/issues/8695#issuecomment-559968158
   
   I would be curious to see how you would improve things and whether the problems is rooted in Flask itself and FAB and Superset have no choice but to bow to their Creator.
   
   > Organize modules function and back-end objects rather than products
   
   this phrase does not express a complete thought. I think there is some sort of grammatical error here. I have no idea what you are suggesting.
   
   > Avoid: SQL Lab, Explore
   
   One of these is a noun/product just like the things you prefer. The other is a verb.... What is unacceptable about SQL Lab?
   
   > class DashboardDAO(DataAccessObject):  def find_for_update(actor, dashboard_id): 
   
   Why is this preferred over a find_for_update in the class that `actor` belongs to? ditto for 
   `class DashboardDAO(DataAccessObject):  def update(dashboard, dashboard_params):` - this looks like a method that belongs to the Dashboard class.
   
   
   
   
   > 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] suddjian commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
suddjian commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586478075
 
 
   @metaperl 
   Reviewers should help ensure that code is organized, and I think that's the idea you were going for with that bullet point. This will be especially true once the Superset community arrives at a consistent plan for code organization. Reviewers can point to the plan and request that code fit the established patterns.
   
   But wording that idea as "who let these problems get into the codebase in the first place" reads like pointing fingers at the folks who wrote/reviewed the existing code, which is why I responded to it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   My python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities and addition of a "modules" folder especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       """Dynamically load modules in the `modules/` folder.
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module blueprint and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   
   def register_views(app, *view_modules):
       """Register views, including API endpoints"""
       for v in view_modules:
           if hasattr(v, 'bp'):
               app.register_blueprint(v.bp)
           if hasattr(v, 'setup'):
               v.setup(app)
   ```
   
   This is similar to what the `boot.py` file may achieve.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-587924995
 
 
   @john-bodley I didn't address your comment directly regarding blueprints. Let me do so now:
   
   I concur 100% with the idea of untangling the `app` references from the rest of the code. The problem we currently face with regards to blueprints is that:
   
   1. Blueprints cannot currently be nested (see [this issue on Flask](https://github.com/pallets/flask/issues/593))
   2. Flask-AppBuilder (FAB) currently leverages blueprints internally.
   
   Were we to move entirely to blueprints being defined in Superset, that would have substantial implications for the extent that we can leverage FAB in endpoints without either reworking FAB internals or adding the ability to nest blueprints to Flask. As we've been building out `/api/v1`, we've leveraged Flask's base classes, especially for GET endpoints.
   
   How would you recommend reconciling this problem?
   
   I think that @dpgaspar's suggestion for adding a `register_views` function to the `manager.py` files is a possible resolution - this would provide an API similar to blueprint functionality that could be leveraged in `app.py`. I'd be interested in your opinion on how close this would get us to resolving this issues you're bringing up.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
willbarrett edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-587924995
 
 
   @john-bodley I didn't address your comment directly regarding blueprints. Let me do so now:
   
   I concur 100% with the idea of untangling the `app` references from the rest of the code. The problem we currently face with regards to blueprints is that:
   
   1. Blueprints cannot currently be nested (see [this issue on Flask](https://github.com/pallets/flask/issues/593))
   2. Flask-AppBuilder (FAB) currently leverages blueprints internally.
   
   Were we to move entirely to blueprints being defined in Superset, that would have substantial implications for the extent that we can leverage FAB in endpoints without either reworking FAB internals or adding the ability to nest blueprints to Flask. As we've been building out `/api/v1`, we've leveraged FAB's base classes, especially for GET endpoints.
   
   How would you recommend reconciling this problem?
   
   I think that @dpgaspar's suggestion for adding a `register_views` function to the `manager.py` files is a possible resolution - this would provide an API similar to blueprint functionality that could be leveraged in `app.py`. I'd be interested in your opinion on how close this would get us to resolving this issues you're bringing up.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   All my python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). It seems we can avoid `boot.py` this way.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] dpgaspar commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
dpgaspar commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586899389
 
 
   Adding my two cents here:
   
   Regarding blueprints: I'm inclined on delegating the actual registering to each module manager. We can still say that they occur within `app.py` but the specifics are encapsulated on the module manager itself.
   
   This way:
   - We would reduce the centralized complexity of `app.py`
   - Handle module specifics at the module level
   
   For example:
   ```
   - superset/
       - datasets/
           - manager.py (implements `register_views`)
           - commands/
             - CRUD (create.py, update.py, delete.py ...)
             - Custom (export.py, ...)
           - api.py
           - schemas.py
           - dao.py
       - dashboards/
         ...
       - charts/
       - databases/
       - commands/
          - base.py
            ....
       - api/
           - base.py
   ....
   ```
   
   So `app.py` would call each module's `manager.register_views()` with additional pre and post hooks. This kind of pattern can open up the app for external extensions, since additional modules can be independently installed (pip) and loaded in order using config, for example:
   
   ```
   ADDONS = [
       middleware1.Middleware1Manager,
       middleware2.Middleware2Manager,
       ...
   ]
   ```
   ```
   Just a thought, could db_engine_specs follow a similar pattern?
   
   The blueprints are handled by FAB's `BaseApi` or it's child `ModelRestApi` that creates a blueprint with all routes registered under the same API resource namespace, `api/v1/dataset/*`, `api/v1/databases` and OpenAPI spec tag/grouping. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
villebro commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-589260305
 
 
   I welcome this effort, and the amount of circular imports that typing has exposed really shows that something needs to be done. To get started, I would almost recommend grepping for packages with `TYPE_CHECKING` to see the most common circular dependencies, as those are most likely the most spaghettied pieces of code that need to be untangled. If it is possible to start by refactoring those into their new respective locations, we would be well positioned to continue refactoring the rest of the code as proposed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584954166
 
 
   🗑🏷 security

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584953733
 
 
   🏷 security

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] metaperl commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
metaperl commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586458890
 
 
   > Blaming committers for these issues is not helpful or appropriate.
   
   @suddjian - It certainly is not helpful or appropriate. And if you could explain why you included that comment, I would appreciate it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-588367848
 
 
   @willbarrett maybe there's a disconnect with my understanding of FAB. Does FAB prevent us from defining all the API endpoints as blueprints and housing these under a root `/api` directory?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584953733
 
 
   🏷 security

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584953964
 
 
   🗑🏷 security
   showing someone how to use the bot labels ...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586349711
 
 
   I agree with many issues/concerns raised in this SIP though I’m not overly familiar with commands and DAOs. Are there examples of other apps (preferably Flask) which use these constructs?
   
   Note at Airbnb we have a number of Flask apps which follow best practices, i.e., are modular, have empty `__init__.py` files, use Flask blueprints, etc. which are fairly easy to understand and structured in a way that result in no circular dependencies.
   
   I’m all for refactoring the Python codebase (leaning more on blueprints), though I’m unsure whether need to restructure the logic at this time to leverage commands and DAOs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch removed a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584953964
 
 
   🗑🏷 security
   showing someone how to use the bot labels ...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
john-bodley commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586609996
 
 
   I really this we need to untangle the `app` from the various components. This is the partially to blame for the spaghetti code, circular dependency issues we’re currently facing.
   
   All these modules, components, etc. need to be app agnostic and should be implemented as blueprints. If there are config or similar aspects these modules need leverage one can use the [`current_app`](https://flask.palletsprojects.com/en/1.1.x/api/#flask.current_app) proxy. The loading and registering (binding) of blueprints should occur within `app.py`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
john-bodley edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586349711
 
 
   I agree with many issues/concerns raised in this SIP though I’m not overly familiar with commands and DAOs. Are there examples of other apps (preferably Flask) which use these constructs?
   
   Note at Airbnb we have a number of Flask apps which follow best practices, i.e., are modular, have empty `__init__.py` files, use Flask blueprints, etc. which are fairly easy to understand and structured in a way that result in no circular dependencies.
   
   I’m all for refactoring the Python codebase (leaning more on blueprints), though I’m unsure whether need to restructure the logic at this time to leverage commands and DAOs. A full restructure (as opposed to a refactor) seems like a considerable amount of work and may be somewhat overkill to address the concerns raised in this SIP. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   All my python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities (and the addition of a "modules" folder) especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). It seems we can avoid `boot.py` this way.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584776316
 
 
   @craig-rueda I've centralized the models in the recommendation, but have left the DAOs distributed - I don't think the DAOs will be too heavily reused outside of their domain. If the future proves me wrong, it can be refactored.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584953964
 
 
   🗑 🏷 security
   showing someone how to use the bot labels ...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
mistercrunch commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-584954166
 
 
   🗑🏷 security

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586376328
 
 
   The empty `__init__.py` files alone would provide enormous value IMO. If this SIP as a whole is approved I would suggested implementing breaking it into incremental, independently-valuable chunks. (whether it should be a single large SIP or not is a different question, perhaps... is the monolithic SIP coupling controversial suggestions with uncontroversial ones?)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] dpgaspar commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
dpgaspar commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-588408594
 
 
   @willbarrett that's a good explanation.
   
   @john-bodley 
   The proposed FAB model is that a blueprint is related with a `BaseApi` class and the blueprint's `url_prefix` is defined on the `route_base` class attribute. By default `BaseApi` and `ModelRestApi` will compose `/api/<version>/<resource_name>/` for the url_prefix of the blueprint.
   It's possible to overlay `url_prefix` has long has the endpoints don't collide.
   
   Yet, I'm curious about the need to create a big blueprint that holds all API endpoints? Note that we aggregate all API classes under a base API class, where we can impose specific behavior/config.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   All my python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities (and the addition of a "modules" folder) especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module bp and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   ```
   
   A `boot.py` file is not necessary for this to work.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] craig-rueda commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
craig-rueda commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-581624363
 
 
   Looks good overall, however I would caution grouping things like models and/or daos under their domain as you have done above in favor of simply grouping them by function -- i.e. a package for `daos`, and one for `models` under a parent `data` package. Each child "data" package can, and should be further split based on domain. These lower-level bits invariably end up being reused and should be a little more centrally located.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] dpgaspar edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
dpgaspar edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586899389
 
 
   Adding my two cents here:
   
   Regarding blueprints: I'm inclined on delegating the actual registering to each module manager. We can still say that they occur within `app.py` but the specifics are encapsulated on the module manager itself.
   
   This way:
   - We would reduce the centralized complexity of `app.py`
   - Handle module specifics at the module level
   
   For example:
   ```
   - superset/
       - datasets/
           - manager.py (implements `register_views`)
           - commands/
             - <CRUD> (create.py, update.py, delete.py ...)
             - <Custom> (export.py, ...)
           - api.py
           - schemas.py
           - dao.py
       - dashboards/
         ...
       - charts/
       - databases/
       - commands/
          - base.py
            ....
       - api/
           - base.py
   ....
   ```
   
   So `app.py` would call each module's `manager.register_views()` with additional pre and post hooks. This kind of pattern can open up the app for external extensions, since additional modules can be independently installed (pip) and loaded in order using config, for example:
   
   ```
   ADDONS = [
       middleware1.Middleware1Manager,
       middleware2.Middleware2Manager,
       ...
   ]
   ```
   
   Just a thought, could db_engine_specs follow a similar pattern?
   
   The blueprints are handled by FAB's `BaseApi` or it's child `ModelRestApi` that creates a blueprint with all routes registered under the same API resource namespace, `api/v1/dataset/*`, `api/v1/databases` and OpenAPI spec tag/grouping. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] dpgaspar edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
dpgaspar edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586899389
 
 
   Adding my two cents here:
   
   Regarding blueprints: I'm inclined on delegating the actual registering to each module manager. We can still say that they occur within `app.py` but the specifics are encapsulated on the module manager itself.
   
   This way:
   - We would reduce the centralized complexity of `app.py`
   - Handle module specifics at the module level
   
   For example:
   ```
   - superset/
       - datasets/
           - manager.py (implements `register_views`)
           - commands/
             - CRUD (create.py, update.py, delete.py ...)
             - Custom (export.py, ...)
           - api.py
           - schemas.py
           - dao.py
       - dashboards/
         ...
       - charts/
       - databases/
       - commands/
          - base.py
            ....
       - api/
           - base.py
   ....
   ```
   
   So `app.py` would call each module's `manager.register_views()` with additional pre and post hooks. This kind of pattern can open up the app for external extensions, since additional modules can be independently installed (pip) and loaded in order using config, for example:
   
   ```
   ADDONS = [
       middleware1.Middleware1Manager,
       middleware2.Middleware2Manager,
       ...
   ]
   ```
   
   Just a thought, could db_engine_specs follow a similar pattern?
   
   The blueprints are handled by FAB's `BaseApi` or it's child `ModelRestApi` that creates a blueprint with all routes registered under the same API resource namespace, `api/v1/dataset/*`, `api/v1/databases` and OpenAPI spec tag/grouping. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   My python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities (and the addition of a "modules" folder) especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       """Dynamically load modules in the `modules/` folder.
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module blueprint and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   
   def register_views(app, *view_modules):
       """Register views, including API endpoints"""
       for v in view_modules:
           if hasattr(v, 'bp'):
               app.register_blueprint(v.bp)
           if hasattr(v, 'setup'):
               v.setup(app)
   ```
   
   It looks like the `boot.py` file will achieve the same goal.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] metaperl commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
metaperl commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586375971
 
 
   > The root cause of that was a lack of shared understanding on the best code structure for Superset going forward
   
   1. What software architecture will you choose?
   1. What is your opinion of [the Clean architecture](https://pusher.com/tutorials/clean-architecture-introduction)
   
   > There are a number of modules that have grown organically over time to truly massive size. Often these modules contain utility functions, classes of multiple types, and a broad array of other functionality. These modules are very hard to work with.
   
   1. How do you plan to analyze this problem? Will you create a single mother ticket and a number of child tickets for each issue?
   1. How will you refactor all the tests in the current code base?
   
   > Business logic is frequently intermixed with display logic, making code reuse more difficult.
   
   1. Do you have specific examples of this problem?
   1. Isn't there a business reason for everything you display? how easy is it to separate the two?
   1. do you have a code reviewer? if so, who let these problems get into the codebase in the first place?
   
   > Circular dependency issues crop up regularly with the existing module design
   
   Would [Inversion of control or dependency injection](https://github.com/metaperl/python-oop/wiki/Dependency-Injection) help with this? If not, what will help with it?
   
   > Many modules have implicit dependencies on other modules which requires a specific load order
   
   This does not seem like much of a problem to me. Except that PyCharm (and other tools) run a module-reorder as part of their automatic code cleanup and this might prohibit something from loading.
   
   > There’s a substantial amount of app-specific code in shared modules
   
   I do not understand the problem: Are you saying there is **code duplication**? If the purpose of a module is to do something for the app, then it stands to reason it would have app-specific code, wouldnt it? I'm confused.
   
   > In some instances, particularly where there is pre-existing inheritance in Flask-AppBuilder, the inheritance chain can be over 10 classes deep. This can make understanding some classes very challenging.
   
   It can be challenging, but IMHO the code is well-written. I had to trace throgh 7 levels of OO inheritance here:
   https://github.com/apache/incubator-superset/issues/8695#issuecomment-559968158
   
   I would be curious to see how you would improve things and whether the problems is rooted in Flask itself and FAB and Superhave no choice but to bow to their Creator.
   
   > Organize modules function and back-end objects rather than products
   
   this phrase does not express a complete thought. I think there is some sort of grammatical error here. I have no idea what you are suggesting.
   
   > Avoid: SQL Lab, Explore
   
   One of these is a noun/product just like the things you prefer. The other is a verb.... What is unacceptable about SQL Lab?
   
   > class DashboardDAO(DataAccessObject):  def find_for_update(actor, dashboard_id): 
   
   Why is this preferred over a find_for_update in the class that `actor` belongs to? ditto for 
   `class DashboardDAO(DataAccessObject):  def update(dashboard, dashboard_params):` - this looks like a method that belongs to the Dashboard class.
   
   
   
   
   > 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] craig-rueda closed issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
craig-rueda closed issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] suddjian commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
suddjian commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586429496
 
 
   @metaperl 
   
   > who let these problems get into the codebase in the first place?
   
   Disorganization is a result of many people independently contributing to a project, focusing on shipping functional code quickly. This has worked well for Superset, but has resulted in a fair amount of tech debt. This SIP is about putting some structure in place that can help with future efforts. Blaming committers for these issues is not helpful or appropriate.
   
   > I have no idea what you are suggesting.
   
   The idea is that backend code should be focused on the entities being acted on, rather than the specific part of Superset that is interacting with that entity. For example, SQL Lab and Explore ("products") interact with Datasources, Queries, Users ("objects"). This is proposing that the backend should be focused on providing APIs that are object-centric (`POST /datasources`, `GET /queries`, rather than APIs that are specific to any one product within Superset.
   
   > IMHO the code is well-written. I had to trace throgh 7 levels of OO inheritance
   
   I would argue that you should not have to delve through 7 levels of inheritance to understand what a piece of code is doing. Simpler is better.
   
   > Isn't there a business reason for everything you display? how easy is it to separate the two?
   
   You may want to take a look at principles of [Clean Architecture](https://pusher.com/tutorials/clean-architecture-introduction) for some information on this.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-588373465
 
 
   @john-bodley My understanding here is a little fuzzy too, so I'll let @dpgaspar correct me if I'm wrong. FAB creates blueprints under the hood whenever we use it to define endpoints. `ModelView` and `ModelViewApi` classes create collections of endpoints under a blueprint, which is then registered on the application when we call `appbuilder.add_view...` or a similar function. If we wanted to group those items together into an API blueprint, it would not be possible since that would result in nested blueprints.
   
   @dpgaspar how far away from the truth am I :)? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   All my python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities (and the addition of a "modules" folder) especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       """Dynamically load modules in the `modules/` folder.
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module blueprint and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   
   def register_views(app, *view_modules):
       """Register views, including API endpoints"""
       for v in view_modules:
           if hasattr(v, 'bp'):
               app.register_blueprint(v.bp)
           if hasattr(v, 'setup'):
               v.setup(app)
   ```
   
   A `boot.py` file is not necessary for this to work.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   My python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities and addition of a "modules" folder especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       """Dynamically load modules in the `modules/` folder.
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module blueprint and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   
   def register_views(app, *view_modules):
       """Register views, including API endpoints"""
       for v in view_modules:
           if hasattr(v, 'bp'):
               app.register_blueprint(v.bp)
           if hasattr(v, 'setup'):
               v.setup(app)
   ```
   
   It looks like the `boot.py` file will achieve the same goal.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586414197
 
 
   First, thanks for your comments! I appreciate the community taking the issue seriously.
   
   @DiggidyDave I agree that empty `__init__.py` files will help substantially. @dpgaspar has a PR open right now that's linked to this issue that demonstrates the patterns we're suggesting. IMHO the result is very readable and easy to work with. We hope for this SIP to be a North Star that we gradually work towards. I like your suggestion of implementing valuable bits in chunks. I'd recommend being pragmatic and attacking small enough pieces that they are comprehensible. One of the areas I'm most excited to tackle is slimming down the model layer using DAOs and commands - I think models are far too thick at the moment, but I would treat this work alone as multiple coherent changes rather than one large PR due to the potential complexity involved.
   
   @john-bodley Superset currently has examples of both the Command pattern and a pattern similar to DAOs. Alembic's individual migration classes are an example of a command and FAB's shared filters are very similar to a DAO, in that they encapsulate shared query logic outside of the model. A DAO allows us to extract query logic from model classes to a clear destination. As we refactor, we need a target to move towards. Blueprints separate functionality well at the API layer, but will not provide a clean abstraction at the model or business logic layer without a few other patterns to lean on. I agree that fully refactoring the code is a large unit of work and expect this to be a target we move towards gradually where it makes sense. Currently the model layer is very thick and we have a lot of business logic in endpoint code. To me this is a difficult problem to solve without a shared concept of where this code should go moving forward and a pattern to leverage.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization

Posted by GitBox <gi...@apache.org>.
ktmud edited a comment on issue #9077: [SIP-35] Proposal for Improving Superset’s Python Code Organization
URL: https://github.com/apache/incubator-superset/issues/9077#issuecomment-586536501
 
 
   Chime in my two cents as a passerby.  I really like that different modules have their independent folders and we are introducing conventions and increasing consistency. I think an intuitive code structure and consistent paradigms help a lot for future contributors to understand the code.
   
   Sharing the approach I took in one of my older projects ([a general purpose CMS](https://github.com/ktmud/david/tree/master/david
   )).  It's slightly different than what is proposed, but very similar in concept.
   
   All my python files are organized [like this](https://github.com/ktmud/david/tree/master/david):
   
   ```
   ├── __init__.py
   ├── app.py
   ├── core/   # core data models, user accounts, base data entities etc
   ├── ext/      # 3rd-party/non-essential extensions, s3 uploads, etc
   ├── lib/       # db mixins, cache decorators, utilities
   ├── modules/   # modules as organized in the main menu (URL route)
   │   ├── __init__.py  # import and compose all modules
   │   ├── module_1
   │   │   ├── __init__.py
   │   │   ├── admin.py   # admin views, a separate Blueprint for editing objects
   │   │   ├── model.py   # operation on data entities, objects for data access
   │   │   └── view.py     # User facing Blueprint for "module_1", handlers parsing parameters
   │   ├── module_2
   │   │   ├── __init__.py
   │   │   ├── admin.py
   │   │   ├── model.py
   │   │   └── view.py
   ```
   
   I find the separation of core/shared functionalities (and the addition of a "modules" folder) especially useful because it makes the whole project easier to navigate.
   
   Loading modules  and booting the app is as simple as [this](https://github.com/ktmud/david/blob/master/david/ext/modules.py). 
   
   ```python
   def load_module(app, name):
       """Dynamically load modules in the `modules/` folder.
       package = 'david.modules.%s' % name
       mod = __import__(package, fromlist=['admin', 'bp', 'setup', 'view'])
   
       # register module blueprint and setup
       register_views(app, mod)
   
       if hasattr(mod, 'view'):
           register_views(app, mod.view)
   
       return mod
   
   def register_views(app, *view_modules):
       """Register views, including API endpoints"""
       for v in view_modules:
           if hasattr(v, 'bp'):
               app.register_blueprint(v.bp)
           if hasattr(v, 'setup'):
               v.setup(app)
   ```
   
   It looks like the `boot.py` file will achieve the same goal.
   
   Hope this helps.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org