Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/24 17:14:06 UTC

[GitHub] [spark] rdblue commented on issue #26214: [SPARK-29558][SQL] ResolveTables and ResolveRelations should be order-insensitive

URL: https://github.com/apache/spark/pull/26214#issuecomment-546015539
 
 
   I don't think the approach in this PR is a good idea.
   
   There are a few guiding principles that we should follow:
   
   1. **Modify v1 as little as possible**: we need to avoid changing any v1 behavior. That's why we've been very careful to avoid modifying how the existing read and write paths work. I think it introduces too much unnecessary risk to rewrite the v1 resolution rule just before a release.
   2. **Keep v2 separate**: we don't want to have to rewrite these rules again when v1 is removed in the future. More importantly, we don't want to use rules like `ResolveRelations` in v2. That rule is over-complicated (it uses recursion instead of multiple runs), mixes several concerns together, and doesn't fit the design of the analyzer (it assumes it is the only resolution rule).
   3. **Make incremental changes**: we want to avoid completely rewriting v2 resolution to fix a given problem.
   
   I think that merging v2 table resolution into the v1 rule is the wrong direction. I like the approach @brkyvz suggested: apply the `ResolveTables` rule as part of `ResolveRelations`, so that it is maintained independently and `ResolveRelations` needs fewer changes.
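   
   To make the delegation idea concrete, here is a rough toy sketch. It is not Spark code: `Plan`, `Unresolved`, `V1Relation`, `V2Relation`, `Rule`, the two `...Sketch` objects, and the `testcat` catalog name are all hypothetical stand-ins. It only illustrates that when the v1 rule calls the v2 rule first, the result should not depend on which rule the analyzer happens to run first, while the v2 logic stays in its own rule.
   
   ```scala
   // Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
   sealed trait Plan
   case class Unresolved(name: Seq[String]) extends Plan
   case class V1Relation(table: String) extends Plan
   case class V2Relation(catalog: String, table: String) extends Plan
   
   trait Rule { def apply(plan: Plan): Plan }
   
   // v2 lookup lives in its own rule and can be maintained independently.
   object ResolveTablesSketch extends Rule {
     private val v2Catalogs = Set("testcat") // assumed catalog name
     def apply(plan: Plan): Plan = plan match {
       case Unresolved(Seq(cat, tbl)) if v2Catalogs(cat) => V2Relation(cat, tbl)
       case other => other
     }
   }
   
   // The v1 rule delegates to the v2 rule first, then falls back to its
   // own (here trivial) session catalog logic for whatever is left.
   object ResolveRelationsSketch extends Rule {
     def apply(plan: Plan): Plan = ResolveTablesSketch(plan) match {
       case Unresolved(Seq(tbl)) => V1Relation(tbl)
       case other => other
     }
   }
   
   object Demo {
     def main(args: Array[String]): Unit = {
       // A two-part name with a known v2 catalog resolves through the v2 rule.
       println(ResolveRelationsSketch(Unresolved(Seq("testcat", "t"))))
       // A single-part name falls through to the v1 session catalog path.
       println(ResolveRelationsSketch(Unresolved(Seq("t"))))
     }
   }
   ```
   
   In this sketch the only change to the v1 rule is the single delegating call, which is the property I care about.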
   
   The approach in #25955 is another way to go. Only session catalog tables are matched by `ResolveRelations`, so I think it is fine to convert those to `UnresolvedCatalogTable` and then to a v2 relation in `FindDataSourceTables`.
