Posted to issues@iceberg.apache.org by "wmoustafa (via GitHub)" <gi...@apache.org> on 2023/05/03 17:11:02 UTC

[GitHub] [iceberg] wmoustafa commented on a diff in pull request #7500: Views: Update spec with expectations on versions, representations, and dialects

wmoustafa commented on code in PR #7500:
URL: https://github.com/apache/iceberg/pull/7500#discussion_r1183980083


##########
format/view-spec.md:
##########
@@ -96,11 +96,15 @@ Summary is a string to string map of metadata about a view version. Common metad
 
 View definitions can be represented in multiple ways. Representations are documented ways to express a view definition.
 
-A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation to use.
+A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation(s) to use.
+
+View versions are immutable. Once a version is created, it cannot be changed. This means that representations for a version cannot be changed. If a view definition changes (or new representations are to be added), a new version must be created.
 
 Each representation is an object with at least one common field, `type`, that is one of the following:
 * `sql`: a SQL SELECT statement that defines the view
 
+In addition to `type`, each representation defines a `dialect`. A `dialect` is a string that identifies the query language dialect used in the representation. For example, `trino` or `spark`. A view version cannot have duplicate representations with the same `type` and `dialect`.

Review Comment:
   > This isn't true. Representations do not necessarily have a dialect. SQL does because it varies.
   
   The `dialect` field is required. I did not notice anything in the spec that says otherwise.
   
   > I would not expect other specifications to have the same variance.
   
   I was thinking of other languages such as `Datalog`, `SPARQL`, `Cypher`. Those could have dialects too.
   
   > Instead, I think it would be better to state that there should be only one of a given type. Did you want to allow multiple SQL representations with different dialects?
   
   I was actually under the impression that the spec allows multiple SQL representations with different dialects. We can discuss the pros and cons of each option. However, it sounds like if multiple types are allowed, then multiple dialects within a type could be allowed as well. A third option is to strictly allow one representation (one `type` and one `dialect`) to guarantee a canonical, source-of-truth (SOT) representation (I realize this is a big change to the spec, but I am putting it here since we are already discussing two other options).
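
To make the three options concrete, here is a minimal sketch of how each one could be checked against the representations of a single view version. It is only an illustration, not part of the spec or of any Iceberg API: the function `check_representations` and the policy names `per_dialect`, `per_type`, and `single` are hypothetical, and representations are modeled as plain dicts with `type` and `dialect` keys (the `sql` key in the sample data is likewise an assumption).

```python
from collections import Counter


def check_representations(representations, policy="per_dialect"):
    """Return a list of violation messages (empty if the list is valid).

    Hypothetical policies, one per option discussed above:
      per_dialect: at most one representation per (type, dialect) pair
      per_type:    at most one representation per type
      single:      at most one representation in total
    """
    if policy == "single":
        if len(representations) > 1:
            return ["more than one representation"]
        return []

    if policy == "per_dialect":
        keys = [(r["type"], r.get("dialect")) for r in representations]
    elif policy == "per_type":
        keys = [(r["type"],) for r in representations]
    else:
        raise ValueError(f"unknown policy: {policy}")

    # Any key that occurs more than once is a duplicate under this policy.
    counts = Counter(keys)
    return [f"duplicate representation for {key}" for key, n in counts.items() if n > 1]


# Sample data (assumed shape): two SQL representations in different dialects.
example = [
    {"type": "sql", "dialect": "trino", "sql": "SELECT 1"},
    {"type": "sql", "dialect": "spark", "sql": "SELECT 1"},
]

for name in ("per_dialect", "per_type", "single"):
    print(name, check_representations(example, name))
```

Under these assumptions, the sample passes only the `per_dialect` policy, which is the reading of the spec I had in mind; the other two policies reject it.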



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org