You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "lostluck (via GitHub)" <gi...@apache.org> on 2023/02/08 17:05:18 UTC

[GitHub] [beam] lostluck commented on a diff in pull request #25352: add write options with create disposition for bigqueryio

lostluck commented on code in PR #25352:
URL: https://github.com/apache/beam/pull/25352#discussion_r1100420083


##########
sdks/go/pkg/beam/io/bigqueryio/bigquery.go:
##########
@@ -208,29 +208,58 @@ func mustParseTable(table string) QualifiedTableName {
 // TODO(herohde) 7/14/2017: allow CreateDispositions and WriteDispositions. The default

Review Comment:
   Please update the TODO, since it's at least partially accomplished now.



##########
sdks/go/pkg/beam/io/bigqueryio/bigquery.go:
##########
@@ -208,29 +208,58 @@ func mustParseTable(table string) QualifiedTableName {
 // TODO(herohde) 7/14/2017: allow CreateDispositions and WriteDispositions. The default
 // is not quite what the Dataflow examples do.
 
+// WriteOptions represents additional options for executing a write
+type WriteOptions struct {
+	// CreateDisposition specifies the circumstances under which destination table will be created
+	CreateDisposition bigquery.TableCreateDisposition
+}
+
+// newWriteOptions creates a new instance of WriteOptions
+// "CreateIfNeeded" is set as the default write disposition
+func newWriteOptions() WriteOptions {
+	return WriteOptions{CreateDisposition: bigquery.CreateIfNeeded}
+}
+
+// WithCreateDisposition specifies the circumstances under which destination table will be created
+func WithCreateDisposition(cd bigquery.TableCreateDisposition) func(wo *WriteOptions) error {
+	return func(wo *WriteOptions) error {
+		wo.CreateDisposition = cd
+		return nil
+	}
+}
+
 // Write writes the elements of the given PCollection<T> to bigquery. T is required
 // to be the schema type.
-func Write(s beam.Scope, project, table string, col beam.PCollection) {
+func Write(s beam.Scope, project, table string, col beam.PCollection, options ...func(*WriteOptions) error) {

Review Comment:
   Fun fact: This is actually a backwards incompatible change since if some user somewhere were metaprograming and passing around the `Write` function, it's type would have changed!
   
   But since the Go SDK isn't expecting that level of metaprogramming, we accept this sort of breaking change, since it's a vanishingly small fraction of users who would be affected. They already know they're being tricky.
   
   And on the plus side, you've made it so we don't need to make this style of breaking change for this function in the future. Well done!



##########
sdks/go/pkg/beam/io/bigqueryio/bigquery.go:
##########
@@ -208,29 +208,58 @@ func mustParseTable(table string) QualifiedTableName {
 // TODO(herohde) 7/14/2017: allow CreateDispositions and WriteDispositions. The default
 // is not quite what the Dataflow examples do.
 
+// WriteOptions represents additional options for executing a write
+type WriteOptions struct {
+	// CreateDisposition specifies the circumstances under which destination table will be created
+	CreateDisposition bigquery.TableCreateDisposition
+}
+
+// newWriteOptions creates a new instance of WriteOptions
+// "CreateIfNeeded" is set as the default write disposition
+func newWriteOptions() WriteOptions {
+	return WriteOptions{CreateDisposition: bigquery.CreateIfNeeded}
+}
+
+// WithCreateDisposition specifies the circumstances under which destination table will be created
+func WithCreateDisposition(cd bigquery.TableCreateDisposition) func(wo *WriteOptions) error {

Review Comment:
   Instead of using a vanilla function, declare the function type *as* the exported options type.
   
   `type WriteOption func(*writeOptions)`
   
   This avoids the "you're using unexported types in an Exported API" style messages, and provides something we can document on.
   
   This is preferable to simply having an aribitrary visible struct, when the intended mode isn't for users to write their own options functions. Having both "user can write this function" and "helper methods" it makes the API harder to use than "call functions X Y Z to set X Y Z"



##########
sdks/go/pkg/beam/io/bigqueryio/bigquery.go:
##########
@@ -208,29 +208,58 @@ func mustParseTable(table string) QualifiedTableName {
 // TODO(herohde) 7/14/2017: allow CreateDispositions and WriteDispositions. The default
 // is not quite what the Dataflow examples do.
 
+// WriteOptions represents additional options for executing a write
+type WriteOptions struct {
+	// CreateDisposition specifies the circumstances under which destination table will be created
+	CreateDisposition bigquery.TableCreateDisposition

Review Comment:
   I'm a tiny bit concerned whether this will serialize/deserialize correctly since it's a string derived type.  I think it should work, since DoFns are still serialized with JSON and that supports this behavior. 
   
   https://pkg.go.dev/cloud.google.com/go/bigquery#TableCreateDisposition
   
   If we finish moving DoFn serialization to beam schemas, we'll simply have to ensure that this behavior is supported.



##########
sdks/go/pkg/beam/io/bigqueryio/bigquery.go:
##########
@@ -208,29 +208,58 @@ func mustParseTable(table string) QualifiedTableName {
 // TODO(herohde) 7/14/2017: allow CreateDispositions and WriteDispositions. The default
 // is not quite what the Dataflow examples do.
 
+// WriteOptions represents additional options for executing a write
+type WriteOptions struct {

Review Comment:
   It's not necessary to export this type. Types can be unexported and still be serialized /deserialized properly. It's their field names that Must be exported.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org