You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/06/18 19:57:06 UTC

[GitHub] [beam] TheNeuralBit opened a new pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

TheNeuralBit opened a new pull request #12035:
URL: https://github.com/apache/beam/pull/12035


   Adds a third option, REQUIRE_MISSING, where we throw an error if there's a null value. Only a missing field can encode an error.
   
   This also tweaks RowJsonTest so we can re-use ValueTest to exercise the new options.
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/)
   Portable | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] pabloem commented on a change in pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
pabloem commented on a change in pull request #12035:
URL: https://github.com/apache/beam/pull/12035#discussion_r443774541



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJson.java
##########
@@ -159,9 +159,16 @@ public String toString() {
 
     private static final boolean SEQUENTIAL = false;
 
-    public enum MissingFieldBehavior {
-      ALLOW_MISSING,
-      REQUIRE_NULL
+    // Configure how we should attempt to parse null values when deserializing JSON
+    public enum NullBehavior {
+      // Interpret either a missing field or a null value is a null
+      ALLOW_MISSING_OR_NULL,
+      // Require nulls to be represented as a null value. If the key is missing for a nullable field
+      // an error will be thrown.
+      REQUIRE_NULL,
+      // Require nulls to be represented as a missing field. If the key exists with a null value an

Review comment:
       Maybe could rephrase the first phrase like `Requires fields with null values to be represented with a missing key.` ? - (or something else? I found it a little confusing the first time I read it) 
   
   - Requires fields with null values to be represented with a missing key.
   - Requires fields with null values to be missing from JSON representation.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-651251071


   Run SQL PostCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit merged pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #12035:
URL: https://github.com/apache/beam/pull/12035


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] reubenvanammers commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
reubenvanammers commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-650970737


   Nice, good to see some continuation on this. LGTM FWIW.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-650394222


   Thanks @pabloem, great suggestion. I made the documentation into javadoc, and also tried to make it more clear, PTAL
   
   I also changed the default to `ACCEPT_MISSING_OR_NULL` I think this is what most users will want, and its consistent with the ["postel principle"](https://en.wikipedia.org/wiki/Robustness_principle) (which I just learned about :P): "be conservative in what you do, be liberal in what you accept from others."
   
   Technically I think that is a breaking change since its possible people are relying on the old behavior, so I added an entry to that effect in CHANGES.md
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-652029350


   Run Java PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-650466112


   Run SQL PostCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-651415998


   Run Java PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-652067854


   Run Java PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] pabloem commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-650470855


   LGTM thanks Brian!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] pabloem commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
pabloem commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-652065015


   Run Java PreCommit
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer, Make ACCEPT_MISSING_OR_NULL the default behavior.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-651309044


   Run Java PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12035: [BEAM-10220] Add support for REQUIRE_MISSING in RowJsonDeserializer

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12035:
URL: https://github.com/apache/beam/pull/12035#issuecomment-646275195






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org