Posted to commits@beam.apache.org by ie...@apache.org on 2019/05/09 08:24:01 UTC
[beam] branch spark-runner_structured-streaming updated (241453b -> d08f110)
This is an automated email from the ASF dual-hosted git repository.
iemejia pushed a change to branch spark-runner_structured-streaming
in repository https://gitbox.apache.org/repos/asf/beam.git.
discard 241453b Limit the number of partitions to make tests go 300% faster
discard cc31b85 Add Batch Validates Runner tests for Structured Streaming Runner
discard 6ded5b6 Apply Spotless
discard b65fe7c Update javadoc
discard 4aae62b implement source.stop
discard 5cad1ce Ignore spark offsets (cf javadoc)
discard b6ddc0b Use PAssert in Spark Structured Streaming transform tests
discard dafcca8 Rename SparkPipelineResult to SparkStructuredStreamingPipelineResult. This is done to avoid an eventual collision with the one in SparkRunner; however, this cannot happen at this moment because it is package private, so it is also done for consistency.
discard 29e95af Add SparkStructuredStreamingPipelineOptions and SparkCommonPipelineOptions - SparkStructuredStreamingPipelineOptions was added to have the new runner rely only on its specific options.
discard 6df737a Fix logging levels in Spark Structured Streaming translation
discard cc6578e Fix spotless issues after rebase
discard 7037b4e Add doFnSchemaInformation to ParDo batch translation
discard 245d03b Fix non-vendored imports from Spark Streaming Runner classes
discard d076c28 Remove specific PipelineOptions and Registrar for Structured Streaming Runner
discard 0b39480 Rename Runner to SparkStructuredStreamingRunner
discard 4b93042 Remove spark-structured-streaming module
discard 5479ad0 Merge Spark Structured Streaming runner into main Spark module
discard 61f0e41 Fix access level issues, typos and modernize code to Java 8 style
discard 4b543ab Disable never ending test SimpleSourceTest.testUnboundedSource
discard c9eea62 Fix spotbugs warnings
discard 490a58f Fix invalid code style with spotless
discard 8ffd770 Deal with checkpoint and offset based read
discard a23ff40 Follow up on offsets for streaming source
discard 0825b88 Clean streaming source
discard e85353a Remove unneeded 0 arg constructor in batch source
discard 27800f7 Clean unneeded 0 arg constructor in batch source
discard 3166718 Specify checkpointLocation at the pipeline start
discard e93ae54 Add source streaming test
discard 98776fc Add transform translators registry in PipelineTranslatorStreaming
discard 39cfbde Add a TODO on spark output modes
discard bf6a5af Start streaming source
discard 21b91de Add streaming source initialisation
discard 4b4750c Add unchecked warning suppression
discard 25461de Added TODO comment for ReshuffleTranslatorBatch
discard 863cd8b Use CachedSideInputReader
discard 9a79a51 Don't use Reshuffle translation
discard 17dcd0d Fix CheckStyle violations
discard 0f6b0a7 Added SideInput support
discard 111db08 Temporary fix remove fail on warnings
discard cab0292 Fix build wordcount: to squash
discard 3d2b59b Fix javadoc
discard 3c05f5f Implement WindowAssignTest
discard 2e0d1f5 Implement WindowAssignTranslatorBatch
discard 4a6c08e Cleaning
discard 4f39c68 [TO UPGRADE WITH THE 2 SPARK RUNNERS BEFORE MERGE] Change the wordcount build to test on the new spark runner
discard d3ef07f Fix encoder bug in combinePerKey
discard e855cd3 Add explanation about receiving a Row as input in the combiner
discard 01c8714 Use more generic Row instead of GenericRowWithSchema
discard 634cf93 Fix combine: for an unknown reason GenericRowWithSchema is used as input of combine, so extract its content to be able to proceed
discard 8e5b2f3 Update test with Long
discard 915e304 Fix various type checking issues in Combine.Globally
discard bbcbf37 Get back to classes in translators resolution because URNs cannot translate Combine.Globally
discard d1d5d05 Cleaning
discard 3583f26 Add CombineGlobally translation to avoid translating Combine.perKey as a composite transform based on Combine.PerKey (which uses low perf GBK)
discard baaaab0 Introduce RowHelpers
discard 909b879 Add combinePerKey and CombineGlobally tests
discard d0e6fbf Fix combiner using KV as input, use binary encoders in place of accumulatorEncoder and outputEncoder, use helpers, spotless
discard 1eda6f7 Introduce WindowingHelpers (and helpers package) and use it in Pardo, GBK and CombinePerKey
discard a65b7a7 Improve type checking of Tuple2 encoder
discard 0f4f188 First version of combinePerKey
discard 7285708 Extract binary schema creation in a helper class
discard 0fdb194 Fix getSideInputs
discard b6be874 Generalize the use of SerializablePipelineOptions in place of (not serializable) PipelineOptions
discard f3e4c8c Rename SparkDoFnFilterFunction to DoFnFilterFunction for consistency
discard a8ef165 Add a test for the most simple possible Combine
discard 615820d Added "testTwoPardoInRow"
discard 8874bfd Fix for test elements container in GroupByKeyTest
discard 909c4ea Rename SparkOutputManager for consistency
discard 7361ce3 Fix kryo issue in GBK translator with a workaround
discard b47909b Simplify logic of ParDo translator
discard f05e378 Don't use deprecated sideInput.getWindowingStrategyInternal()
discard 4fba344 Rename pruneOutput() to pruneOutputFilteredByTag()
discard cecd658 Rename SparkSideInputReader class
discard 2eed3b7 Fixed Javadoc error
discard 8c8843a Apply spotless
discard 939ead2 Fix Encoders: create an Encoder for every manipulated type to avoid Spark falling back to GenericRowWithSchema, and cast to allow having Encoders with generic types such as WindowedValue<T> and get the type checking back
discard 624b144 Fail in case of having SideInputs or State/Timers
discard ed89f59 Add ComplexSourceTest
discard 6873814 Remove no more needed putDatasetRaw
discard 26e02e5 Port latest changes of ReadSourceTranslatorBatch to ReadSourceTranslatorStreaming
discard 52b2aff Fix type checking with Encoder of WindowedValue<T>
discard 5092bbb Add comments and TODO to GroupByKeyTranslatorBatch
discard d4a484b Add GroupByKeyTest
discard 2535ad1 Clean
discard 32ee7f7 Address minor review notes
discard 214a6dc Add ParDoTest
discard 8e56d0a Cleaning
discard 67cce09 Fix split bug
discard a1a2029 Remove bundleSize parameter and always use spark default parallelism
discard 4639bcb Cleaning
discard 9b00c53 Fix testMode output to comply with new binary schema
discard 59e9e8d Fix errorprone
discard f0cfa34 Comment schema choices
discard 98376d5 Serialize windowedValue to byte[] in source to be able to specify a binary dataset schema and deserialize windowedValue from Row to get a dataset<WindowedValue>
discard f543ba4 First attempt for ParDo primitive implementation
discard f668bb2 Add flatten test
discard 97c4386 Enable gradle build scan
discard 548279c Enable test mode
discard 1511106 Put all transform translators Serializable
discard ddb0991 Simplify beam reader creation as it is created once the source has already been partitioned
discard 27635ac Fix SourceTest
discard 31a2a48 Move SourceTest to same package as tested class
discard 710ad1b Add serialization test
discard fe1c367 Fix SerializationDebugger
discard 797c83f Add SerializationDebugger
discard d9f219c Fix serialization issues
discard 4654c13 Cleaning unneeded fields in DatasetReader
discard a2a31aa improve readability of options passing to the source
discard d5779c7 Fix pipeline triggering: use a spark action instead of writing the dataset
discard ef34903 Deactivate deps resolution forcing that prevent using correct spark transitive dep
discard 1ed6f24 Refactor SourceTest to a unit test instead of a main
discard b3bc0cf Checkstyle and Findbugs
discard 87748e7 Clean
discard dc6499c Add empty 0-arg constructor for mock source
discard 2516d5b Add a dummy schema for reader
discard c3dc452 Fix checkstyle
discard a7518b4 Use new PipelineOptionsSerializationUtils
discard 7008f97 Apply spotless
discard 77a86cc Add missing 0-arg public constructor
discard 1cba738 Wire real SourceTransform and not mock and update the test
discard 7371bcf Refactor DatasetSource fields
discard e6aab46 Pass Beam Source and PipelineOptions to the spark DataSource as serialized strings
discard bbfcd66 Move Source and translator mocks to a mock package.
discard e6ab6ca Add ReadSourceTranslatorStreaming
discard 5e9e576 Cleaning
discard a5c368e Use raw Encoder<WindowedValue> also in regular ReadSourceTranslatorBatch
discard 50bf3b0 Split batch and streaming sources and translators
discard 9cd03ce2 Run pipeline in batch mode or in streaming mode
discard 351b35f Move DatasetSourceMock to proper batch mode
discard 404fe6d clean deps
discard f595c1b Use raw WindowedValue so that spark Encoders could work (temporary)
discard 4947f2d fix mock, wire mock in translators and create a main test.
discard a6cf828 Add source mocks
discard 2bb484f Experiment over using spark Catalog to pass in Beam Source through spark Table
discard 18baa0f Improve type enforcement in ReadSourceTranslator
discard 4ec4334 Improve exception flow
discard 927e37e start source instantiation
discard 93c46b2 Apply spotless
discard 80076eb update TODO
discard e660845 Implement read transform
discard b8e04b8 Use Iterators.transform() to return Iterable
discard b0e8583 Add primitive GroupByKeyTranslatorBatch implementation
discard 9db56d8 Add Flatten transformation translator
discard 92dbb8a Create Datasets manipulation methods
discard 9077611 Create PCollections manipulation methods
discard 8410dc7 Add basic pipeline execution. Refactor translatePipeline() to return the translationContext on which we can run startPipeline()
discard 161a22c Added SparkRunnerRegistrar
discard 66a8a1e Add precise TODO for multiple TransformTranslator per transform URN
discard dd7985d Postpone batch qualifier in all class names for readability
discard 8d90727 Add TODOs
discard 0b8c6ef Make codestyle and findbugs happy
discard 847ed54 apply spotless for reformatting
discard ddaeff4 Move common translation context components to superclass
discard 389dc56 Move SparkTransformOverrides to correct package
discard 7575d81 Improve javadocs
discard cd9ebc3 Make transform translation clearer: renaming, comments
discard 41c4ca5 Refactoring: move batch/streaming common translation visitor and utility methods to PipelineTranslator; rename batch dedicated classes to Batch* to differentiate them from their streaming counterparts; introduce TranslationContext for common batch/streaming components
discard 0895448 Initialise BatchTranslationContext
discard 4c9401b Organise methods in PipelineTranslator
discard b8a5581 Renames: better differentiate pipeline translator from transform translator
discard 3b3e2f7 Wire node translators with pipeline translator
discard 861052a Add nodes translators structure
discard bd11fe7 Add global pipeline translation structure
discard f230100 Start pipeline translation
discard 9bf31ff Add SparkPipelineOptions
discard b8def9d Fix missing dep
discard 801aa0e Add an empty spark-structured-streaming runner project targeting spark 2.4.0
add acdaa78 [BEAM-7095] Upgrade to RabbitMQ amqp-client 4.9.3 in RabbitMqIO
add c2c371a Merge pull request #8329: [BEAM-7095] Upgrade to RabbitMQ amqp-client 4.9.3 in RabbitMqIO
add b29cc7c Mahatma Gandhi is spelt wrong.
add d11cb27 Mahatma Gandhi is spelt wrong.
add e99dd29 [BEAM-7100] BeamValuesRel should accept empty tuples
add 3128cf5 Merge pull request #8339 from amaliujia/rw_empty_join_on_one_side
add f69e960 Update IOIT Dashboards URL
add d368c9d Merge pull request #8356: [BEAM-6627] Update IOIT Dashboards URL in docs
add 332d5c1 Stage Dataflow worker jar in Dataflow PostCommit tests on Python 3.
add d76a9fe Fix bug with use_fn_api_runner in Python.
add ccbad61 Merge pull request #8352 from tvalentyn/stage_df_worker_jar
add 9a99664 [SQL] Add Data Catalog Table Provider
add 9c8a8dc Merge pull request #8349 from akedin/datacatalog-table-provider
add 5ff8475 [BEAM-7088] Implement switch to using MonitoringInfo labels for Name and Namespace. (#8316)
add fbe357f [BEAM-7118] Link Spark portable runner design doc on website
add 40fe671 Merge pull request #8362 from ibzib/spark-doc
add 7857fa7 [BEAM-7109] Do not reconnect logging at termination
add d866fed Merge pull request #8367 from angoenka/logging_thread_leak
add 1eedc47 [BEAM-7119] Change Nexmark URL to use an archive.org link
add 70941d2 Merge pull request #8309: [BEAM-7119] Change Nexmark URL to use an archive.org link
add e845da1 [BEAM-6730] Update docs for Python generate_sequence transform
add 23276fe Merge pull request #8321: [BEAM-6730] Update docs for Python generate_sequence transform
add c1c6d29 Fix NPE in nullable array/map fields in Row
add f702098 Merge pull request #8368: [BEAM-7125] Row throws NPE when comparing instances with complex null values
add 55fa76b [BEAM-7036] Spark portable runner: stage files to run on 'remote' cluster
add 20411ee Merge pull request #8254: [BEAM-7036] Spark portable runner: stage files to run on 'remote' cluster
add 1fadc8b [BEAM-7122] Adding an accumulating trigger test (#8364)
add b23ef64 [BEAM-7112] Defer state cleanup timers to avoid interference with user timers
add ab82768 Merge pull request #8351: [BEAM-7112] [flink] Defer state cleanup till bundle completion
add dfcc9e5 Blog post for Season of Docs (#8365)
add 1a56c38 Updates container image used by Dataflow Runner in unreleased SDKs.
add 0a40717 Merge pull request #8374: Updates container image used by Dataflow Runner in unreleased SDKs.
add e28b2f0 [BEAM-6660] Fix BQ KMS integration test to use NativeWriter
add 006b741 Merge pull request #7976 from udim/cmek-bq-native
add fc6a7de Add interactive "Try Apache Beam" page
add bdb1a71 Merge pull request #7679: [BEAM-6557] Adds an interactive "Try Apache Beam" page
add 9cc3f2f Adds two existing cross-language tests to Python post-commit.
add a67ea9e Merge pull request #8274: [BEAM-6962] Adds two existing cross-language tests to Python post-commit
add 68101d9 BEAM-6919 switch snapshot publishing to ubuntu labelled agents
add 443412d Merge pull request #8373: BEAM-6919 switch snapshot publishing to ubuntu labelled agents
add 0eade2a [BEAM-7128] Parallelism is unavailable when applying ReplacementTransforms
add dc7b624 Merge pull request #8372: [BEAM-7128] Parallelism is unavailable when applying ReplacementTransforms
add 43be51f [BEAM-6908] Support Python3 performance benchmarks - part 2
add c97e391 Merge pull request #8324 from markflyhigh/py3-perf
add c9befb9 Do not use snappy if python-snappy is not installed
add 50cf84f fix
add 4b3d89d fix
add 0c46d2e Merge pull request #8378 from apache/aaltay-patch-1
add a847ac2 [BEAM-5503] Update BigQueryIO time partitioning doc snippet
add 865727c Merge pull request #8384: [BEAM-5503] Update BigQueryIO time partitioning doc snippet
add 88875bc Add null check to fieldToAvatica
add 1075ff3 Merge pull request #8375: [BEAM-7129] BeamEnumerableConverter does not handle null values for all types
add 5bfa896 [BEAM-7087] Creating "errors" package internal to Beam Go SDK
add de2d929 Merge pull request #8369 from youngoli/beam7087
add 9424517 [BEAM-6760] website: disable testing external links by default
add 5dfb1a0 add website postcommit test that checks external links
add 47d44b1 Merge pull request #8318 from ibzib/test-website-disable-external
add c8fc0c3 [SQL] expose a public entry for toRowList.
add dd3737e Merge pull request #8382: [SQL] expose a public entry for toRowList in BeamEnumerableConverter
add 1840b43 [BEAM-7015] Remove duplicate standard_coders.yaml
add a1680b5 Merge pull request #8319: [BEAM-7015] Remove duplicate standard_coders.yaml
add ad14cb9 Fix Python wordcount benchmark by specifying beam_it_module
add 8ea9a8e Merge pull request #8389 from markflyhigh/fix-py-benchmark
add 6f3bcd9 [BEAM-7012] Support TestStream in streaming Flink Runner
add c44c30f Merge pull request #8383: [BEAM-7012] Support TestStream in streaming Flink Runner
add a13121c refactor: standardize the kotlin samples
add b6c3c74 Merge pull request #8392 from harshithdwivedi/stdKotlin
add 86b5b19 [BEAM-6908] New Jenkins branch for Python35 benchmark
add d9c2de8 Merge pull request #8393 from markflyhigh/py35-benchmark
add 329dcd7 Merge pull request #8388: Add OSSIndex CVE audit gradle plugin
add 8821ed8 [BEAM-4552] Use Spark AccumulatorsV2 for metrics
add 36b839e Merge pull request #8387: [BEAM-4552] Use Spark AccumulatorsV2 API
add 7c780fd Fix small bug in top documentation
add 270d8fe Merge pull request #8401 Fix small bug in top documentation
add a5f3a75 [BEAM-7070] JOIN condition should accept field access
add 6622642 [sql] ignore Nexmark SQL queries that have non-equal joins.
add 152c6e0 [sql] generalize RexInputRef and RexFieldAccess in JOIN.
add 2cb44a8 Merge pull request #8301 from amaliujia/rw-improve_join_condition
add 4702dbb [BEAM-3863] Ensure correct firing of processing time timers
add fa2369c Merge pull request #8366: [BEAM-3863] Ensure correct firing of processing time timers
add b296dc9 Merge pull request #8385: Fix broken parsing of uuid
add 22d02e4 Website changes for 2.12.0
add 9a23704 Merge pull request #8215: Website changes for 2.12.0
add 0e06e16 Update release guide with feedback from 2.12.0 release
add b529db9 Merge pull request #8315: Update release guide with feedback from 2.12.0 release
add 22cdd2c Add 2.12.0 Blog post.
add b771c85 Merge pull request #8314: Add 2.12.0 Blog post.
add 8b4de16 [BEAM-5709] Changing sleeps to CountdownLatch in test
add f1330e7 Merge pull request #8395 from youngoli/beam5709
add 563fa37 [BEAM-7139] Blogpost for Kotlin Samples (#8391)
add aebf831 [BEAM-7029] Add KafkaIO.Write as an external transform
add 077223e Merge pull request #8322: [BEAM-7029] Add KafkaIO.Write as an external transform
add a9ccb55 Increase Flink parallelism for ValidatesRunner tests
add d46da92 Merge pull request #8403: Increase Flink parallelism for ValidatesRunner tests
add 0b2db56 [BEAM-7162] Add ValueProvider to CassandraIO Write
add c38666a Merge pull request #8413: [BEAM-7162] Add ValueProvider to CassandraIO Write
add 4b67847 [BEAM-7110] Add Spark master option to SparkJobServerDriver
add 81faf35 Merge pull request #8379 from ibzib/spark-master2
add 249a5ce [BEAM-7133] make Spark job server Gradle task pass along system properties
add f346918 Merge pull request #8386 from ibzib/spark-properties
add 9abf706 Add some pydoc for runner classes in Python
add 516cdb6 Merge pull request #8415 from pabloem/doc-fn
add bb3ac16 [BEAM-6479] Deprecate AvroIO.RecordFormatter
add e5cf1e8 Merge pull request #8418: [BEAM-6479] Deprecate AvroIO.RecordFormatter
add 77b295b Merge pull request #8311: Allow Schema field selections in DoFn using NewDoFn injection
add c565881 [BEAM-2939] Support SDF expansion in the Flink runner.
add 365c99d Merge pull request #8412 [BEAM-2939] Support basic SDF expansion for the Flink runner.
add 2729adf [BEAM-7076] Update Spark runner to use spark version 2.4.2
add ac9f73a Merge pull request #8423: [BEAM-7076] Update Spark runner to use spark version 2.4.2
add a8f4e22 [BEAM-7170] Fix exception when retrieving ExpansionServer port after closing
add 7e1de51 Merge pull request #8424: [BEAM-7170] Fix exception when retrieving ExpansionServer port after closing
add b8db1ee [BEAM-5311] Delete docker build images
add 8067424 Merge pull request #8419: [BEAM-5311] Delete docker build images
add aca9523 [BEAM-7112] Timer race with state cleanup - take two
add 2b111c4 Merge pull request #8399: [BEAM-7112] Prevent race condition between user and cleanup timers
add fbf8704 Python Datastore IO using google-cloud-datastore
add 8d1ff73 Merge pull request #8262: [BEAM-4543] Python Datastore IO using google-cloud-datastore
add ce256ba Adds a possibility to pass a project that hosts Cloud Spanner instance to Cloud Spanner IO ITs.
add 6567f16 Merge pull request #8404: [BEAM-1542] Add a possibility to pass a project that hosts Cloud Spanner instance.
add 6953f0a Update dataflow container version to beam-master-20190426
add 648b5bd Merge pull request #8420 from ajamato/dataflow_release_ver
add 0f52110 [BEAM-7087] Updating Go SDK errors in base beam directory
add 6d180cd [BEAM-7087] Moving "errors" package to its own subdirectory.
add cdebe65 Merge pull request #8406 from youngoli/beam7087
add 5ed2019 [BEAM-6526] Add ReadFiles and ParseFiles transforms for AvroIO
add 3916fe8 Merge pull request #8414: [BEAM-6526] Add ReadFiles and ParseFiles transforms for AvroIO
add 340145d Updates Dataflow runner to support external ParDos.
add ef4b2ef Merge pull request #8270: [BEAM-6894] Support for constructing a Dataflow job request that includes cross-language transforms
add 0b3b18c Ensure that all nested schemas are given ids, and fix bug where nullable was not propagated to the proto.
add 2738c16 Merge pull request #8422: [BEAM-7002] Fix failures in SchemaCoder
add 4c654122 [BEAM-7072][SQL][Nexmark] Disable Query5
add ef67378 Merge pull request #8431 from akedin/disable-sqlquery5
add dfe3646 [BEAM-7178] Adding simple package comment to Go "errors" package.
add 5024e90 Merge pull request #8435 from youngoli/beam7178
add 8403313 [BEAM-7157] Allow creation of BinaryCombineFn from lambdas.
add 85af6f1 fixup
add 4389dcf Merge pull request #8409 [BEAM-7157] Allow creation of BinaryCombineFn from lambdas.
add 28d8115 Avoid depending on context formating in hooks_test
add 190162d Merge pull request #8405 from lostluck/fixTest
add 0dddd45 Support the newest version of httplib2 (#8263)
add 2cf6a20 [BEAM-6621][BEAM-6624] add direct runner and dataflow runner it test suites for python 3.6
add 9814ba9 Merge pull request #8381 from Juta/it-tests
add 3839281 [BEAM-3344] Changes for serialize/deserialize PipelineOptions via jackson in DirectRunner (#8302)
add 3a43442 Add clarifying comment for MonitoringInfoLabel usage.
add 21f6320 Merge pull request #8440 from ajamato/patch-3
add a0515de [BEAM-7166] Add more checks on join condition.
add 12b0493 Merge pull request #8421 from amaliujia/rw-more_checks_on_join_condition
add 73be1e3 withNumFileShards must be used when using withTriggeringFrequency
add fdf84ab Merge pull request #8244 from ttanay/with-num-file-shards
add 62741ed [BEAM-7087] Updating Go SDK errors (Part 1)
add 70cf8c6 Merge pull request #8433 from youngoli/beam7087-3
add 3b2c901 Bumps down the httplib2 version since it conflicts with 'googledatastore' package
add 6b43523 Merge pull request #8449: Bumps down the httplib2 version since it conflicts with 'googledatastore' package
add f2601b1 Revert "[BEAM-5709] Changing sleeps to CountdownLatch in test"
add 504dd76 Merge pull request #8446 from youngoli/revert-8395-beam5709
add fff46e1 [BEAM-6669] Set Dataflow KMS key name (#8296)
add 6cd47fc [BEAM-7196] Add Display Data to FileIO Match/MatchAll
add b1da963 Merge pull request #8451: [BEAM-7196] Add Display Data to FileIO Match/MatchAll
add 5c6e371 [BEAM-7176] don't reuse Spark context in tests (causes OOM)
add 0302a2c Merge pull request #8444: [BEAM-7176] don't reuse Spark context in tests (causes OOM)
add 40076e6 Add pip check invocations to all tox environments.
add f8af79d Merge pull request #8450: Add pip check invocations to all tox environments.
add 77294ed [BEAM-7179] Correct file extension (#8434)
add 99c4de4 Downgrade logging level to avoid log spam
add b419ac4 Merge pull request #8459 from Ardagan/ReduceLogSpam
add 4859bca [BEAM-7201] Go README: update task name
add 33e6b72 Merge pull request #8461 from ibzib/goLock
add a53180e [BEAM-6769] write bytes to bigquery in python 2
add 615013b Merge pull request #8047 from Juta/bq-io
add 62bee7b Fixing fileio tests for windows
add 03ede84 Merge pull request #8460 from pabloem/fileio-win
add d00210f [BEAM-6526] Remove unneeded methods in AvroIO ReadFiles and ParseFiles
add f08a2d3 [BEAM-6526] Refactor AvroIO Read/Parse to use FileIO + ReadFiles/ParseFiles
add 7178882 [BEAM-6526] Add populateDisplayData on AvroIO.ReadFiles
add b20dd77 Merge pull request #8442: [BEAM-6526] Remove unneeded methods in AvroIO ReadFiles and ParseFiles
add 4474978 [BEAM-7193] ParDoLifecycleTest: remove duplicated inner class
add 9f3592c Merge pull request #8445: [BEAM-7193] ParDoLifecycleTest: remove duplicated inner class
add 8f4569f [BEAM-7171] Ensure bundle finalization during Flink checkpoint
add 848f8bb Merge pull request #8426: [BEAM-7171] Ensure bundle finalization during Flink checkpoint
add d3614c2 [website] Link design document on cross-language and legacy IO
add 5ff0a4f Merge pull request #8468: [website] Link design document on cross-language and legacy IO
add 1d42409 [BEAM-7205] Remove MongoDb withKeepAlive configuration
add b5690f2 Merge pull request #8467: [BEAM-7205] Remove MongoDb withKeepAlive configuration
add 54c46b2 [BEAM-6850] Use HadoopFormatIOIT for performance/integration testing
add e37a922 [BEAM-6850] Add metrics collection to HadoopFormatIOIT
add 3d61429 Merge pull request #8427: [BEAM-6850] Use HadoopFormatIOIT for performance/integration testing
add 45b0325 Use TFRecord to store intermediate cache results using PCollection's PCoder.
add ce77db1 Merge pull request #8458 from leo-unc/feature/cache_manager_tfrecord
add 5c2e983 FnApiMonitoringInfoToCounterUpdateTranformer with User counters
add 0a98fce Merge pull request #8456 from Ardagan/FixUserCounters
add d6bbabd [BEAM-7213] fix beam_PostRelease_NightlySnapshot
add 2d4d0af Merge pull request #8471: [BEAM-7213] fix beam_PostRelease_NightlySnapshot
add d0343a1 [BEAM-7211] Implement HasDefaultTracker interface in ByteKeyRange
add 3f9596e [BEAM-7211] Implement HasDefaultTracker interface in ByteKeyRange
add 6d4771b Allow inference of CombiningValueState coders.
add a434c9e Merge pull request #8402 from robertwb/infer-state-coder
add c02c719 [BEAM-7202] fix inventory jobs
add e0066fd Merge pull request #8463: [BEAM-7202] fix inventory jobs
add 21f61dd [BEAM-4374] Emit SampledByteCount distribution tuple system metric from Python SDK (@Ardagan co-contributed)
add d4afbab Merge pull request #8062 from ajamato/mean_byte_count
add 6edf005 [BEAM-7154] Updating Go SDK errors (Part 2)
add 75eee78 Merge pull request #8462 from youngoli/beam7154
add b343782 Move schema assignment onto Create builder
add 2f821dc Merge pull request #8479 from TheNeuralBit/empty-value-schema
add e655d40 [BEAM-7062] Fix pydoc for @retry.with_exponential_backoff
add 513acbf Make utility functions private ...
add d742189 Merge pull request #8284 from udim/docs-annotations
add fd946ac Adding documentation to DirectRunner functions. (#8464)
add c6f0057 Update beam_fn_api.proto to specify allowed split points. (#8476)
add 2b4c252 Update Dataflow BEAM_CONTAINER_VERSION
add 6d4ab39 Merge pull request #8470: Update Dataflow Python BEAM_CONTAINER_VERSION
add 5e3df08 [BEAM-6859] align teardown with setup calls also for empty streaming batches
add 0606ba8 Merge pull request #8443 from adude3141/BEAM-6859
add 1caca11 [BEAM-6247] Remove deprecated module “hadoop-input-format”
add 0e581d4 Merge pull request #8482: [BEAM-6247] Remove deprecated module “hadoop-input-format”
add 1c26a51 [BEAM-7192] Fix partitioning of buffered elements during checkpointing
add f8d6b90 Merge pull request #8441: [BEAM-7192] Fix partitioning of buffered elements during checkpointing
add 79d8ca2 Update setup.py for pyarrow
add 8964024 Merge pull request #8484 from apache/aaltay-patch-1
add 64ee8d4 [BEAM-6605] Deprecate TextIO.readAll and TextIO.ReadAll transform
add d279404 [BEAM-6605] Refactor TextIO.Read and its Tests to use FileIO + ReadFiles
add 85e6a52 Merge pull request #8465: [BEAM-6605] Deprecate TextIO.readAll and TextIO.ReadAll transform
add c14fbcf [BEAM-6606] Deprecate AvroIO ReadAll and ParseAll transforms
add 0cc6b67 Merge pull request #8466: [BEAM-6606] Deprecate AvroIO ReadAll and ParseAll transforms
add 6992fd0 [BEAM-2939] Initial SyntheticSDF as Source and add an Synthetic pipeline to sdf test (#8338)
add 72a465e [BEAM-7137] encode header to bytes when writing to file at apache_beam.io.textio.WriteToText
add c01b012 Merge pull request #8452 from lazylynx/writetotext-header-encode
add d4ebd2b [BEAM-6966] Spark portable runner: translate READ
add 98453a0 Merge pull request #8493: [BEAM-6966] Spark portable runner: translate READ
add c211939 fix format string errors with errors package.
add 3cf3a47 Update Go protos.
add 9488352 Merge pull request #8487: Update Go protos + minor fix.
add df7058a [BEAM-7102] Adding `jar_packages` experiment option for Python SDK
add 207ac33 Merge pull request #8340: [BEAM-7102] Adding `jar_packages` experiment option for Python SDK
add 22d158e Refine Spark ValidatesRunner exclusions
add fb2aaf9 Merge pull request #8490: Refine Spark ValidatesRunner exclusions
add 2b45dd0 Fix non-deterministic row access
add bc1a5e2 Merge pull request #8474: [BEAM-7210] Fix non-deterministic row access
add af16837 [BEAM-7227] Instantiate PipelineRunner from options to support other runners
add 00cd112 Merge pull request #8502: [BEAM-7227] Instantiate PipelineRunner from options to support other runners
add c4d7b3a Categorize missing unbounded NeedsRunner tests in sdks/java/core
add 54f89e1 Merge pull request #8489: Categorize missing unbounded NeedsRunner tests in sdks/java/core
add 9192ac9 Adding PyDoc to CombiningValueStateSpec (#8477)
add f93fd69 Fix EXCEPT DISTINCT behavior
add cee37cb Merge pull request #8447: [BEAM-7194] EXCEPT DISTINCT behavior when right set contains a value is incorrect
add cf767bb [BEAM-7066] re-add python 3.6 and 3.7 precommit test suites
add cbc2618 Merge pull request #8500 from Juta/it-tests
add b19c7a1 Skipping fileio tests on windows
add 938c94e Merge pull request #8509 from pabloem/win-deact
add b330cde Add UsesSchema category to Schema transform tests
add fb88549 Merge pull request #8507: Add UsesSchema category to Schema transform tests
add 9857bfc Minor fixes related to NeedsRunner/ValidatesRunner tests
add db34591 Categorize TestStreamsTest tests that use processing time
add 04eb435 Merge pull request #8508: Refines categories to use `UsesTestStreamWithProcessingTime` in `TestStreamTests`
add ffa5632 [BEAM-7216] reinstate checks for Kafka client methods
add 67b49ce Merge pull request #8503: [BEAM-7216] reinstate checks for Kafka client methods
add d52d720 [BEAM-5197] Guard access to test timer service by a lock
add da4a225 Merge pull request #8504: [BEAM-5197] Guard access to test timer service by a lock
add c36913e Fix project path in beam_PerformanceTests_Python35
add bc47d72 Merge pull request #8510 from markflyhigh/fix-py-perf
add 05e78f8 Merge pull request #8485: Fixed bug where fieldIdsAccessed order could be inconsistent with getFieldsAccessed
add 5108958 [BEAM-7229] ParDoLifecycleTest: remove duplicated test methods
add a7c40dd Merge pull request #8506: [BEAM-7229] ParDoLifecycleTest: remove duplicated test methods
add 5948177 Add Structured streaming Spark Runner design document
add c489f77 Merge pull request #8514: Add Structured streaming Spark Runner design document
add 5380e79 [BEAM-6669] Set Dataflow KMS key name (Python)
add a6a6b8e [BEAM-6669] Set Dataflow KMS key name (Python)
add 5733f69 Add Bigtable to supported Beam connectors
add ea6a342 Merge pull request #8492: [BEAM-7231] Add Bigtable to supported Beam connectors
add b37fd2a GrpcWindmillWorker improvements: log error statuses returned from grpc; delay finalization of get data stream batch until just before sending.
add 86bef70 [BEAM-7234] GrpcWindmillServer improvements
add bbf40b0 Fixing fileio tests for windows
add 1382505 Merge pull request #8516 from pabloem/fix-win
add fe73353 [BEAM-6986] TypeHints Py3 Error: Multiple tests fail with a generator StopIteration error on Python 3.7 (#8497)
add 7404a92 [sql] reject aggregation distinct rather than executing distinct as all.
add b93793d Merge pull request #8498: [SQL] reject aggregation distinct rather than executing distinct as all
add 046b2c8 Move test_sdf_synthetic_source to FnApiRunnerTest
add 44e6fec Merge pull request #8517 from boyuanzz/patch_synthetic
add 65f2db0 Revert "Use TFRecord to store intermediate cache results using PCollection's"
add df0543c Merge pull request #8519 from leo-unc/feature/cache_manager_tfrecord
add d2851bf [BEAM-5644] make Planner configurable.
add c25b34a Merge pull request #7745 from amaliujia/rw-configurable_planner
add ad05d63 Fix fastavro on Python 3
add d128f93 fixup: Fix fastavro on Python 3
add 0ac265b Merge pull request #8130: [BEAM-6522] Use fastavro as default Avro implementation for Avro source/sink in Python3
add ac12bef [BEAM-7214] Run Python validates runner tests on Spark
add 2f38c2e Merge pull request #8511: [BEAM-7214] Run Python validates runner tests on Spark
add 90677a1 [SQL] Remove TYPE_ prefix from DataCatalogTableProvider
add 21700d7 Merge pull request #8520 from akedin/remove-type-prefix
add c26bcc1 [BEAM-6892] Adding support for side inputs on table & schema. Also adding additional params for BQ. (#8102)
add 6d9214f Adding Python samples to the Timely (and stateful) Processing post. (#8448)
add 9cd0d56 [SQL][Fix] Fix DataCatalog MAP type
add 52481a8 Merge pull request #8524 from akedin/fix-datacatalog-type-map
add 2548fc2 [BEAM-7241] Bump the version of apitools.
add 8387077 Merge pull request #8525 from tvalentyn/patch-48
add 6a45d90 [SQL] Upgrade DataCatalog client to 0.4.0-alpha
add 8e14f2f Merge pull request #8527 from akedin/upgrade-dc-040
add 8135e39 [SQL] Refactor BeamSqlEnv
add 79a4637 Merge pull request #8523 from akedin/refactor-sqlenv
add 1d85f9f fix splitting into filters and buckets
add 422b9ef tidy up println
add bfd88c8 Merge pull request #8359 from romanvanderkrogt/BEAM-7081
add 80e2cef [BEAM-7197] ensure DoFn is torn down after exception thrown in lifecycle method
add 7cdda3c Merge pull request #8495: [BEAM-7197] ensure DoFn is torn down after exception thrown in lifecycle method
add da83991 [BEAM-7239] Add withDataSourceProviderFn() to JdbcIO
add 10ae28e [BEAM-7041] Refactor pool creation to be done via composition
add e7f7399 Merge pull request #8521: [BEAM-7239] Add withDataSourceProviderFn() to JdbcIO
add a2900ac [BEAM-7008] adding UTF8 String coder to Java SDK ModelCoder
add 0edab50 Merge pull request #8228: [BEAM-7008] adding UTF8 String coder to Java SDK ModelCoders
add 5451256 [BEAM-7028] Add Combine Java load tests
add d537bd0 Merge pull request #8360: [BEAM-7028] Add Combine Java load tests
add ea63ffc Loosen up dependency specifications
add ceb6809 Add upper bounds.
add c340e1b Merge remote-tracking branch 'origin/master'
add 1ffd20f Merge branch 'master' into master
add 8cd1cd4 Merge pull request #7851: [BEAM-6702] Loosen up dependency specifications
add e98a3a6 [BEAM-7134] Spark portable runner: executable stage sinks should have unique names
add c4c5aab Merge pull request #8526: [BEAM-7134] Spark portable runner: executable stage sinks should have…
add 20badac [BEAM-7173] Avoiding schema autodetection by default in WriteToBigQuery (#8473)
add 814e77c [BEAM-7238] Make sfl4j-simple a runtime only dependency
add d8584df [BEAM-7238] Make sfl4j-jdk14 a runtime only dependency
add 05d50bd Merge pull request #8515: [BEAM-7238] Make sfl4j bindings runtimeOnly/testRuntimeOnly
add c9bad5d Changes to SDF API to use DoFn Params (#8430)
add 92517ca [sql] fix non return bug.
add 2e3f921 Merge pull request #8532 from amaliujia/rw-fix_join_bug
add 98436d6 Modify StreamingDataflowWorker commit loop to only create a commit stream once there is a commit.
add 54c75de [BEAM-7235] StreamingDataflowWorker creates commit stream only when commit available
new 474c0dc Add an empty spark-structured-streaming runner project targeting spark 2.4.0
new 56ad12e Fix missing dep
new 4e8f89e Add SparkPipelineOptions
new 721fa2d Start pipeline translation
new e837be2 Add global pipeline translation structure
new 4ecaa74 Add nodes translators structure
new 73d37f2 Wire node translators with pipeline translator
new 72e7fb9 Renames: better differenciate pipeline translator for transform translator
new 77286bd Organise methods in PipelineTranslator
new 2fa095a Initialise BatchTranslationContext
new 0621611 Refactoring: -move batch/streaming common translation visitor and utility methods to PipelineTranslator -rename batch dedicated classes to Batch* to differentiate with their streaming counterparts -Introduce TranslationContext for common batch/streaming components
new b8e9f7a Make transform translation clearer: renaming, comments
new 1cd8dab Improve javadocs
new cc80092 Move SparkTransformOverrides to correct package
new e4a647e Move common translation context components to superclass
new 22c8338 apply spotless for e-formatting
new 9351397 Make codestyle and firebug happy
new dce77db Add TODOs
new c084b60 Post-pone batch qualifier in all classes names for readability
new e4a334b Add precise TODO for multiple TransformTranslator per transform URN
new 27c7fd8 Added SparkRunnerRegistrar
new aa1234e Add basic pipeline execution. Refactor translatePipeline() to return the translationContext on which we can run startPipeline()
new 073cf1e Create PCollections manipulation methods
new 356a830 Create Datasets manipulation methods
new 9c3b75a Add Flatten transformation translator
new e5b3b64 Add primitive GroupByKeyTranslatorBatch implementation
new df0a396 Use Iterators.transform() to return Iterable
new 9e33e38 Implement read transform
new 261bb7c update TODO
new 2d8db45 Apply spotless
new 8bdcd4c start source instanciation
new 50088ea Improve exception flow
new 90c92af Improve type enforcement in ReadSourceTranslator
new f3d50d4 Experiment over using spark Catalog to pass in Beam Source through spark Table
new 590fe0a Add source mocks
new 89d910e fix mock, wire mock in translators and create a main test.
new 6724b88 Use raw WindowedValue so that spark Encoders could work (temporary)
new 4b40103 clean deps
new f0211b5 Move DatasetSourceMock to proper batch mode
new 37b0a2a Run pipeline in batch mode or in streaming mode
new c706109 Split batch and streaming sources and translators
new 46c60cd Use raw Encoder<WindowedValue> also in regular ReadSourceTranslatorBatch
new d56ef09 Cleaning
new 78db97c Add ReadSourceTranslatorStreaming
new c9c2b3b Move Source and translator mocks to a mock package.
new 3c7688f Pass Beam Source and PipelineOptions to the spark DataSource as serialized strings
new 2707178 Refactor DatasetSource fields
new fc8d7f4 Wire real SourceTransform and not mock and update the test
new 90d9426 Add missing 0-arg public constructor
new b048aa7 Apply spotless
new a3f5ca1 Use new PipelineOptionsSerializationUtils
new 5da4168 Fix checkstyle
new 6f33718 Add a dummy schema for reader
new 92c4cf2 Add empty 0-arg constructor for mock source
new 0518568 Clean
new 8096514 Checkstyle and Findbugs
new 79640b0 Refactor SourceTest to a UTest instaed of a main
new bed6a02 Deactivate deps resolution forcing that prevent using correct spark transitive dep
new a83aa58 Fix pipeline triggering: use a spark action instead of writing the dataset
new fcf47dd improve readability of options passing to the source
new ed569f5 Cleaning unneeded fields in DatasetReader
new 5835df2 Fix serialization issues
new e97b5be Add SerializationDebugger
new fd63639 Fix SerializationDebugger
new cc0ab21 Add serialization test
new 53b9370 Move SourceTest to same package as tested class
new ac00f29 Fix SourceTest
new 8a4b4af Simplify beam reader creation as it created once the source as already been partitioned
new 59777ba Put all transform translators Serializable
new d208890 Enable test mode
new b3f018d Enable gradle build scan
new 2255618 Add flatten test
new df5409e First attempt for ParDo primitive implementation
new 253e350 Serialize windowedValue to byte[] in source to be able to specify a binary dataset schema and deserialize windowedValue from Row to get a dataset<WindowedValue>
new 418c61c Comment schema choices
new 63053ec Fix errorprone
new 5e185b3 Fix testMode output to comply with new binary schema
new f70bb56 Cleaning
new eb76dce Remove bundleSize parameter and always use spark default parallelism
new c89ac48 Fix split bug
new f896c3b Cleaning
new e2fca21 Add ParDoTest
new 92b991a Address minor review notes
new 45ef1af Clean
new 562ebfa Add GroupByKeyTest
new 62d4f5d Add comments and TODO to GroupByKeyTranslatorBatch
new 720681f Fix type checking with Encoder of WindowedValue<T>
new 1faf89b Port latest changes of ReadSourceTranslatorBatch to ReadSourceTranslatorStreaming
new 8b1b17e Remove no more needed putDatasetRaw
new 09a5ba2 Add ComplexSourceTest
new 76fd11c Fail in case of having SideInouts or State/Timers
new 67c7d19 Fix Encoders: create an Encoder for every manipulated type to avoid Spark fallback to genericRowWithSchema and cast to allow having Encoders with generic types such as WindowedValue<T> and get the type checking back
new 7951f7b Apply spotless
new e730fa8 Fixed Javadoc error
new 112704f Rename SparkSideInputReader class
new e8d6b77 Rename pruneOutput() to pruneOutputFilteredByTag()
new c9ad8d9 Don't use deprecated sideInput.getWindowingStrategyInternal()
new 6abc251 Simplify logic of ParDo translator
new c5b4593 Fix kryo issue in GBK translator with a workaround
new 07852b1 Rename SparkOutputManager for consistency
new 7f22722 Fix for test elements container in GroupByKeyTest
new 01cfa11 Added "testTwoPardoInRow"
new 8e7fc66 Add a test for the most simple possible Combine
new 205f96d Rename SparkDoFnFilterFunction to DoFnFilterFunction for consistency
new a6e4b37 Generalize the use of SerializablePipelineOptions in place of (not serializable) PipelineOptions
new f361a7e Fix getSideInputs
new b37ac03 Extract binary schema creation in a helper class
new c4e008d First version of combinePerKey
new 11a87c8 Improve type checking of Tuple2 encoder
new 9ed9d5b Introduce WindowingHelpers (and helpers package) and use it in Pardo, GBK and CombinePerKey
new c829100 Fix combiner using KV as input, use binary encoders in place of accumulatorEncoder and outputEncoder, use helpers, spotless
new 8889f71 Add combinePerKey and CombineGlobally tests
new 86c421b Introduce RowHelpers
new d1e7eea Add CombineGlobally translation to avoid translating Combine.perKey as a composite transform based on Combine.PerKey (which uses low perf GBK)
new 42414191 Cleaning
new a936327 Get back to classes in translators resolution because URNs cannot translate Combine.Globally
new 7b3dfae Fix various type checking issues in Combine.Globally
new 06b4632 Update test with Long
new ae07b1a Fix combine. For unknown reason GenericRowWithSchema is used as input of combine so extract its content to be able to proceed
new 32ae8dc Use more generic Row instead of GenericRowWithSchema
new c844fda Add explanation about receiving a Row as input in the combiner
new a4591c4 Fix encoder bug in combinePerkey
new 056359b [TO UPGRADE WITH THE 2 SPARK RUNNERS BEFORE MERGE] Change de wordcount build to test on new spark runner
new fcb2746 Cleaning
new bbb881a Implement WindowAssignTranslatorBatch
new dccf33a Implement WindowAssignTest
new f7f0da9 Fix javadoc
new 9493bca Fix build wordcount: to squash
new 40ab0cb Temporary fix remove fail on warnings
new 0e1ac77 Added SideInput support
new 7d87b0a Fix CheckStyle violations
new 303b6a6 Don't use Reshuffle translation
new b0a4272 Added using CachedSideInputReader
new 3a83862 Added TODO comment for ReshuffleTranslatorBatch
new cdfffaf And unchecked warning suppression
new 2edf26e Add streaming source initialisation
new 5e857be Start streaming source
new 24f45d9 Add a TODO on spark output modes
new a0f25eb Add transformators registry in PipelineTranslatorStreaming
new 0204d52 Add source streaming test
new b27b982 Specify checkpointLocation at the pipeline start
new 482bc7f Clean unneeded 0 arg constructor in batch source
new f1340fd Remove unneeded 0 arg constructor in batch source
new 8a0def6 Clean streaming source
new 33b310cc Fllow up on offsets for streaming source
new 4ba0af3 Deal with checkpoint and offset based read
new be0e9c3 Fix invalid code style with spotless
new ec633d1 Fix spotbogs warnings
new 385f1a6 Disable never ending test SimpleSourceTest.testUnboundedSource
new 17aa0f7 Fix access level issues, typos and modernize code to Java 8 style
new e76b88b Merge Spark Structured Streaming runner into main Spark module
new dc756a6 Remove spark-structured-streaming module
new f12ec6b Rename Runner to SparkStructuredStreamingRunner
new 2605bae Remove specific PipelineOotions and Registrar for Structured Streaming Runner
new e1c8e96 Fix non-vendored imports from Spark Streaming Runner classes
new 0262dab Add doFnSchemaInformation to ParDo batch translation
new 40f2135 Fix spotless issues after rebase
new 182f2d9 Fix logging levels in Spark Structured Streaming translation
new 1081207 Add SparkStructuredStreamingPipelineOptions and SparkCommonPipelineOptions - SparkStructuredStreamingPipelineOptions was added to have the new runner rely only on its specific options.
new e542bd3 Rename SparkPipelineResult to SparkStructuredStreamingPipelineResult This is done to avoid an eventual collision with the one in SparkRunner. However this cannot happen at this moment because it is package private, so it is also done for consistency.
new 496d660 Use PAssert in Spark Structured Streaming transform tests
new 4786cf1 Ignore spark offsets (cf javadoc)
new f2cac46 implement source.stop
new bd06201 Update javadoc
new f2f953a Apply Spotless
new 85a9589 Add Batch Validates Runner tests for Structured Streaming Runner
new d08f110 Limit the number of partitions to make tests go 300% faster
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O (241453b)
            \
             N -- N -- N   refs/heads/spark-runner_structured-streaming (d08f110)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
The 167 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
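The force-push mechanics described above can be reproduced in a throwaway
local repository. The sketch below is illustrative only: the repository
and branch names (origin.git, work, feature) and the commit subjects
(B, O, N) are hypothetical stand-ins, not the actual Beam repository or
its commits.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A bare "remote" and a working clone of it.
git init -q --bare origin.git
git clone -q origin.git work
cd work

# Build the old history: common base B, then old revision O, and publish it.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "B (common base)"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "O (old revision)"
git push -q origin HEAD:refs/heads/feature

# Rewrite history: drop O, put new revision N on top of B, and force-push.
# O is now unreachable from the branch ("discard"); it would only survive
# ("omit") if some other reference still pointed at it.
git reset -q --hard HEAD~1
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "N (new revision)"
git push -q --force origin HEAD:refs/heads/feature

# The branch now contains B and N, but not O.
git log --format=%s origin/feature
```

After the force-push the notification hook sees exactly the situation in
the diagram: the ref moved from the tip of the O line to the tip of the
N line, with B as the common base.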
Summary of changes:
.gitignore | 1 +
.test-infra/jenkins/README.md | 3 +-
.test-infra/jenkins/job_Inventory.groovy | 4 +-
.../jenkins/job_LoadTests_Combine_Java.groovy | 216 +
...vy => job_PerformanceTests_HadoopFormat.groovy} | 14 +-
.../jenkins/job_PerformanceTests_Python.groovy | 1 +
...groovy => job_Performancetests_Python35.groovy} | 13 +-
.test-infra/jenkins/job_PostCommit_Java.groovy | 2 +-
.../job_PostCommit_Java_PortabilityApi.groovy | 2 +-
...a.groovy => job_PostCommit_Website_Test.groovy} | 23 +-
.../jenkins/job_Release_NightlySnapshot.groovy | 7 +-
.../job_beam_PerformanceTests_Analysis.groovy | 2 +-
build.gradle | 11 +-
buildSrc/build.gradle | 1 +
.../org/apache/beam/gradle/BeamModulePlugin.groovy | 68 +-
.../beam/examples/kotlin/DebuggingWordCount.kt | 1 -
.../beam/examples/kotlin/WindowedWordCount.kt | 8 +-
.../org/apache/beam/examples/kotlin/WordCount.kt | 6 +-
.../kotlin/common/ExampleBigQueryTableOptions.kt | 4 +-
.../ExamplePubsubTopicAndSubscriptionOptions.kt | 4 +-
.../kotlin/common/ExamplePubsubTopicOptions.kt | 5 +-
.../beam/examples/kotlin/common/ExampleUtils.kt | 94 +-
.../kotlin/common/WriteOneFilePerWindow.kt | 15 +-
examples/notebooks/README.md | 66 +
.../notebooks/get-started/try-apache-beam-go.ipynb | 585 ++
.../get-started/try-apache-beam-java.ipynb | 1099 +++
.../notebooks/get-started/try-apache-beam-py.ipynb | 410 +
examples/notebooks/patch.py | 96 +
.../fn-execution/src/main/proto/beam_fn_api.proto | 5 +-
.../beam/model/fnexecution/v1/standard_coders.yaml | 24 +
.../src/main/proto/beam_expansion_api.proto | 2 +-
model/pipeline/src/main/proto/metrics.proto | 35 +-
release/src/main/groovy/TestScripts.groovy | 2 +-
.../src/main/scripts/build_release_candidate.sh | 3 +-
.../src/main/scripts/sign_hash_python_wheels.sh | 8 +-
.../runners/core/construction/Environments.java | 9 +
.../core/construction/ModelCoderRegistrar.java | 3 +
.../runners/core/construction/ModelCoders.java | 2 +
.../core/construction/PTransformTranslation.java | 4 +
.../core/construction/SchemaTranslation.java | 5 +-
.../construction/SplittableParDoNaiveBounded.java | 2 +-
.../construction/expansion/ExpansionServer.java | 9 +-
.../graph/GreedyPCollectionFusers.java | 18 +
.../core/construction/graph/QueryablePipeline.java | 6 +-
.../core/construction/CoderTranslationTest.java | 1 +
.../runners/core/construction/CommonCoderTest.java | 9 +-
.../expansion/ExpansionServerTest.java | 48 +
runners/core-java/build.gradle | 2 +-
...TimeBoundedSplittableProcessElementInvoker.java | 2 +-
.../apache/beam/runners/core/SimpleDoFnRunner.java | 13 +-
.../beam/runners/core/metrics/MetricUrns.java | 41 -
.../runners/core/metrics/MetricsContainerImpl.java | 87 +-
.../core/metrics/MonitoringInfoConstants.java | 7 +-
.../core/metrics/MonitoringInfoMetricName.java | 43 +-
.../core/metrics/SimpleMonitoringInfoBuilder.java | 50 +-
.../core/metrics/MetricsContainerImplTest.java | 27 +-
.../core/metrics/MetricsContainerStepMapTest.java | 9 +-
.../core/metrics/MonitoringInfoMetricNameTest.java | 29 +-
.../core/metrics/MonitoringInfoTestUtil.java | 2 +-
.../metrics/SimpleMonitoringInfoBuilderTest.java | 40 +-
.../core/metrics/SimpleStateRegistryTest.java | 6 +-
.../metrics/SpecMonitoringInfoValidatorTest.java | 6 +-
.../apache/beam/runners/direct/DirectRunner.java | 18 +-
.../beam/runners/direct/DoFnLifecycleManager.java | 2 +-
.../beam/runners/direct/DirectRunnerTest.java | 51 +
runners/flink/flink_runner.gradle | 8 +-
.../flink/FlinkPipelineExecutionEnvironment.java | 9 +-
.../flink/FlinkStreamingPipelineTranslator.java | 2 +-
.../flink/FlinkStreamingTransformTranslators.java | 38 +
.../translation/functions/FlinkDoFnFunction.java | 6 +-
.../functions/FlinkStatefulDoFnFunction.java | 6 +-
.../translation/types/CoderTypeSerializer.java | 2 +
.../runners/flink/translation/utils/NoopLock.java | 68 +
.../wrappers/streaming/DoFnOperator.java | 200 +-
.../streaming/ExecutableStageDoFnOperator.java | 83 +-
.../wrappers/streaming/FlinkKeyUtils.java | 7 +-
.../wrappers/streaming/io/TestStreamSource.java | 80 +
.../streaming/state/FlinkSplitStateInternals.java | 228 -
.../FlinkPipelineExecutionEnvironmentTest.java | 48 +
.../flink/metrics/FlinkMetricContainerTest.java | 68 +-
.../streaming/FlinkSplitStateInternalsTest.java | 130 -
.../wrappers/streaming/DoFnOperatorTest.java | 87 +-
.../streaming/ExecutableStageDoFnOperatorTest.java | 179 +-
.../wrappers/streaming/WindowDoFnOperatorTest.java | 2 +-
.../streaming/io/UnboundedSourceWrapperTest.java | 18 +-
runners/google-cloud-dataflow-java/build.gradle | 2 +-
.../dataflow/DataflowPipelineTranslator.java | 5 +-
.../beam/runners/dataflow/DataflowRunner.java | 2 +-
.../dataflow/worker/StreamingDataflowWorker.java | 13 +-
...ntMonitoringInfoToCounterUpdateTransformer.java | 2 +-
...piMonitoringInfoToCounterUpdateTransformer.java | 21 +-
...ecMonitoringInfoToCounterUpdateTransformer.java | 2 +-
...ntMonitoringInfoToCounterUpdateTransformer.java | 2 +-
...onMonitoringInfoToCounterUpdateTransformer.java | 31 +-
...erMonitoringInfoToCounterUpdateTransformer.java | 27 +-
.../worker/windmill/GrpcWindmillServer.java | 37 +-
.../fn/control/BeamFnMapTaskExecutorTest.java | 6 +-
...nitoringInfoToCounterUpdateTransformerTest.java | 49 +-
.../worker/fn/control/TimerReceiverTest.java | 4 +-
...nitoringInfoToCounterUpdateTransformerTest.java | 8 +-
...nitoringInfoToCounterUpdateTransformerTest.java | 8 +-
runners/java-fn-execution/build.gradle | 2 +-
.../fnexecution/control/RemoteExecutionTest.java | 127 +-
runners/spark/build.gradle | 21 +-
runners/spark/job-server/build.gradle | 6 +-
.../runners/spark/SparkCommonPipelineOptions.java | 3 +-
.../apache/beam/runners/spark/SparkJobInvoker.java | 14 +-
.../beam/runners/spark/SparkJobServerDriver.java | 23 +-
.../beam/runners/spark/SparkPipelineOptions.java | 17 +
.../beam/runners/spark/SparkPipelineRunner.java | 15 +
.../org/apache/beam/runners/spark/SparkRunner.java | 15 +-
.../spark/aggregators/AggregatorsAccumulator.java | 19 +-
.../aggregators/NamedAggregatorsAccumulator.java | 63 +
.../apache/beam/runners/spark/io/SourceRDD.java | 7 +-
.../runners/spark/io/SparkUnboundedSource.java | 4 +-
.../runners/spark/metrics/MetricsAccumulator.java | 23 +-
.../spark/metrics/MetricsAccumulatorParam.java | 42 -
.../MetricsContainerStepMapAccumulator.java | 65 +
.../spark/translation/DoFnRunnerWithMetrics.java | 9 +-
.../spark/translation/MultiDoFnFunction.java | 14 +-
.../SparkBatchPortablePipelineTranslator.java | 43 +-
.../translation/SparkExecutableStageFunction.java | 11 +-
.../SparkGroupAlsoByWindowViaOutputBufferFn.java | 5 +-
.../spark/translation/SparkProcessContext.java | 22 +-
.../spark/translation/SparkTranslationContext.java | 6 +
.../spark/translation/TransformTranslator.java | 9 +-
.../streaming/StreamingTransformTranslator.java | 5 +-
.../spark/metrics/SparkMetricsPusherTest.java | 4 +-
.../SparkExecutableStageFunctionTest.java | 6 +-
.../translation/streaming/CreateStreamTest.java | 52 +
sdks/go/README.md | 2 +-
sdks/go/pkg/beam/artifact/gcsproxy/retrieval.go | 30 +-
sdks/go/pkg/beam/artifact/gcsproxy/staging.go | 25 +-
sdks/go/pkg/beam/artifact/materialize.go | 18 +-
sdks/go/pkg/beam/artifact/server_test.go | 26 +-
sdks/go/pkg/beam/artifact/stage.go | 16 +-
sdks/go/pkg/beam/coder.go | 5 +-
sdks/go/pkg/beam/combine.go | 16 +-
sdks/go/pkg/beam/core/graph/fn.go | 5 +-
sdks/go/pkg/beam/core/util/hooks/hooks_test.go | 20 +-
sdks/go/pkg/beam/create.go | 15 +-
sdks/go/pkg/beam/external.go | 7 +-
sdks/go/pkg/beam/flatten.go | 9 +-
sdks/go/pkg/beam/gbk.go | 13 +-
sdks/go/pkg/beam/internal/errors/errors.go | 188 +
sdks/go/pkg/beam/internal/errors/errors_test.go | 203 +
sdks/go/pkg/beam/io/bigqueryio/bigquery.go | 13 +-
sdks/go/pkg/beam/io/databaseio/database.go | 23 +-
sdks/go/pkg/beam/io/databaseio/mapper.go | 5 +-
sdks/go/pkg/beam/io/databaseio/util.go | 5 +-
sdks/go/pkg/beam/io/databaseio/writer.go | 8 +-
sdks/go/pkg/beam/io/datastoreio/datastore.go | 4 +-
sdks/go/pkg/beam/io/filesystem/filesystem.go | 4 +-
sdks/go/pkg/beam/io/filesystem/gcs/gcs.go | 3 +-
.../beam/model/fnexecution_v1/beam_fn_api.pb.go | 3231 +++-----
.../model/fnexecution_v1/beam_provision_api.pb.go | 138 +-
sdks/go/pkg/beam/model/gen.go | 10 +-
.../model/jobmanagement_v1/beam_artifact_api.pb.go | 198 +-
.../jobmanagement_v1/beam_expansion_api.pb.go | 252 +
.../beam/model/jobmanagement_v1/beam_job_api.pb.go | 683 +-
.../beam/model/pipeline_v1/beam_runner_api.pb.go | 8018 +++++++++++---------
sdks/go/pkg/beam/model/pipeline_v1/endpoints.pb.go | 40 +-
.../model/pipeline_v1/external_transforms.pb.go | 140 +
sdks/go/pkg/beam/model/pipeline_v1/metrics.pb.go | 1853 +++++
.../model/pipeline_v1/standard_window_fns.pb.go | 70 +-
sdks/go/pkg/beam/options/jobopts/options.go | 3 +-
sdks/go/pkg/beam/pardo.go | 11 +-
sdks/go/pkg/beam/partition.go | 7 +-
sdks/go/pkg/beam/pcollection.go | 5 +-
sdks/go/pkg/beam/provision/provision.go | 6 +-
sdks/go/pkg/beam/runners/dataflow/dataflow.go | 12 +-
.../pkg/beam/runners/dataflow/dataflowlib/job.go | 8 +-
.../pkg/beam/runners/dataflow/dataflowlib/stage.go | 6 +-
.../beam/runners/dataflow/dataflowlib/translate.go | 23 +-
sdks/go/pkg/beam/runners/direct/direct.go | 8 +-
sdks/go/pkg/beam/runners/direct/gbk.go | 5 +-
sdks/go/pkg/beam/runners/dot/dot.go | 2 +-
sdks/go/pkg/beam/runners/session/session.go | 15 +-
.../beam/runners/universal/runnerlib/compile.go | 5 +-
.../beam/runners/universal/runnerlib/execute.go | 4 +-
.../go/pkg/beam/runners/universal/runnerlib/job.go | 13 +-
.../pkg/beam/runners/universal/runnerlib/stage.go | 8 +-
sdks/go/pkg/beam/runners/universal/universal.go | 3 +-
sdks/go/pkg/beam/testing/passert/hash.go | 3 +-
sdks/go/pkg/beam/testing/passert/passert.go | 9 +-
sdks/go/pkg/beam/testing/passert/sum.go | 3 +-
sdks/go/pkg/beam/transforms/top/top.go | 13 +-
sdks/go/pkg/beam/util/gcsx/gcs.go | 5 +-
sdks/go/pkg/beam/util/grpcx/dial.go | 4 +-
sdks/go/pkg/beam/util/grpcx/metadata.go | 7 +-
sdks/go/pkg/beam/util/pubsubx/pubsub.go | 3 +-
sdks/go/pkg/beam/util/starcgenx/starcgenx.go | 5 +-
sdks/go/pkg/beam/util/syscallx/syscall.go | 2 +-
sdks/go/pkg/beam/validate.go | 11 +-
sdks/go/pkg/beam/windowing.go | 7 +-
.../main/resources/docker/file/openjdk8/Dockerfile | 49 -
.../docker/file/openjdk8/docker-entrypoint.sh | 24 -
.../main/resources/docker/git/openjdk8/Dockerfile | 53 -
.../docker/git/openjdk8/docker-entrypoint.sh | 22 -
.../resources/docker/release/python2/Dockerfile | 21 -
sdks/java/container/Dockerfile | 5 +
sdks/java/container/Dockerfile-java11 | 5 +
sdks/java/container/boot.go | 2 +
sdks/java/container/build.gradle | 4 +
sdks/java/core/build.gradle | 2 +-
.../java/org/apache/beam/sdk/coders/RowCoder.java | 43 +-
.../apache/beam/sdk/coders/RowCoderGenerator.java | 17 +-
.../main/java/org/apache/beam/sdk/io/AvroIO.java | 390 +-
.../main/java/org/apache/beam/sdk/io/FileIO.java | 18 +-
.../main/java/org/apache/beam/sdk/io/TextIO.java | 54 +-
.../org/apache/beam/sdk/io/range/ByteKeyRange.java | 12 +-
.../beam/sdk/schemas/FieldAccessDescriptor.java | 36 +-
.../org/apache/beam/sdk/schemas/SchemaCoder.java | 5 +
.../apache/beam/sdk/schemas/SchemaRegistry.java | 42 +-
.../sdk/schemas/annotations/DefaultSchema.java | 89 +-
.../beam/sdk/schemas/transforms/Convert.java | 77 +-
.../apache/beam/sdk/schemas/transforms/Select.java | 2 +-
.../beam/sdk/schemas/utils/ConvertHelpers.java | 204 +
.../sdk/schemas/utils/StaticSchemaInference.java | 4 +-
.../org/apache/beam/sdk/testing/LargeKeys.java | 10 +-
.../org/apache/beam/sdk/testing/TestStream.java | 88 +
.../apache/beam/sdk/testing/UsesTestStream.java | 2 +-
....java => UsesTestStreamWithProcessingTime.java} | 7 +-
.../org/apache/beam/sdk/transforms/Combine.java | 66 +
.../beam/sdk/transforms/DoFnSchemaInformation.java | 199 +-
.../org/apache/beam/sdk/transforms/DoFnTester.java | 2 +-
.../java/org/apache/beam/sdk/transforms/ParDo.java | 111 +-
.../sdk/transforms/SerializableBiFunction.java} | 17 +-
.../reflect/ByteBuddyDoFnInvokerFactory.java | 34 +-
.../beam/sdk/transforms/reflect/DoFnInvoker.java | 15 +-
.../beam/sdk/transforms/reflect/DoFnInvokers.java | 22 +
.../beam/sdk/transforms/reflect/DoFnSignature.java | 36 +-
.../sdk/transforms/reflect/DoFnSignatures.java | 29 +-
.../main/java/org/apache/beam/sdk/values/Row.java | 8 +-
.../java/org/apache/beam/sdk/io/AvroIOTest.java | 140 +-
.../java/org/apache/beam/sdk/io/FileIOTest.java | 3 +-
.../org/apache/beam/sdk/io/TFRecordIOTest.java | 1 -
.../org/apache/beam/sdk/io/TextIOReadTest.java | 19 +-
.../org/apache/beam/sdk/io/TextIOWriteTest.java | 31 +-
.../org/apache/beam/sdk/io/WriteFilesTest.java | 3 +-
.../beam/sdk/runners/PipelineRunnerTest.java | 10 +-
.../sdk/schemas/FieldAccessDescriptorTest.java | 41 +-
.../beam/sdk/schemas/transforms/CastTest.java | 5 +
.../sdk/schemas/transforms/CastValidatorTest.java | 10 +-
.../beam/sdk/schemas/transforms/CoGroupTest.java | 5 +
.../beam/sdk/schemas/transforms/ConvertTest.java | 5 +
.../beam/sdk/schemas/transforms/FilterTest.java | 5 +
.../beam/sdk/schemas/transforms/GroupTest.java | 5 +
.../beam/sdk/schemas/transforms/JoinTest.java | 5 +
.../beam/sdk/schemas/transforms/SelectTest.java | 5 +
.../beam/sdk/schemas/transforms/UnnestTest.java | 5 +
.../apache/beam/sdk/testing/TestStreamTest.java | 26 +-
.../apache/beam/sdk/transforms/CombineTest.java | 32 +
.../apache/beam/sdk/transforms/GroupByKeyTest.java | 4 +-
.../apache/beam/sdk/transforms/JsonToRowTest.java | 2 +
.../beam/sdk/transforms/ParDoLifecycleTest.java | 191 +-
.../beam/sdk/transforms/ParDoSchemaTest.java | 269 +-
.../org/apache/beam/sdk/transforms/ParDoTest.java | 7 +-
.../org/apache/beam/sdk/transforms/ReifyTest.java | 3 +-
.../org/apache/beam/sdk/transforms/WatchTest.java | 17 +-
.../sdk/transforms/reflect/DoFnSignaturesTest.java | 35 +-
.../java/org/apache/beam/sdk/values/RowTest.java | 31 +
.../google-cloud-platform-core/build.gradle | 2 +-
sdks/java/extensions/protobuf/build.gradle | 2 +-
sdks/java/extensions/sql/build.gradle | 2 +-
sdks/java/extensions/sql/datacatalog/build.gradle | 63 +
.../sql/example/BeamSqlDataCatalogExample.java | 103 +
.../sdk/extensions/sql/example}/package-info.java | 9 +-
.../meta/provider/datacatalog/BigQueryUtils.java | 58 +
.../datacatalog/DataCatalogClientAdapter.java | 94 +
.../datacatalog/DataCatalogPipelineOptions.java | 30 +-
.../datacatalog/DataCatalogTableProvider.java | 122 +
.../sql/meta/provider/datacatalog/PubsubUtils.java | 53 +
.../sql/meta/provider/datacatalog/SchemaUtils.java | 96 +
.../sql/meta/provider/datacatalog/TableUtils.java | 59 +
.../meta/provider/datacatalog}/package-info.java | 9 +-
.../apache/beam/sdk/extensions/sql/BeamSqlCli.java | 6 +-
.../beam/sdk/extensions/sql/SqlTransform.java | 31 +-
.../beam/sdk/extensions/sql/impl/BeamSqlEnv.java | 289 +-
.../sql/impl/BeamSqlPipelineOptions.java} | 20 +-
.../sql/impl/BeamSqlPipelineOptionsRegistrar.java | 26 +-
...mQueryPlanner.java => CalciteQueryPlanner.java} | 31 +-
.../sdk/extensions/sql/impl/QueryPlanner.java} | 17 +-
.../sql/impl/SqlConversionException.java} | 18 +-
.../sql/impl/rel/BeamEnumerableConverter.java | 44 +-
.../sdk/extensions/sql/impl/rel/BeamJoinRel.java | 116 +-
.../sdk/extensions/sql/impl/rel/BeamValuesRel.java | 8 +-
.../sql/impl/transform/BeamJoinTransforms.java | 38 +-
.../impl/transform/BeamSetOperatorsTransforms.java | 9 +-
.../transform/agg/AggregationCombineFnAdapter.java | 5 +
.../sql/impl/utils/SerializableRexFieldAccess.java | 55 +
.../sql/impl/utils/SerializableRexInputRef.java | 24 +-
.../sql/impl/utils/SerializableRexNode.java | 50 +
.../extensions/sql/BeamSqlDslAggregationTest.java | 11 +
.../sdk/extensions/sql/impl/BeamSqlEnvTest.java | 22 +-
.../extensions/sql/impl/parser/BeamDDLTest.java | 4 +-
.../sql/impl/rel/BeamEnumerableConverterTest.java | 287 +-
.../impl/rel/BeamJoinRelBoundedVsBoundedTest.java | 119 +
.../extensions/sql/impl/rel/BeamMinusRelTest.java | 16 +-
.../extensions/sql/impl/rel/BeamValuesRelTest.java | 13 +
sdks/java/fn-execution/build.gradle | 2 +-
sdks/java/harness/build.gradle | 7 +-
.../apache/beam/fn/harness/FnApiDoFnRunner.java | 11 +-
.../beam/fn/harness/FnApiDoFnRunnerTest.java | 20 +-
.../data/ElementCountFnDataReceiverTest.java | 4 +-
sdks/java/io/amazon-web-services/build.gradle | 2 +-
sdks/java/io/amqp/build.gradle | 2 +-
sdks/java/io/cassandra/build.gradle | 2 +-
.../apache/beam/sdk/io/cassandra/CassandraIO.java | 98 +-
sdks/java/io/clickhouse/build.gradle | 2 +-
.../elasticsearch-tests-2/build.gradle | 2 +-
.../elasticsearch-tests-5/build.gradle | 2 +-
.../elasticsearch-tests-6/build.gradle | 2 +-
.../elasticsearch-tests-common/build.gradle | 2 +-
.../java/org/apache/beam/sdk/io/avro/AvroIOIT.java | 9 +-
.../java/org/apache/beam/sdk/io/text/TextIOIT.java | 9 +-
sdks/java/io/google-cloud-platform/build.gradle | 2 +-
.../beam/sdk/io/gcp/bigquery/BigQueryIO.java | 2 +-
.../beam/sdk/io/gcp/bigquery/BigQueryUtils.java | 7 +-
.../beam/sdk/io/gcp/spanner/SpannerReadIT.java | 12 +-
.../beam/sdk/io/gcp/spanner/SpannerWriteIT.java | 11 +-
sdks/java/io/hadoop-common/OWNERS | 1 +
sdks/java/io/hadoop-file-system/OWNERS | 1 +
sdks/java/io/hadoop-file-system/build.gradle | 1 +
sdks/java/io/{hbase => hadoop-format}/OWNERS | 0
sdks/java/io/hadoop-format/build.gradle | 3 +-
.../sdk/io/hadoop/format/HadoopFormatIOIT.java | 69 +-
sdks/java/io/hadoop-input-format/OWNERS | 5 -
sdks/java/io/hadoop-input-format/build.gradle | 90 -
.../io/hadoop/inputformat/HadoopInputFormatIO.java | 189 -
.../ConfigurableEmployeeInputFormat.java | 126 -
.../beam/sdk/io/hadoop/inputformat/Employee.java | 87 -
.../io/hadoop/inputformat/EmployeeInputFormat.java | 168 -
.../io/hadoop/inputformat/HIFIOCassandraIT.java | 197 -
.../sdk/io/hadoop/inputformat/HIFIOElasticIT.java | 220 -
.../hadoop/inputformat/HIFIOWithElasticTest.java | 274 -
.../HIFIOWithEmbeddedCassandraTest.java | 255 -
.../sdk/io/hadoop/inputformat/HIFITestOptions.java | 76 -
.../hadoop/inputformat/HadoopInputFormatIOIT.java | 229 -
.../inputformat/HadoopInputFormatIOTest.java | 400 -
.../ReuseObjectsEmployeeInputFormat.java | 173 -
.../io/hadoop/inputformat/TestEmployeeDataSet.java | 79 -
.../io/hadoop/inputformat/TestRowDBWritable.java | 91 -
.../src/test/resources/cassandra.yaml | 1074 ---
sdks/java/io/hcatalog/OWNERS | 1 +
sdks/java/io/jdbc/build.gradle | 2 +-
.../java/org/apache/beam/sdk/io/jdbc/JdbcIO.java | 264 +-
.../org/apache/beam/sdk/io/jdbc/JdbcIOTest.java | 26 +-
sdks/java/io/jms/build.gradle | 2 +-
sdks/java/io/kafka/build.gradle | 3 +-
.../org/apache/beam/sdk/io/kafka/ConsumerSpEL.java | 7 +-
.../java/org/apache/beam/sdk/io/kafka/KafkaIO.java | 93 +-
.../org/apache/beam/sdk/io/kafka/KafkaRecord.java | 2 +-
.../apache/beam/sdk/io/kafka/KafkaRecordCoder.java | 6 +-
.../beam/sdk/io/kafka/KafkaUnboundedReader.java | 2 +-
.../beam/sdk/io/kafka/ProducerRecordCoder.java | 25 +-
.../beam/sdk/io/kafka/KafkaIOExternalTest.java | 121 +-
.../beam/sdk/io/kafka/ProducerRecordCoderTest.java | 44 +-
sdks/java/io/kinesis/build.gradle | 2 +-
sdks/java/io/kudu/build.gradle | 1 +
sdks/java/io/mongodb/build.gradle | 2 +-
.../org/apache/beam/sdk/io/mongodb/MongoDbIO.java | 89 +-
.../apache/beam/sdk/io/mongodb/MongoDbIOTest.java | 26 +-
sdks/java/io/mqtt/build.gradle | 2 +-
sdks/java/io/parquet/build.gradle | 2 +-
sdks/java/io/{redis => rabbitmq}/OWNERS | 0
sdks/java/io/rabbitmq/build.gradle | 5 +-
sdks/java/io/redis/build.gradle | 2 +-
sdks/java/io/solr/build.gradle | 2 +-
sdks/java/io/xml/build.gradle | 2 +-
.../apache/beam/sdk/loadtests/CombineLoadTest.java | 19 +-
sdks/java/testing/nexmark/build.gradle | 4 +-
.../java/org/apache/beam/sdk/nexmark/Main.java | 5 +-
.../apache/beam/sdk/nexmark/NexmarkLauncher.java | 17 +-
.../apache/beam/sdk/nexmark/queries/QueryTest.java | 2 +
.../sdk/nexmark/queries/sql/SqlQuery5Test.java | 2 +
sdks/python/apache_beam/coders/coders.py | 37 +-
.../apache_beam/coders/coders_test_common.py | 1 +
.../apache_beam/coders/standard_coders_test.py | 6 +-
.../examples/complete/game/game_stats_it_test.py | 6 +
.../examples/complete/game/leader_board_it_test.py | 6 +
.../apache_beam/examples/fastavro_it_test.py | 19 +-
.../apache_beam/examples/snippets/snippets.py | 2 +-
.../examples/streaming_wordcount_it_test.py | 6 +
.../apache_beam/examples/wordcount_it_test.py | 6 +
sdks/python/apache_beam/io/avroio.py | 54 +-
sdks/python/apache_beam/io/avroio_test.py | 187 +-
.../apache_beam/io/external/generate_sequence.py | 20 +-
sdks/python/apache_beam/io/external/kafka.py | 171 +-
sdks/python/apache_beam/io/fileio.py | 5 +-
sdks/python/apache_beam/io/fileio_test.py | 31 +-
sdks/python/apache_beam/io/filesystem.py | 4 +-
.../io/gcp/big_query_query_to_table_it_test.py | 45 +-
.../io/gcp/big_query_query_to_table_pipeline.py | 28 +-
sdks/python/apache_beam/io/gcp/bigquery.py | 222 +-
.../apache_beam/io/gcp/bigquery_file_loads.py | 55 +-
.../apache_beam/io/gcp/bigquery_file_loads_test.py | 27 +-
sdks/python/apache_beam/io/gcp/bigquery_test.py | 41 +-
sdks/python/apache_beam/io/gcp/bigquery_tools.py | 52 +-
.../io/gcp/datastore/v1/adaptive_throttler_test.py | 6 -
.../apache_beam/io/gcp/datastore/v1/datastoreio.py | 62 +-
.../io/gcp/datastore/v1/datastoreio_test.py | 104 +-
.../apache_beam/io/gcp/datastore/v1/helper_test.py | 21 +-
.../io/gcp/datastore/v1/query_splitter_test.py | 137 +-
.../python/apache_beam/io/gcp/datastore/v1/util.py | 42 +
.../apache_beam/io/gcp/datastore/v1/util_test.py | 41 +-
.../models => io/gcp/datastore/v1new}/__init__.py | 0
.../v1new}/datastore_write_it_pipeline.py | 100 +-
.../v1new}/datastore_write_it_test.py | 15 +-
.../io/gcp/datastore/v1new/datastoreio.py | 433 ++
.../io/gcp/datastore/v1new/datastoreio_test.py | 297 +
.../apache_beam/io/gcp/datastore/v1new/helper.py | 125 +
.../io/gcp/datastore/v1new/helper_test.py | 85 +
.../io/gcp/datastore/v1new/query_splitter.py | 231 +
.../io/gcp/datastore/v1new/query_splitter_test.py | 174 +
.../apache_beam/io/gcp/datastore/v1new/types.py | 213 +
.../io/gcp/datastore/v1new/types_test.py | 147 +
.../io/gcp/datastore_write_it_pipeline.py | 2 +-
.../apache_beam/io/gcp/datastore_write_it_test.py | 8 +-
.../apache_beam/io/gcp/gcsio_integration_test.py | 4 +-
.../apache_beam/io/gcp/pubsub_integration_test.py | 10 +
.../apache_beam/io/gcp/tests/bigquery_matcher.py | 69 +-
.../io/gcp/tests/bigquery_matcher_test.py | 39 +
sdks/python/apache_beam/io/textio.py | 2 +-
sdks/python/apache_beam/io/textio_test.py | 6 +-
sdks/python/apache_beam/io/tfrecordio.py | 5 +-
sdks/python/apache_beam/metrics/execution.py | 14 +-
.../python/apache_beam/metrics/monitoring_infos.py | 150 +-
.../apache_beam/metrics/monitoring_infos_test.py | 18 +-
sdks/python/apache_beam/pipeline.py | 12 +-
sdks/python/apache_beam/portability/common_urns.py | 4 +
sdks/python/apache_beam/runners/common.py | 8 +-
.../dataflow_exercise_metrics_pipeline_test.py | 6 +
.../runners/dataflow/dataflow_runner.py | 100 +-
.../runners/dataflow/internal/apiclient.py | 4 +-
.../clients/dataflow/dataflow_v1b3_client.py | 2 +-
.../clients/dataflow/dataflow_v1b3_messages.py | 13 +-
.../apache_beam/runners/dataflow/internal/names.py | 2 +-
.../apache_beam/runners/direct/direct_runner.py | 3 +-
.../runners/direct/evaluation_context.py | 10 +-
.../runners/direct/sdf_direct_runner_test.py | 7 +-
.../runners/direct/watermark_manager.py | 8 +
.../python/apache_beam/runners/pipeline_context.py | 3 +-
.../runners/portability/flink_runner_test.py | 36 +-
.../runners/portability/fn_api_runner.py | 26 +-
.../runners/portability/fn_api_runner_test.py | 431 +-
.../portability/fn_api_runner_transforms.py | 62 +-
.../runners/portability/portable_runner.py | 31 +-
.../runners/portability/spark_runner_test.py | 146 +
.../apache_beam/runners/portability/stager.py | 60 +
.../apache_beam/runners/portability/stager_test.py | 119 +-
.../apache_beam/runners/worker/bundle_processor.py | 38 +-
.../apache_beam/runners/worker/log_handler.py | 22 +-
.../apache_beam/runners/worker/opcounters.py | 3 +-
.../apache_beam/runners/worker/opcounters_test.py | 4 +-
.../apache_beam/runners/worker/operations.pxd | 2 +-
.../apache_beam/runners/worker/operations.py | 53 +-
.../apache_beam/runners/worker/sdk_worker.py | 28 +
.../apache_beam/testing/data/standard_coders.yaml | 235 -
.../apache_beam/testing/synthetic_pipeline.py | 118 +
sdks/python/apache_beam/transforms/core.py | 63 +-
.../python/apache_beam/transforms/external_test.py | 167 +-
sdks/python/apache_beam/transforms/ptransform.py | 15 +-
sdks/python/apache_beam/transforms/trigger_test.py | 51 +
sdks/python/apache_beam/transforms/userstate.py | 31 +-
sdks/python/apache_beam/typehints/decorators.py | 3 +-
.../python/apache_beam/typehints/typehints_test.py | 6 -
sdks/python/apache_beam/utils/retry.py | 2 +
sdks/python/apache_beam/utils/retry_test.py | 12 +-
sdks/python/build.gradle | 81 +-
sdks/python/gen_protos.py | 9 +
sdks/python/scripts/generate_pydoc.sh | 18 +
sdks/python/scripts/run_integration_test.sh | 25 +-
sdks/python/setup.py | 23 +-
.../dataflow/{py3 => py35}/build.gradle | 18 +-
.../dataflow/{py3 => py36}/build.gradle | 19 +-
.../test-suites/direct/{py3 => py35}/build.gradle | 1 +
.../test-suites/direct/{py3 => py36}/build.gradle | 2 +-
sdks/python/test-suites/tox/py36/build.gradle | 3 +-
sdks/python/test-suites/tox/py37/build.gradle | 3 +-
sdks/python/tox.ini | 68 +-
settings.gradle | 16 +-
website/Rakefile | 1 +
website/_config.yml | 8 +-
website/build.gradle | 93 +-
website/src/.htaccess | 2 +-
website/src/_data/authors.yml | 8 +
website/src/_includes/section-menu/contribute.html | 1 -
.../src/_includes/section-menu/get-started.html | 1 +
website/src/_posts/2017-08-28-timely-processing.md | 102 +-
website/src/_posts/2019-04-19-season-of-docs.md | 64 +
website/src/_posts/2019-04-25-beam-2.12.0.md | 73 +
website/src/_posts/2019-04-25-beam-kotlin.md | 114 +
website/src/contribute/design-documents.md | 3 +
website/src/contribute/docker-images.md | 193 -
website/src/contribute/release-guide.md | 32 +-
.../documentation/io/built-in-google-bigquery.md | 6 +-
website/src/documentation/io/built-in.md | 1 +
website/src/documentation/io/testing.md | 12 +-
website/src/documentation/sdks/nexmark.md | 2 +-
website/src/get-started/beam-overview.md | 8 +-
website/src/get-started/downloads.md | 7 +
website/src/get-started/try-apache-beam.md | 192 +
website/src/images/blog/SoD.png | Bin 0 -> 57748 bytes
website/src/images/blog/kotlin.png | Bin 0 -> 14563 bytes
website/src/index.md | 3 +-
506 files changed, 22741 insertions(+), 14553 deletions(-)
create mode 100644 .test-infra/jenkins/job_LoadTests_Combine_Java.groovy
rename .test-infra/jenkins/{job_PerformanceTests_HadoopInputFormat.groovy => job_PerformanceTests_HadoopFormat.groovy} (85%)
copy .test-infra/jenkins/{job_PerformanceTests_Python.groovy => job_Performancetests_Python35.groovy} (83%)
copy .test-infra/jenkins/{job_PostCommit_Java.groovy => job_PostCommit_Website_Test.groovy} (61%)
create mode 100644 examples/notebooks/README.md
create mode 100644 examples/notebooks/get-started/try-apache-beam-go.ipynb
create mode 100644 examples/notebooks/get-started/try-apache-beam-java.ipynb
create mode 100644 examples/notebooks/get-started/try-apache-beam-py.ipynb
create mode 100644 examples/notebooks/patch.py
create mode 100644 runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/expansion/ExpansionServerTest.java
delete mode 100644 runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricUrns.java
create mode 100644 runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/NoopLock.java
create mode 100644 runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/TestStreamSource.java
delete mode 100644 runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkSplitStateInternals.java
delete mode 100644 runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/FlinkSplitStateInternalsTest.java
create mode 100644 runners/spark/src/main/java/org/apache/beam/runners/spark/aggregators/NamedAggregatorsAccumulator.java
delete mode 100644 runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/MetricsAccumulatorParam.java
create mode 100644 runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/MetricsContainerStepMapAccumulator.java
create mode 100644 sdks/go/pkg/beam/internal/errors/errors.go
create mode 100644 sdks/go/pkg/beam/internal/errors/errors_test.go
create mode 100644 sdks/go/pkg/beam/model/jobmanagement_v1/beam_expansion_api.pb.go
create mode 100644 sdks/go/pkg/beam/model/pipeline_v1/external_transforms.pb.go
create mode 100644 sdks/go/pkg/beam/model/pipeline_v1/metrics.pb.go
delete mode 100644 sdks/java/build-tools/src/main/resources/docker/file/openjdk8/Dockerfile
delete mode 100755 sdks/java/build-tools/src/main/resources/docker/file/openjdk8/docker-entrypoint.sh
delete mode 100644 sdks/java/build-tools/src/main/resources/docker/git/openjdk8/Dockerfile
delete mode 100755 sdks/java/build-tools/src/main/resources/docker/git/openjdk8/docker-entrypoint.sh
delete mode 100644 sdks/java/build-tools/src/main/resources/docker/release/python2/Dockerfile
create mode 100644 sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ConvertHelpers.java
copy sdks/java/core/src/main/java/org/apache/beam/sdk/testing/{UsesTestStream.java => UsesTestStreamWithProcessingTime.java} (74%)
copy sdks/java/{io/hadoop-input-format/src/main/java/org/apache/beam/sdk/io/hadoop/inputformat/package-info.java => core/src/main/java/org/apache/beam/sdk/transforms/SerializableBiFunction.java} (58%)
create mode 100644 sdks/java/extensions/sql/datacatalog/build.gradle
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlDataCatalogExample.java
copy sdks/java/{io/hadoop-input-format/src/main/java/org/apache/beam/sdk/io/hadoop/inputformat => extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/example}/package-info.java (78%)
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/BigQueryUtils.java
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/DataCatalogClientAdapter.java
copy runners/spark/src/main/java/org/apache/beam/runners/spark/aggregators/AggAccumParam.java => sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/DataCatalogPipelineOptions.java (56%)
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/DataCatalogTableProvider.java
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/PubsubUtils.java
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/SchemaUtils.java
create mode 100644 sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/TableUtils.java
rename sdks/java/{io/hadoop-input-format/src/main/java/org/apache/beam/sdk/io/hadoop/inputformat => extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog}/package-info.java (78%)
copy sdks/java/{core/src/main/java/org/apache/beam/sdk/testing/UsesTestStream.java => extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptions.java} (61%)
copy runners/spark/src/main/java/org/apache/beam/runners/spark/aggregators/AggAccumParam.java => sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptionsRegistrar.java (56%)
rename sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/{BeamQueryPlanner.java => CalciteQueryPlanner.java} (84%)
copy sdks/java/{core/src/main/java/org/apache/beam/sdk/testing/UsesTestStream.java => extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java} (56%)
copy sdks/java/{core/src/main/java/org/apache/beam/sdk/testing/UsesTestStream.java => extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/SqlConversionException.java} (69%)
create mode 100644 sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/SerializableRexFieldAccess.java
rename runners/spark/src/main/java/org/apache/beam/runners/spark/aggregators/AggAccumParam.java => sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/SerializableRexInputRef.java (57%)
create mode 100644 sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/SerializableRexNode.java
copy sdks/java/io/{hbase => hadoop-format}/OWNERS (100%)
delete mode 100644 sdks/java/io/hadoop-input-format/OWNERS
delete mode 100644 sdks/java/io/hadoop-input-format/build.gradle
delete mode 100644 sdks/java/io/hadoop-input-format/src/main/java/org/apache/beam/sdk/io/hadoop/inputformat/HadoopInputFormatIO.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/ConfigurableEmployeeInputFormat.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/Employee.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/EmployeeInputFormat.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HIFIOCassandraIT.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HIFIOElasticIT.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HIFIOWithElasticTest.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HIFIOWithEmbeddedCassandraTest.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HIFITestOptions.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HadoopInputFormatIOIT.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/HadoopInputFormatIOTest.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/ReuseObjectsEmployeeInputFormat.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/TestEmployeeDataSet.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/TestRowDBWritable.java
delete mode 100644 sdks/java/io/hadoop-input-format/src/test/resources/cassandra.yaml
copy sdks/java/io/{redis => rabbitmq}/OWNERS (100%)
copy sdks/python/apache_beam/{testing/benchmarks/nexmark/models => io/gcp/datastore/v1new}/__init__.py (100%)
copy sdks/python/apache_beam/io/gcp/{ => datastore/v1new}/datastore_write_it_pipeline.py (62%)
copy sdks/python/apache_beam/io/gcp/{ => datastore/v1new}/datastore_write_it_test.py (87%)
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio_test.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/helper.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/helper_test.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/query_splitter.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/query_splitter_test.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/types.py
create mode 100644 sdks/python/apache_beam/io/gcp/datastore/v1new/types_test.py
create mode 100644 sdks/python/apache_beam/runners/portability/spark_runner_test.py
delete mode 100644 sdks/python/apache_beam/testing/data/standard_coders.yaml
copy sdks/python/test-suites/dataflow/{py3 => py35}/build.gradle (79%)
rename sdks/python/test-suites/dataflow/{py3 => py36}/build.gradle (79%)
copy sdks/python/test-suites/direct/{py3 => py35}/build.gradle (96%)
rename sdks/python/test-suites/direct/{py3 => py36}/build.gradle (98%)
create mode 100644 website/src/_posts/2019-04-19-season-of-docs.md
create mode 100644 website/src/_posts/2019-04-25-beam-2.12.0.md
create mode 100644 website/src/_posts/2019-04-25-beam-kotlin.md
delete mode 100644 website/src/contribute/docker-images.md
create mode 100644 website/src/get-started/try-apache-beam.md
create mode 100644 website/src/images/blog/SoD.png
create mode 100644 website/src/images/blog/kotlin.png