Posted to commits@beam.apache.org by peihe <gi...@git.apache.org> on 2017/08/18 07:22:08 UTC

[GitHub] beam pull request #3734: to merge Pr-3624 with fixups

GitHub user peihe opened a pull request:

    https://github.com/apache/beam/pull/3734

    to merge Pr-3624 with fixups



You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/peihe/incubator-beam PR-3624

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3734
    
----
commit 9d4de1b2f5bf4816cb834e08aae8c127d2dca4cf
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-04-06T06:53:04Z

    [BEAM-1899] Start jstorm runner module in feature branch.

commit 15ebaf0f77c6194f46666f676644a2ff79fb24a1
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-04-06T06:55:25Z

    [BEAM-1899] Add JStormRunnerRegistrar and empty implementations of PipelineRunner, RunnerResult, PipelineOptions.

commit f6a89b0fc2428d2f85e087525a6ddb5361eed4cb
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-04-22T07:12:45Z

    This closes #2457

commit f1e170a5fa9dc4d462af42f9f382afd0ecd798b6
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-04-25T09:37:52Z

    Merge branch 'master' upto commit 686b774ceda8bee32032cb421651e8350ca5bf3d into jstorm-runner

commit 58d4b97c0a218d01e1b64d5fced693b15d941074
Author: Kenneth Knowles <kl...@google.com>
Date:   2017-04-25T17:29:18Z

    This closes #2672: Merge branch 'master' upto commit 686b774 into jstorm-runner
    
      [BEAM-1993] Remove special unbounded Flink source/sink
      Remove flink-annotations dependency
      Fix Javadoc warnings on Flink Runner
      Enable flink dependency enforcement and make dependencies explicit
      [BEAM-59] Register standard FileSystems wherever we register IOChannelFactories
      [BEAM-1991] Sum.SumDoubleFn => Sum.ofDoubles
      clean up description for sdk_location
      Set the Project of a Table Reference at Runtime
      Only compile HIFIO ITs when compiling with java 8.
      Update assertions of source_test_utils from camelcase to underscore-separated.
      Add no-else return to pylintrc
      Remove getSideInputWindow
      Remove reference to the isStreaming flag
      Javadoc fixups after style guide changes
      Update Dataflow Worker Version
      [BEAM-1922] Close datasource in JdbcIO when possible
      Fix javadoc warnings
      Add javadoc to getCheckpointMark in UnboundedSource
      Removes final minor usages of OldDoFn outside OldDoFn itself
      [BEAM-1915] Removes use of OldDoFn from Apex
      Update Signature of PTransformOverrideFactory
      [BEAM-1964] Fix lint issues and pylint upgrade
      Rename DoFn.Context#sideOutput to output
      [BEAM-1964] Fix lint issues for linter upgrade -3
      [BEAM-1964] Fix lint issues for linter upgrade -2
      Avoid repackaging bigtable classes in dataflow runner.
      ApexRunner: register standard IOs when deserializing pipeline options
      Add PCollections Utilities
      Free PTransform Names if they are being Replaced
      [BEAM-1347] Update protos related to State API for prototyping purposes.
      Update java8 examples pom files to include maven-shade-plugin.
      fix the simplest typo
      [BEAM-1964] Fix lint issues for linter upgrade
      Merge PR#2423: Add Kubernetes scripts for clusters for Performance and Integration tests of Cassandra and ES for Hadoop Input Format IO
      Remove Triggers.java from SDK entirely
      [BEAM-1708] Improve error message when GCP not installed
      Improve gcloud logging message
      [BEAM-1101, BEAM-1068] Remove service account name credential pipeline options
      Update user_score.py
      Pin versions in tox script
      Improve Empty Create Default Coder Error Message
      Represent a Pipeline via a list of Top-level Transforms
      Test all Known Coders to ensure they Serialize via URN
      [BEAM-1950] Add missing 'static' keyword to MicrobatchSource#initReaderCache
      Move Triggers from sdk-core to runners-core-construction
      [BEAM-1222] Chunk size should be FS dependent
      Move HIFIO k8s scripts into shared dir
      Move jdbc's postgres k8s scripts into shared k8s dir
      Move travis/jenkins folders in a test-infra folder
      [BEAM-911] Mark IO APIs as @Experimental
      Revert "Revert "Revert "Add ValueProvider class for FileBasedSource I/O Transforms"""
      Revert "Throw specialized exception in value providers"
      Removes FlatMapElements.MissingOutputTypeDescriptor
      Removes MapElements.MissingOutputTypeDescriptor
      [BEAM-1882] Update postgres k8 scripts & add scripts for running local dev test
      [BEAM-115] Update timer/state fields on ParDoPayload to use a map field for consistent tag usage
      Use SdkComponents in WindowingStrategy.toProto
      [BEAM-1722] Move PubsubIO into the google-cloud-platform module
      Triggers: handle missing case
      Clean HFIOWithEmbeddedCassandraTest before Execution
      DataflowRunner: remove dead code
      Throw specialized exception in value providers
      DataflowRunner: send windowing strategy using Runner API proto
      DataflowRunner misc cleanups
      Improve Work Rejection handling
      Remove Orderedness of Input, Output expansions
      Ignore more python build artifacts.
      Fix build breaks caused by overlaps between b615013 and c08b7b1
      Remove Jdk1.8-tests/.toDelete
      Improve HadoopInputFormatIO DisplayData and Cassandra tests
      Add Coder utilities for Proto conversions
      Flip dependency edge between Dataflow runner and IO-GCP
      Move HashingFn to io/common, switch to better hash
      PubsubIO: remove support for BoundedReader
      Bump Dataflow worker to 20170410
      Removes DoFn.ProcessContinuation completely
      Move WindowingStrategies to runners-core-construction
      Fix GroupByKeyInputVisitor for Direct Runner
      Skip query metrics when creating a template
      Upgrade dependencies.
      Add SdkComponents
      Create as custom source
      BEAM-1053 ApexGroupByKeyOperator serialization issues
      enable test_multi_valued_singleton_side_input test
      [BEAM-386] Move UnboundedReadFromBoundedSource to core-construction-java
      BEAM-1390 Update top level README.md to include Apex Runner
      better log message for bigquery temp tables
      [BEAM-1921] Expose connection properties in JdbcIO
      [BEAM-1294] Long running UnboundedSource Readers
      [BEAM-1737] Implement a Single-output ParDo as a Multi-output ParDo with a single output.
      Fix for potentially unclosed streams in ApexYarnLauncher
      TestDataflowRunner: better error handling
      BEAM-1887 Switch Apex ParDo to new DoFn.
      Adds tests for the watermark hold (previously untested)
      Fixes SDF issues re: watermarks and stop/resume
      Clarifies doc of ProcessElement re: HasDefaultTracker
      [BEAM-65] Adds HasDefaultTracker for RestrictionTracker inference
      Cleanup: removes two unused constants
      [BEAM-1823] Improve ValidatesRunner Test Log
      Clean up in textio and tfrecordio
      ...

commit 1fe64def2f6a69b7f9f230c53237bc96dc28cb63
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-07-24T03:33:40Z

    Merge branch master up to commit '5e3c5c6574bc70320683d6c16fc3b11791a77418' into jstorm-runner

commit 0a05de365ddaff9d1c570423ab7ea527b5bf77ae
Author: Kenneth Knowles <kl...@google.com>
Date:   2017-07-24T04:33:45Z

    This closes #3623: [BEAM-1899] Merge branch master up to commit '5e3c5c6574bc70320683d6c16fc3b11791a77418' into jstorm-runner
    
      Use stable naming strategy for ByteBuddy invokers
      Translate a Pipeline in SdkComponents
      [TRIVIAL] runners-core: delete placeholder
      Fixes an accidentally found bug in SimpleDoFnRunner
      Removes OldDoFn and its kin from runners-core
      Bump Dataflow containers to 0512
      Improve Pruning performed by the DirectRunnerApiSurfaceTest
      Adding support for subnetwork in Python Pipelineoptions
      Use built-in cmp python function in comparing datastore paths
      ApexRunner SDF support
      Fix documentation for the shard_template_name
      [BEAM-2299] Run maven install on Windows machine for build/test coverage on Windows
      Remove "Dataflow" from apache_beam __init__.py file
      Moving the data file for trigger tests to testing/data
      Fix GcsResourceIdTest in postcommits
      readAvros shouldn't have proto Message upper bound
      Reduce Log Level of PubsubUnboundedSource
      [BEAM-2290] Fix issue where timestamps weren't set when using CompressedSource
      [BEAM-2279] Fix archetype breakages
      internal comments
      Fix shading of guava testlib
      Rename FileSystems.setDefaultConfigInWorkers
      [BEAM-2277] HadoopFileSystem: normalize implementation
      Mark FileSystem and related as Experimental
      [BEAM-2277] Add ResourceIdTester and test existing ResourceId implementations
      Remove '/' entirely from determining FileSystem scheme
      [BEAM-2279] Add HDFS support to Spark runner profiles in archetypes and examples
      [BEAM-2277] Fix URI_SCHEME_PATTERN in FileSystems
      BigtableIO should use AutoValue for read and write
      [BEAM-2153] Move connection management in JmsIO.write() to setup/teardown methods
      Mark More values methods Internal
      Rename filesink to filebasedsink
      Enable SerializableCoder to Serialize with Generic Types
      Remove unused test data
      Fix due to GBKO name change.
      Don't deploy jdk1.8-tests module
      Remove some internal details from the public API.
      Move assert_that, equal_to, is_empty to apache_beam.testing.util
      [BEAM-1345] Clearly delineate public api in apache_beam/typehints.
      [BEAM-1345] Mark apache_beam/internal as internal.
      [BEAM-1345] Annotate public members of pvalue.
      Add internal comments to metrics
      [BEAM-1340] Add __all__ tags to modules in package apache_beam/transforms
      [BEAM-2256] Add the last previous range filter
      Use a consistent calculation for GC Time
      fix lint error in fake_datastore.py
      Add __all__ tags to modules in package apache_beam/testing
      [BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.
      [BEAM-1345] Clearly delineate public api in apache_beam/coders.
      [BEAM-1345] Clearly delineate public api in runners package.
      [BEAM-1345] Mark Pipeline as public.
      [BEAM-1345] Clearly delineate public API in apache_beam/options
      Mark internal modules in python datastoreio
      [BEAM-2260] Improve construction-time errors for Text and AvroIO
      [BEAM-2179] Archetype generate-sources.sh cleanup the existing sources before rsync
      [BEAM-1345] Mark windowed value as experimental
      Add internal usage only comments to util/
      Remove protobuf and http-client dependency from runners/google-cloud-dataflow
      minor typo fix in comment
      Add support for local execution to PubsubIO using the google cloud emulator
      [BEAM-2150] Relax regex to support wildcard globbing for GCS
      bump time of precommits
      [BEAM-2244] Move details of Metrics to Runners Core
      Correct javadoc for mobile gaming examples
      Update SDK Coders to return the Empty List from getCoderArguments
      Skip generating empty jars for parent poms
      Fix a typo in TestDataflowRunnerTest
      Re-enable UsesTimersInParDo tests in Dataflow runner
      TestDataflowRunner: throw AssertionError only when assertion known failed
      Allow any throwable in PAssert to constitute adequate failure
      [BEAM-2242] Ensure that jars are shaded correctly by running the jar plugin before the shade plugin
      [BEAM-2240] Always augment exception with step name.
      Adds dependency on findbugs to examples/java
      Splits WriteBundles into windowed/unwindowed versions
      Simpler code for setting shard numbers on results in FileBasedSink
      Implement dynamic-sharding for windowed file outputs, and add an integration test.
      Renames FileBasedSink inner classes
      [BEAM-2250] remove experimental and internal things from pydoc
      [BEAM-2249] Correctly handle partial reads in AvroSource
      Use text output for first two mobile gaming examples
      Remove verifyDeterministic from StructuredCoder
      Update Coder Documentation
      Improve DirectRunner Javadoc
      [BEAM-2211] Delete deprecated NoopPathValidator
      Remove Timer.cancel() from user-facing API
      Remove Readme files.
      Renames some python classes and functions that were unnecessarily public.
      Mark PipelineVisitor and AppliedPTransform as internal.
      Mark PValue and PValueBase Internal
      [BEAM-2236] Move test utilities out of python core
      Include 'sun.reflect' in GcpCoreApiSurfaceTest
      Fix checkstyle error
      Shade dependencies in sdks/core
      Remove trailing whitespace
      Add per-runner profile to Java 8 examples
      Register TestSparkPipelineOptions only in src/test to avoid hard hamcrest dep
      Update Apache Beam Python version to 2.1.0.dev
      Shade JSR305 in the DirectRunner
      Remove hadoop io readme
      Remove templates from wordcount example
      ...

commit 731555c1174aa4aaea70baf25e001e8bccd16142
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-02T03:12:58Z

    Refactor: move JStorm runner code to Beam repo.

commit ef81cd8e43cbc9379fee3a40a0c50bb8c267f9f9
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-12T08:24:00Z

    Support startBundle & finishBundle on jstorm batch mode

commit 0a5b7e69c6edc289052b44b9c9efd10d4ec1d5a8
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-12T08:25:43Z

    Update to internal version 0.7.0-jstorm

commit 4d0a594df51102578c86935a6db7bad853894ae1
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-17T08:28:16Z

    Upgrade Beam to 2.0.0.

commit ec91b660e15e782016e2ec9518247c99921be31e
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T03:27:20Z

    Internal: Upgrade fixup for imports.

commit 9d2ddb4553fb7347eb3c5900dee03a02d85cd925
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T03:30:36Z

    jstorm-runner: 2.0.0 upgrade fixups.

commit c12c768e9f29b73d005adc367259bc11c8bdabfb
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T04:46:22Z

    Internal: Upgrade fixup for states.

commit 9bd97cf35e0e1285fc6459708a5d062905befe74
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T05:03:40Z

    Internal: Upgrade fixup for removing aggregators.

commit ff316f61132fd55914f7d19f320d244f4053832e
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T05:15:23Z

    Internal: Upgrade fixup for others.

commit a09d761e3171ee62eda4718255144db9107d88cb
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T12:16:55Z

    Internal: Implement Beam Metrics.

commit 1d85413c49335d9427bcc01c900bc500fdb7c5d4
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-18T12:22:07Z

    Internal: workaround before calling finishBundle is fixed.

commit fdcd3e9fc4ed561d164f791c848c99c0f59742eb
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-19T09:24:38Z

    Fix the bug that the state of timers was not persisted correctly.

commit 49968800bb731d1f34322615ddf35cc1e0be6f8c
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-19T15:13:33Z

    Fix bug that startBundle/finishBundle was not called after firing timers by updating watermark

commit 715512e92049fc2f018b56c044c2beb11fadebc5
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-22T02:23:18Z

    Revert "Internal: workaround before calling finishBundle is fixed."
    
    This reverts commit c0a40cf4317a7fa63b401c7f5fecea6b17355b55.

commit d9cde0537ad1b80ca7455e5cc3af0332e208350b
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-23T04:32:28Z

    Map local TupleTag to external tag before output in DoFn

commit 3ab3e73fd10408c5ee212c179a1d8bc147a8dcbd
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-05-23T04:39:13Z

    Add combine test cases

commit e9775e4d10d6fd573496ba018dde0f4d68c6bd1d
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-25T09:54:52Z

    Add WordCountBenchmark for JStorm runner.

commit 0b2b26df9060e2a961621c770e2fe07751c6acce
Author: Pei He <he...@alibaba-inc.com>
Date:   2017-05-26T08:12:03Z

    fixup: remove print messages.

commit 10dfe21738a2f2bd1c198c03b521b583d8af0a2b
Author: Pei He <pe...@apache.org>
Date:   2017-05-27T09:47:27Z

    Benchmark: add StateInternalsBenchmark.

commit fa9c498f30a690d5e64dcc48b28f58e15012f4b4
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-06-02T02:39:04Z

    1. Fix incorrect asserted value in stateInternalsTest
    2. Deactivate source reader when closing spout
    3. Fix typo in pom.xml

commit 343421d8ccf0a8ed4e63c64c816756400c706510
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-06-06T08:17:36Z

    1. Fix bug of state internal test
    2. Improve performance of bag state by reducing duplicated write/read ops

commit 7ee5ef4127ab849645bd6c136efe1c987fda3aa5
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-06-06T09:27:52Z

    Fix incorrect asserted result of CombineTest

commit ca79d9fc728a2ebf3f899d7c36b88945386ad954
Author: basti.lj <ba...@alibaba-inc.com>
Date:   2017-06-08T08:45:27Z

    support Iterable in JStormBagState.read

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] beam pull request #3734: [BEAM-1899] Implementation of JStorm runner. (merge...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/3734

