Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/03/23 15:14:55 UTC
[GitHub] [beam] TheNeuralBit opened a new pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
TheNeuralBit opened a new pull request #14308:
URL: https://github.com/apache/beam/pull/14308
This PR creates a new `examples.dataframe` module and moves `wordcount_dataframe` into it. It also creates two new example pipelines in `examples.dataframe.taxiride`, one based on the aggregation pipeline from https://beam.apache.org/blog/dataframe-api-preview-available/, and one that augments that pipeline with a join on the taxi zone lookup table.
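To make the data flow of the two taxiride pipelines concrete, here is a minimal plain-Python sketch of what they might compute: a per-location sum, then the same sum after a join against the zone lookup table. It deliberately uses only the standard library rather than the Beam DataFrame API. The inline rows are made up, the column names (`DOLocationID`, `passenger_count`, `LocationID`, `Borough`) and the choice to group by borough after the join are assumptions, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Tiny inline stand-ins for the ride and zone-lookup CSV files (made-up rows).
RIDES_CSV = """DOLocationID,passenger_count
1,2
2,1
1,3
"""
ZONES_CSV = """LocationID,Borough
1,EWR
2,Queens
"""


def aggregate_by_location(rides_text):
    """Sum passenger_count per drop-off location (the aggregation step)."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(rides_text)):
        totals[row["DOLocationID"]] += int(row["passenger_count"])
    return dict(totals)


def aggregate_by_borough(rides_text, zones_text):
    """Join each ride to its zone row, then sum passenger_count per borough."""
    borough_of = {
        row["LocationID"]: row["Borough"]
        for row in csv.DictReader(io.StringIO(zones_text))
    }
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(rides_text)):
        # Rides whose location is missing from the lookup land in "Unknown".
        borough = borough_of.get(row["DOLocationID"], "Unknown")
        totals[borough] += int(row["passenger_count"])
    return dict(totals)


by_location = aggregate_by_location(RIDES_CSV)
by_borough = aggregate_by_borough(RIDES_CSV, ZONES_CSV)
```

In the actual examples, the same steps would presumably be expressed as DataFrame operations (a group-by and a merge) on the object produced by `read_csv`, with Beam handling the distribution.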
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] TheNeuralBit commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816843494
Ok I added `wordcount_dataframe` back as an alias for `dataframe.wordcount` in https://github.com/apache/beam/pull/14308/commits/3fb8fa24b276159b093257a108d9305b4be996a1, LMK if that's acceptable @robertwb
[GitHub] [beam] TheNeuralBit merged pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #14308:
URL: https://github.com/apache/beam/pull/14308
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (3fb8fa2) into [master](https://codecov.io/gh/apache/beam/commit/1b862668a8bd92babe4af45fcf4d610e9fae1681?el=desc) (1b86266) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
- Coverage   83.49%   83.48%   -0.02%
==========================================
  Files         447      449       +2
  Lines       58904    58949      +45
==========================================
+ Hits        49183    49213      +30
- Misses       9721     9736      +15
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...eam/runners/portability/fn\_api\_runner/fn\_runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2ZuX3J1bm5lci5weQ==) | | |
| [...s/sdks/python/apache\_beam/examples/avro\_bitcoin.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvYXZyb19iaXRjb2luLnB5) | | |
| [...apache\_beam/typehints/native\_type\_compatibility.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL25hdGl2ZV90eXBlX2NvbXBhdGliaWxpdHkucHk=) | | |
| [...ks/python/apache\_beam/io/gcp/pubsub\_it\_pipeline.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3B1YnN1Yl9pdF9waXBlbGluZS5weQ==) | | |
| [...sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | | |
| [...cs/sdks/python/apache\_beam/io/source\_test\_utils.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc291cmNlX3Rlc3RfdXRpbHMucHk=) | | |
| [...38/build/srcs/sdks/python/apache\_beam/io/textio.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vdGV4dGlvLnB5) | | |
| [...testing/benchmarks/nexmark/models/nexmark\_model.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbW9kZWxzL25leG1hcmtfbW9kZWwucHk=) | | |
| [...sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | | |
| [...eam/transforms/py\_dataflow\_distribution\_counter.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9weV9kYXRhZmxvd19kaXN0cmlidXRpb25fY291bnRlci5weQ==) | | |
| ... and [886 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [1b86266...3fb8fa2](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] rohdesamuel commented on a change in pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
rohdesamuel commented on a change in pull request #14308:
URL: https://github.com/apache/beam/pull/14308#discussion_r608892895
##########
File path: sdks/python/apache_beam/examples/dataframe/taxiride.py
##########
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Pipelines that use the DataFrame API to process NYC taxiride CSV data."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import argparse
+import logging
+
+import apache_beam as beam
+from apache_beam.dataframe.convert import to_dataframe
+from apache_beam.dataframe.convert import to_pcollection
+from apache_beam.dataframe.io import read_csv
+from apache_beam.io import ReadFromText
+from apache_beam.options.pipeline_options import PipelineOptions
+
+ZONE_LOOKUP_PATH = "gs://apache-beam-samples/nyc_taxi/misc/taxi+_zone_lookup.csv"
+
+
+def run_aggregation_pipeline(pipeline_args, input_path, output_path):
+  # The pipeline will be run on exiting the with block.
+  with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
+    rides = p | read_csv(input_path)
Review comment:
For my own edification, does `read_csv` expand and return a Beam DataFrame? Is there anything else it can return?
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9e10eeb) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **increase** coverage by `0.08%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
+ Coverage   83.40%   83.48%   +0.08%
==========================================
  Files         469      464       -5
  Lines       58727    58965     +238
==========================================
+ Hits        48981    49229     +248
+ Misses       9746     9736      -10
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...hon/apache\_beam/runners/worker/worker\_pool\_main.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvd29ya2VyX3Bvb2xfbWFpbi5weQ==) | | |
| [...pache\_beam/runners/portability/portable\_metrics.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9wb3J0YWJsZV9tZXRyaWNzLnB5) | | |
| [...nners/direct/consumer\_tracking\_pipeline\_visitor.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvY29uc3VtZXJfdHJhY2tpbmdfcGlwZWxpbmVfdmlzaXRvci5weQ==) | | |
| [.../python/apache\_beam/examples/windowed\_wordcount.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd2luZG93ZWRfd29yZGNvdW50LnB5) | | |
| [...s/sdks/python/apache\_beam/io/gcp/bigquery\_tools.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X3Rvb2xzLnB5) | | |
| [...s/python/apache\_beam/io/aws/clients/s3/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvX19pbml0X18ucHk=) | | |
| [...s/python/apache\_beam/examples/snippets/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvX19pbml0X18ucHk=) | | |
| [...\_beam/testing/benchmarks/nexmark/queries/query7.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvcXVlcmllcy9xdWVyeTcucHk=) | | |
| [...s/sdks/python/apache\_beam/transforms/sideinputs.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9zaWRlaW5wdXRzLnB5) | | |
| [...cs/sdks/python/apache\_beam/io/gcp/gcsfilesystem.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2djc2ZpbGVzeXN0ZW0ucHk=) | | |
| ... and [923 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...9e10eeb](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] TheNeuralBit commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816831774
> Just a thought. We have various wordcount examples to demonstrate various ways of doing things. Maybe it makes sense to leave the dataframe one there to inform people about dataframes?
Fair point. I also like having all the dataframe examples in one module with a README though...
What if we have both `examples.wordcount_dataframe` and `examples.dataframe.wordcount` and the former just calls out to the latter?
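A minimal sketch of that "old module calls out to the new one" idea, written to run without Beam installed. The module names `demo_dataframe_wordcount` and `demo_wordcount_dataframe` and the placeholder `run` function are hypothetical stand-ins built in memory; the real shim would presumably be a one-line file at the old path.

```python
import importlib
import sys
import types

# Stand-in for the new location (think examples.dataframe.wordcount).
new_home = types.ModuleType("demo_dataframe_wordcount")


def run(argv=None):
    """Placeholder entry point; a real example would build a pipeline here."""
    return ["wordcount"] + list(argv or [])


new_home.run = run
sys.modules[new_home.__name__] = new_home

# The entire body of the old module: re-export the entry point.
SHIM_SOURCE = "from demo_dataframe_wordcount import run\n"

# Stand-in for the old location (think examples.wordcount_dataframe).
old_alias = types.ModuleType("demo_wordcount_dataframe")
exec(SHIM_SOURCE, old_alias.__dict__)
sys.modules[old_alias.__name__] = old_alias

# Callers using the old path get the very same function object.
legacy = importlib.import_module("demo_wordcount_dataframe")
result = legacy.run(["--input", "kinglear.txt"])
```

The appeal of this shape is that there is only one implementation to maintain, while anything that still imports or runs the old module path keeps working.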
[GitHub] [beam] TheNeuralBit edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816843494
Ok I added `wordcount_dataframe` back as an alias for `dataframe.wordcount` in https://github.com/apache/beam/pull/14308/commits/3fb8fa24b276159b093257a108d9305b4be996a1, LMK if that's acceptable @robertwb
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (5ca8939) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
- Coverage   83.40%   83.38%   -0.02%
==========================================
  Files         469      470       +1
  Lines       58727    58766      +39
==========================================
+ Hits        48981    49002      +21
- Misses       9746     9764      +18
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...m/portability/api/beam\_interactive\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1faW50ZXJhY3RpdmVfYXBpX3BiMl9ncnBjLnB5) | | |
| [...thon/apache\_beam/runners/worker/operation\_specs.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvb3BlcmF0aW9uX3NwZWNzLnB5) | | |
| [...dks/python/apache\_beam/runners/pipeline\_context.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9waXBlbGluZV9jb250ZXh0LnB5) | | |
| [...s/python/apache\_beam/testing/datatype\_inference.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9kYXRhdHlwZV9pbmZlcmVuY2UucHk=) | | |
| [...am/portability/api/standard\_window\_fns\_pb2\_urns.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL3N0YW5kYXJkX3dpbmRvd19mbnNfcGIyX3VybnMucHk=) | | |
| [...beam/testing/benchmarks/nexmark/queries/query12.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvcXVlcmllcy9xdWVyeTEyLnB5) | | |
| [...s/snippets/transforms/aggregation/combineperkey.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5lcGVya2V5LnB5) | | |
| [...e\_beam/portability/api/beam\_runner\_api\_pb2\_urns.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjJfdXJucy5weQ==) | | |
| [...am/examples/snippets/transforms/aggregation/top.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi90b3AucHk=) | | |
| [...he\_beam/io/flink/flink\_streaming\_impulse\_source.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmxpbmsvZmxpbmtfc3RyZWFtaW5nX2ltcHVsc2Vfc291cmNlLnB5) | | |
| ... and [929 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...5ca8939](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9e10eeb) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **increase** coverage by `0.08%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
+ Coverage 83.40% 83.48% +0.08%
==========================================
Files 469 464 -5
Lines 58727 58965 +238
==========================================
+ Hits 48981 49229 +248
+ Misses 9746 9736 -10
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...thon/apache\_beam/runners/worker/operation\_specs.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvb3BlcmF0aW9uX3NwZWNzLnB5) | | |
| [...dks/python/apache\_beam/metrics/monitoring\_infos.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWV0cmljcy9tb25pdG9yaW5nX2luZm9zLnB5) | | |
| [.../python/apache\_beam/io/gcp/bigquery\_io\_metadata.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2lvX21ldGFkYXRhLnB5) | | |
| [...n/apache\_beam/typehints/typed\_pipeline\_test\_py3.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL3R5cGVkX3BpcGVsaW5lX3Rlc3RfcHkzLnB5) | | |
| [...python/apache\_beam/examples/complete/distribopt.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvZGlzdHJpYm9wdC5weQ==) | | |
| [...gcp/datastore/v1new/datastore\_write\_it\_pipeline.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2RhdGFzdG9yZS92MW5ldy9kYXRhc3RvcmVfd3JpdGVfaXRfcGlwZWxpbmUucHk=) | | |
| [...beam/testing/benchmarks/nexmark/queries/query11.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvcXVlcmllcy9xdWVyeTExLnB5) | | |
| [...am/examples/complete/juliaset/juliaset/juliaset.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvanVsaWFzZXQvanVsaWFzZXQvanVsaWFzZXQucHk=) | | |
| [...ers/dataflow/internal/clients/dataflow/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9jbGllbnRzL2RhdGFmbG93L19faW5pdF9fLnB5) | | |
| [...on/apache\_beam/examples/complete/juliaset/setup.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvanVsaWFzZXQvc2V0dXAucHk=) | | |
| ... and [923 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...9e10eeb](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (5ca8939) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
- Coverage 83.40% 83.38% -0.02%
==========================================
Files 469 470 +1
Lines 58727 58766 +39
==========================================
+ Hits 48981 49002 +21
- Misses 9746 9764 +18
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...beam/runners/dataflow/internal/clients/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9jbGllbnRzL19faW5pdF9fLnB5) | | |
| [...ache\_beam/runners/portability/expansion\_service.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9leHBhbnNpb25fc2VydmljZS5weQ==) | | |
| [.../python/apache\_beam/runners/worker/statesampler.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc3RhdGVzYW1wbGVyLnB5) | | |
| [...s/python/apache\_beam/examples/wordcount\_minimal.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X21pbmltYWwucHk=) | | |
| [...rcs/sdks/python/apache\_beam/runners/direct/util.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdXRpbC5weQ==) | | |
| [...ks/python/apache\_beam/runners/worker/sideinputs.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2lkZWlucHV0cy5weQ==) | | |
| [...y38/build/srcs/sdks/python/apache\_beam/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vX19pbml0X18ucHk=) | | |
| [...snippets/transforms/aggregation/combineglobally.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5lZ2xvYmFsbHkucHk=) | | |
| [...ld/srcs/sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5) | | |
| [...eam/runners/portability/fn\_api\_runner/execution.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2V4ZWN1dGlvbi5weQ==) | | |
| ... and [929 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...5ca8939](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9248ea8) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.02%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
- Coverage 83.40% 83.37% -0.03%
==========================================
Files 469 470 +1
Lines 58727 58766 +39
==========================================
+ Hits 48981 48999 +18
- Misses 9746 9767 +21
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [.../python/apache\_beam/testing/benchmarks/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL19faW5pdF9fLnB5) | | |
| [...ild/srcs/sdks/python/apache\_beam/runners/runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9ydW5uZXIucHk=) | | |
| [.../py38/build/srcs/sdks/python/apache\_beam/pvalue.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcHZhbHVlLnB5) | | |
| [...srcs/sdks/python/apache\_beam/transforms/trigger.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy90cmlnZ2VyLnB5) | | |
| [...python/apache\_beam/typehints/typecheck\_test\_py3.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL3R5cGVjaGVja190ZXN0X3B5My5weQ==) | | |
| [...che\_beam/runners/interactive/interactive\_runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9ydW5uZXIucHk=) | | |
| [...s/sdks/python/apache\_beam/dataframe/expressions.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2V4cHJlc3Npb25zLnB5) | | |
| [...am/testing/benchmarks/nexmark/models/field\_name.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbW9kZWxzL2ZpZWxkX25hbWUucHk=) | | |
| [...he\_beam/runners/interactive/pipeline\_instrument.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9waXBlbGluZV9pbnN0cnVtZW50LnB5) | | |
| [...testing/benchmarks/nexmark/models/nexmark\_model.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbW9kZWxzL25leG1hcmtfbW9kZWwucHk=) | | |
| ... and [929 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...9248ea8](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] TheNeuralBit edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816831774
> Just a thought. We have various wordcount examples to demonstrate various ways of doing things. Maybe it makes sense to leave the dataframe one there to inform people about dataframes?
Fair point. I'd also like to have all the dataframe examples in one module with a README though...
What if we have both `examples.wordcount_dataframe` and `examples.dataframe.wordcount` and the former just calls out to the latter?
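To make the proposal concrete, here is a minimal sketch of the delegation pattern. Both functions are hypothetical stand-ins defined in one file so the snippet runs on its own: `dataframe_wordcount_run` plays the role of an assumed `run` entry point in `examples.dataframe.wordcount`, and `run` is what `examples.wordcount_dataframe` might keep.

```python
def dataframe_wordcount_run(argv=None):
  """Stand-in for the relocated implementation (examples.dataframe.wordcount)."""
  argv = list(argv or [])
  return 'wordcount ran with args: %s' % argv


def run(argv=None):
  """Thin wrapper the old module (examples.wordcount_dataframe) could keep.

  It holds no logic of its own and only forwards to the new location.
  """
  return dataframe_wordcount_run(argv)
```

With this shape, existing callers of the old module path keep working, while the README and the other DataFrame examples live together in the new module.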
[GitHub] [beam] TheNeuralBit commented on pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-804988137
R: @robertwb
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9e10eeb) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **increase** coverage by `0.08%`.
> The diff coverage is `n/a`.
> :exclamation: Current head 9e10eeb differs from pull request most recent head 9248ea8. Consider uploading reports for the commit 9248ea8 to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
+ Coverage 83.40% 83.48% +0.08%
==========================================
Files 469 464 -5
Lines 58727 58965 +238
==========================================
+ Hits 48981 49229 +248
+ Misses 9746 9736 -10
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...n/apache\_beam/runners/direct/evaluation\_context.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXZhbHVhdGlvbl9jb250ZXh0LnB5) | | |
| [...m/portability/api/beam\_interactive\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1faW50ZXJhY3RpdmVfYXBpX3BiMl9ncnBjLnB5) | | |
| [...python/apache\_beam/runners/direct/direct\_runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZGlyZWN0X3J1bm5lci5weQ==) | | |
| [...e\_beam/examples/complete/top\_wikipedia\_sessions.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvdG9wX3dpa2lwZWRpYV9zZXNzaW9ucy5weQ==) | | |
| [...ache\_beam/examples/snippets/transforms/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9fX2luaXRfXy5weQ==) | | |
| [...srcs/sdks/python/apache\_beam/transforms/trigger.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy90cmlnZ2VyLnB5) | | |
| [...dks/python/apache\_beam/ml/gcp/naturallanguageml.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvZ2NwL25hdHVyYWxsYW5ndWFnZW1sLnB5) | | |
| [.../python/apache\_beam/runners/worker/statesampler.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc3RhdGVzYW1wbGVyLnB5) | | |
| [...\_beam/runners/interactive/user\_pipeline\_tracker.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS91c2VyX3BpcGVsaW5lX3RyYWNrZXIucHk=) | | |
| [...ache\_beam/io/gcp/datastore/v1new/query\_splitter.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2RhdGFzdG9yZS92MW5ldy9xdWVyeV9zcGxpdHRlci5weQ==) | | |
| ... and [923 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...9248ea8](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] rohdesamuel commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
rohdesamuel commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-815134252
lgtm just a couple of comments, ty!
[GitHub] [beam] TheNeuralBit commented on a change in pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #14308:
URL: https://github.com/apache/beam/pull/14308#discussion_r610783745
##########
File path: sdks/python/apache_beam/examples/dataframe/taxiride.py
##########
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Pipelines that use the DataFrame API to process NYC taxiride CSV data."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import argparse
+import logging
+
+import apache_beam as beam
+from apache_beam.dataframe.convert import to_dataframe
+from apache_beam.dataframe.convert import to_pcollection
+from apache_beam.dataframe.io import read_csv
+from apache_beam.io import ReadFromText
+from apache_beam.options.pipeline_options import PipelineOptions
+
+ZONE_LOOKUP_PATH = "gs://apache-beam-samples/nyc_taxi/misc/taxi+_zone_lookup.csv"
+
+
+def run_aggregation_pipeline(pipeline_args, input_path, output_path):
+  # The pipeline will be run on exiting the with block.
+  with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
+    rides = p | read_csv(input_path)
Review comment:
Yeah, the `read_*` methods generally return `DeferredDataFrame` instances (when applied to a pipeline). If pandas has any read methods that return a Series, we should just produce a `DeferredSeries`. I'm not sure whether any such readers exist, though; maybe [read_feather](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_feather.html) can.
[GitHub] [beam] codecov[bot] commented on pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (5ca8939) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
- Coverage 83.40% 83.38% -0.02%
==========================================
Files 469 470 +1
Lines 58727 58766 +39
==========================================
+ Hits 48981 49002 +21
- Misses 9746 9764 +18
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...ache\_beam/io/gcp/datastore/v1new/query\_splitter.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2RhdGFzdG9yZS92MW5ldy9xdWVyeV9zcGxpdHRlci5weQ==) | | |
| [...ache\_beam/runners/dataflow/test\_dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy90ZXN0X2RhdGFmbG93X3J1bm5lci5weQ==) | | |
| [.../apache\_beam/runners/dataflow/internal/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9fX2luaXRfXy5weQ==) | | |
| [...ython/apache\_beam/io/gcp/datastore/v1new/helper.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2RhdGFzdG9yZS92MW5ldy9oZWxwZXIucHk=) | | |
| [...ples/snippets/transforms/aggregation/groupbykey.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9ncm91cGJ5a2V5LnB5) | | |
| [.../srcs/sdks/python/apache\_beam/coders/coder\_impl.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vY29kZXJzL2NvZGVyX2ltcGwucHk=) | | |
| [...srcs/sdks/python/apache\_beam/transforms/trigger.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy90cmlnZ2VyLnB5) | | |
| [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | | |
| [...ache\_beam/runners/interactive/pipeline\_fragment.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9waXBlbGluZV9mcmFnbWVudC5weQ==) | | |
| [...apache\_beam/examples/complete/game/leader\_board.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvZ2FtZS9sZWFkZXJfYm9hcmQucHk=) | | |
| ... and [929 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...5ca8939](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] robertwb commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816828884
Just a thought. We have various wordcount examples to demonstrate various ways of doing things. Maybe it makes sense to leave the dataframe one there to inform people about dataframes?
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (5ca8939) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
- Coverage   83.40%   83.38%   -0.02%
==========================================
  Files         469      470       +1
  Lines       58727    58766      +39
==========================================
+ Hits        48981    49002      +21
- Misses       9746     9764      +18
```
[GitHub] [beam] TheNeuralBit commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-817045569
GHA failures look like flakes
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (3fb8fa2) into [master](https://codecov.io/gh/apache/beam/commit/ed3df93e747ddc271db5186faf2e05af0b57de1d?el=desc) (ed3df93) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
- Coverage   83.50%   83.48%   -0.02%
==========================================
  Files         447      449       +2
  Lines       58904    58949      +45
==========================================
+ Hits        49187    49213      +26
- Misses       9717     9736      +19
```
[GitHub] [beam] rohdesamuel commented on a change in pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
rohdesamuel commented on a change in pull request #14308:
URL: https://github.com/apache/beam/pull/14308#discussion_r608895537
##########
File path: sdks/python/apache_beam/examples/dataframe/taxiride.py
##########
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Pipelines that use the DataFrame API to process NYC taxiride CSV data."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import argparse
+import logging
+
+import apache_beam as beam
+from apache_beam.dataframe.convert import to_dataframe
+from apache_beam.dataframe.convert import to_pcollection
+from apache_beam.dataframe.io import read_csv
+from apache_beam.io import ReadFromText
+from apache_beam.options.pipeline_options import PipelineOptions
+
+ZONE_LOOKUP_PATH = "gs://apache-beam-samples/nyc_taxi/misc/taxi+_zone_lookup.csv"
+
+
+def run_aggregation_pipeline(pipeline_args, input_path, output_path):
+  # The pipeline will be run on exiting the with block.
+  with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
+    rides = p | read_csv(input_path)
+
+    # Count the number of passengers dropped off per LocationID
+    agg = rides.groupby('DOLocationID').passenger_count.sum()
+    agg.to_csv(output_path)
+
+
+def run_enrich_pipeline(
+    pipeline_args, input_path, output_path, zone_lookup_path):
+  """Enrich taxi ride data with zone lookup table and perform a grouped
+  aggregation."""
+  # The pipeline will be run on exiting the with block.
+  with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
+    rides = p | "Read taxi rides" >> read_csv(input_path)
+    zones = p | "Read zone lookup" >> read_csv(zone_lookup_path)
+
+    # Enrich taxi ride data with boroughs from zone lookup table
+    # Joins on zones.LocationID and rides.DOLocationID, by first making the
+    # former the index for zones.
+    rides = rides.merge(
+        zones.set_index('LocationID').Borough,
+        right_index=True,
+        left_on='DOLocationID',
+        how='left')
+
+    # Sum passengers dropped off per Borough
+    agg = rides.groupby('Borough').passenger_count.sum()
+    agg.to_csv(output_path)
+
+    # A more intuitive alternative to the above merge call, but this option
+    # doesn't preserve index, thus requires non-parallel execution.
+    #rides = rides.merge(zones[['LocationID','Borough']],
+    #                    how="left",
+    #                    left_on='DOLocationID',
+    #                    right_on='LocationID')
+
+
+def run(argv=None):
+  """Main entry point."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input',
+      dest='input',
+      default='gs://apache-beam-samples/nyc_taxi/misc/sample.csv',
+      help='Input file to process.')
+  parser.add_argument(
+      '--output',
+      dest='output',
+      required=True,
+      help='Output file to write results to.')
+  parser.add_argument(
+      '--zone_lookup',
+      dest='zone_lookup_path',
+      default=ZONE_LOOKUP_PATH,
+      help='Location for taxi zone lookup CSV.')
+  parser.add_argument('--pipeline', dest='pipeline', default='location_id_agg')
Review comment:
Please also put the different values pipeline can take in the help message.
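For illustration, one way this request could be addressed is with argparse's `choices` argument plus a help string that lists the accepted values. This is only a sketch, not the change that was made in the PR: the quoted diff shows just the default value `location_id_agg`, so the second name `borough_enrich` and the helper `build_parser` are assumptions made up for this example.

```python
import argparse

# Only 'location_id_agg' appears in the quoted diff (as the default value).
# 'borough_enrich' is a hypothetical name for the join-based pipeline.
PIPELINES = ('location_id_agg', 'borough_enrich')


def build_parser():
  """Builds a parser whose --pipeline flag documents its accepted values."""
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--pipeline',
      dest='pipeline',
      default='location_id_agg',
      choices=PIPELINES,
      help='Pipeline to run. One of: {}. Default: location_id_agg.'.format(
          ', '.join(PIPELINES)))
  return parser
```

With `choices` set, argparse rejects any other value with a usage error, and `--help` shows the allowed names, so the help text and the validation cannot drift apart.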
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9e10eeb) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **increase** coverage by `0.08%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
+ Coverage   83.40%   83.48%   +0.08%
==========================================
  Files         469      464       -5
  Lines       58727    58965     +238
==========================================
+ Hits        48981    49229     +248
+ Misses       9746     9736      -10
```
[GitHub] [beam] robertwb commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
robertwb commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-816853458
Sounds good.
On Fri, Apr 9, 2021 at 10:37 AM Brian Hulette ***@***.***> wrote:
> Ok I added wordcount_dataframe back as an alias for dataframe.wordcount
> in 3fb8fa2
> <https://github.com/apache/beam/commit/3fb8fa24b276159b093257a108d9305b4be996a1>,
> LMK if that's acceptable @robertwb <https://github.com/robertwb>
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (9248ea8) into [master](https://codecov.io/gh/apache/beam/commit/ed3df93e747ddc271db5186faf2e05af0b57de1d?el=desc) (ed3df93) will **decrease** coverage by `0.12%`.
> The diff coverage is `n/a`.
> :exclamation: Current head 9248ea8 differs from pull request most recent head 3fb8fa2. Consider uploading reports for the commit 3fb8fa2 to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@            Coverage Diff             @@
##           master   #14308      +/-   ##
==========================================
- Coverage   83.50%   83.37%   -0.13%
==========================================
  Files         447      470      +23
  Lines       58904    58766     -138
==========================================
- Hits        49187    48999     -188
- Misses       9717     9767      +50
```
[GitHub] [beam] TheNeuralBit commented on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-814267250
R: @rohdesamuel
Do you have time to review this?
[GitHub] [beam] TheNeuralBit commented on a change in pull request #14308: [BEAM-12024] Move wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #14308:
URL: https://github.com/apache/beam/pull/14308#discussion_r599666280
##########
File path: sdks/python/apache_beam/examples/dataframe/taxiride_test.py
##########
@@ -0,0 +1,127 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Test for the taxiride example."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import collections
+import logging
+import os
+import re
+import tempfile
+import unittest
+
+import pandas as pd
+
+from apache_beam.examples.dataframe import taxiride
+from apache_beam.testing.util import open_shards
+
+
+class TaxiRideExampleTest(unittest.TestCase):
+
+ # First 10 lines from gs://apache-beam-samples/nyc_taxi/misc/sample.csv
+ SAMPLE_RIDES = """VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,RatecodeID,store_and_fwd_flag,PULocationID,DOLocationID,payment_type,fare_amount,extra,mta_tax,tip_amount,tolls_amount,improvement_surcharge,total_amount,congestion_surcharge
+ 1,2019-01-01 00:46:40,2019-01-01 00:53:20,1,1.50,1,N,151,239,1,7,0.5,0.5,1.65,0,0.3,9.95,
+ 1,2019-01-01 00:59:47,2019-01-01 01:18:59,1,2.60,1,N,239,246,1,14,0.5,0.5,1,0,0.3,16.3,
+ 2,2018-12-21 13:48:30,2018-12-21 13:52:40,3,.00,1,N,236,236,1,4.5,0.5,0.5,0,0,0.3,5.8,
+ 2,2018-11-28 15:52:25,2018-11-28 15:55:45,5,.00,1,N,193,193,2,3.5,0.5,0.5,0,0,0.3,7.55,
+ 2,2018-11-28 15:56:57,2018-11-28 15:58:33,5,.00,2,N,193,193,2,52,0,0.5,0,0,0.3,55.55,
+ 2,2018-11-28 16:25:49,2018-11-28 16:28:26,5,.00,1,N,193,193,2,3.5,0.5,0.5,0,5.76,0.3,13.31,
+ 2,2018-11-28 16:29:37,2018-11-28 16:33:43,5,.00,2,N,193,193,2,52,0,0.5,0,0,0.3,55.55,
+ 1,2019-01-01 00:21:28,2019-01-01 00:28:37,1,1.30,1,N,163,229,1,6.5,0.5,0.5,1.25,0,0.3,9.05,
+ 1,2019-01-01 00:32:01,2019-01-01 00:45:39,1,3.70,1,N,229,7,1,13.5,0.5,0.5,3.7,0,0.3,18.5
+ """
+
+ SAMPLE_ZONE_LOOKUP = """"LocationID","Borough","Zone","service_zone"
+ 7,"Queens","Astoria","Boro Zone"
+ 193,"Queens","Queensbridge/Ravenswood","Boro Zone"
+ 229,"Manhattan","Sutton Place/Turtle Bay North","Yellow Zone"
+ 236,"Manhattan","Upper East Side North","Yellow Zone"
+ 239,"Manhattan","Upper West Side South","Yellow Zone"
+ 246,"Manhattan","West Chelsea/Hudson Yards","Yellow Zone"
+ """
+
+ def setUp(self):
+ self.tmpdir = tempfile.TemporaryDirectory()
+ self.input_path = os.path.join(self.tmpdir.name, 'rides.csv')
+ self.lookup_path = os.path.join(self.tmpdir.name, 'lookup.csv')
+ self.output_path = os.path.join(self.tmpdir.name, 'output.csv')
+
+ with open(self.input_path, 'w') as fp:
+ fp.write(self.SAMPLE_RIDES)
+
+ with open(self.lookup_path, 'w') as fp:
+ fp.write(self.SAMPLE_ZONE_LOOKUP)
+
+ def tearDown(self):
+ self.tmpdir.cleanup()
+
+ def test_aggregation(self):
+ # Compute expected result
+ rides = pd.read_csv(self.input_path)
+ expected_counts = rides.groupby('DOLocationID').passenger_count.sum()
+
+ taxiride.run_aggregation_pipeline([], self.input_path, self.output_path)
+
+ # Parse result file and compare.
+ # TODO(BEAM-XXXX): taxiride examples should produce int sums, not floats
Review comment:
I will file a Jira issue and update these TODOs before merging.
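For readers following the thread, here is a rough standard-library sketch of what `test_aggregation` compares: the sum of `passenger_count` per `DOLocationID`, checked against output that may contain float sums. It is not the PR's code. The reduced two-column CSV is copied from the sample rides above, while the helper names and the `key,value` output layout are assumptions made for illustration only.

```python
import csv
import io
from collections import defaultdict

# Reduced copy of the sample rides: only the two columns the aggregation uses.
SAMPLE = """DOLocationID,passenger_count
239,1
246,1
236,3
193,5
193,5
193,5
193,5
229,1
7,1
"""


def expected_passenger_counts(csv_text):
    """Sum passenger_count per DOLocationID, mirroring a groupby-sum."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row["DOLocationID"])] += int(row["passenger_count"])
    return dict(totals)


def parse_output(lines):
    """Parse 'DOLocationID,sum' lines, tolerating float sums such as '20.0'.

    The line layout is hypothetical; the real output file may differ.
    """
    result = {}
    for line in lines:
        key, value = line.strip().split(",")
        result[int(key)] = int(float(value))
    return result


expected = expected_passenger_counts(SAMPLE)
# Hypothetical pipeline output with float sums, as described by the TODO.
actual = parse_output(
    ["7,1.0", "193,20.0", "229,1.0", "236,3.0", "239,1.0", "246,1.0"])
assert actual == expected
```

Converting through `float` before `int` keeps the comparison working whether the pipeline writes `20` or `20.0`, so a check like this would not need to change once the sums become integers.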
[GitHub] [beam] codecov[bot] edited a comment on pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #14308:
URL: https://github.com/apache/beam/pull/14308#issuecomment-805990068
# [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=h1) Report
> Merging [#14308](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=desc) (5ca8939) into [master](https://codecov.io/gh/apache/beam/commit/243128a8fc52798e1b58b0cf1a271d95ee7aa241?el=desc) (243128a) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.
> :exclamation: Current head 5ca8939 differs from pull request most recent head 9e10eeb. Consider uploading reports for the commit 9e10eeb to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/14308/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #14308 +/- ##
==========================================
- Coverage 83.40% 83.38% -0.02%
==========================================
Files 469 470 +1
Lines 58727 58766 +39
==========================================
+ Hits 48981 49002 +21
- Misses 9746 9764 +18
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...s/snippets/transforms/aggregation/combinevalues.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5ldmFsdWVzLnB5) | | |
| [...ld/srcs/sdks/python/apache\_beam/utils/histogram.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaGlzdG9ncmFtLnB5) | | |
| [...on/apache\_beam/runners/portability/flink\_runner.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbGlua19ydW5uZXIucHk=) | | |
| [...s/python/apache\_beam/io/aws/clients/s3/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvX19pbml0X18ucHk=) | | |
| [...ers/dataflow/internal/clients/dataflow/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9jbGllbnRzL2RhdGFmbG93L19faW5pdF9fLnB5) | | |
| [...e\_beam/portability/api/beam\_runner\_api\_pb2\_urns.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjJfdXJucy5weQ==) | | |
| [.../python/apache\_beam/testing/benchmarks/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL19faW5pdF9fLnB5) | | |
| [...dks/python/apache\_beam/transforms/external\_java.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9leHRlcm5hbF9qYXZhLnB5) | | |
| [...ython/apache\_beam/io/gcp/tests/bigquery\_matcher.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3Rlc3RzL2JpZ3F1ZXJ5X21hdGNoZXIucHk=) | | |
| [...python/apache\_beam/examples/wordcount\_debugging.py](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree#diff-YmVhbV9QcmVDb21taXRfUHl0aG9uX0Nyb24vc3JjL3Nka3MvcHl0aG9uL3Rlc3Qtc3VpdGVzL3RveC9weTM4L2J1aWxkL3NyY3Mvc2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RlYnVnZ2luZy5weQ==) | | |
| ... and [929 more](https://codecov.io/gh/apache/beam/pull/14308/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=footer). Last update [377f4b2...9e10eeb](https://codecov.io/gh/apache/beam/pull/14308?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [beam] rohdesamuel commented on a change in pull request #14308: [BEAM-12024] Move examples.wordcount_dataframe to examples.dataframe, add taxiride example
Posted by GitBox <gi...@apache.org>.
rohdesamuel commented on a change in pull request #14308:
URL: https://github.com/apache/beam/pull/14308#discussion_r608892895
##########
File path: sdks/python/apache_beam/examples/dataframe/taxiride.py
##########
@@ -0,0 +1,115 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Pipelines that use the DataFrame API to process NYC taxiride CSV data."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import argparse
+import logging
+
+import apache_beam as beam
+from apache_beam.dataframe.convert import to_dataframe
+from apache_beam.dataframe.convert import to_pcollection
+from apache_beam.dataframe.io import read_csv
+from apache_beam.io import ReadFromText
+from apache_beam.options.pipeline_options import PipelineOptions
+
+ZONE_LOOKUP_PATH = "gs://apache-beam-samples/nyc_taxi/misc/taxi+_zone_lookup.csv"
+
+
+def run_aggregation_pipeline(pipeline_args, input_path, output_path):
+ # The pipeline will be run on exiting the with block.
+ with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
+ rides = p | read_csv(input_path)
Review comment:
For my own edification, does `read_csv` expand and return a Beam DataFrame?