Posted to issues@beam.apache.org by "Kamil Wasilewski (Jira)" <ji...@apache.org> on 2020/01/20 10:58:00 UTC
[jira] [Updated] (BEAM-9154) Move Chicago Taxi Example to Python 3
[ https://issues.apache.org/jira/browse/BEAM-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kamil Wasilewski updated BEAM-9154:
-----------------------------------
Description:
The Chicago Taxi Example[1] should be moved to the latest version of Python supported by Beam (currently it's Python 3.7).
At the moment, the following error occurs when running the benchmark on Python 3.7 (requires further investigation):
{code:java}
Traceback (most recent call last):
  File "preprocess.py", line 259, in <module>
    main()
  File "preprocess.py", line 254, in main
    project=known_args.metric_reporting_project
  File "preprocess.py", line 155, in transform_data
    ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn)))
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 987, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 547, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 532, in apply
    return self.apply(transform, pvalueish)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 573, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
    return m(transform, input, options)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 223, in apply_PTransform
    return transform.expand(input)
  File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 825, in expand
    input_metadata))
  File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 716, in expand
    output_signature = self._preprocessing_fn(copied_inputs)
  File "preprocess.py", line 102, in preprocessing_fn
    _fill_in_missing(inputs[key]),
KeyError: 'company'
{code}
[1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi
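One way to narrow down the KeyError above might be to check which expected feature keys are absent from the dict passed to preprocessing_fn before it is indexed. The following is only a minimal sketch and not the actual preprocess.py code: find_missing_keys, expected_keys and the sample inputs are hypothetical names and data introduced for illustration, and plain Python lists stand in for the real tensors.

```python
# Hypothetical diagnostic sketch; nothing here is taken from preprocess.py.
def find_missing_keys(inputs, expected_keys):
    """Return the expected feature keys that are absent from `inputs`."""
    return [key for key in expected_keys if key not in inputs]


# Assumed example data: 'company' is expected but not present in the inputs,
# which is the situation that would make inputs['company'] raise KeyError.
expected_keys = ['company', 'fare', 'trip_miles']
inputs = {'fare': [12.5], 'trip_miles': [3.1]}

missing = find_missing_keys(inputs, expected_keys)
if missing:
    print('Missing feature keys:', missing)
```

Logging the missing keys at the top of the preprocessing function, under these assumptions, would show whether 'company' is the only absent feature or one of several.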
was:
The Chicago Taxi Example[1] should be moved to the latest version of Python supported by Beam (currently it's Python 3.7). The benchmark should run both on Dataflow and Flink.
At the moment, the following error occurs when running the benchmark (requires further investigation):
{code:java}
Traceback (most recent call last):
  File "preprocess.py", line 259, in <module>
    main()
  File "preprocess.py", line 254, in main
    project=known_args.metric_reporting_project
  File "preprocess.py", line 155, in transform_data
    ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn)))
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 987, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 547, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 532, in apply
    return self.apply(transform, pvalueish)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 573, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
    return m(transform, input, options)
  File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 223, in apply_PTransform
    return transform.expand(input)
  File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 825, in expand
    input_metadata))
  File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 716, in expand
    output_signature = self._preprocessing_fn(copied_inputs)
  File "preprocess.py", line 102, in preprocessing_fn
    _fill_in_missing(inputs[key]),
KeyError: 'company'
{code}
[1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi
> Move Chicago Taxi Example to Python 3
> -------------------------------------
>
> Key: BEAM-9154
> URL: https://issues.apache.org/jira/browse/BEAM-9154
> Project: Beam
> Issue Type: Improvement
> Components: testing
> Reporter: Kamil Wasilewski
> Assignee: Kamil Wasilewski
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)