Posted to issues@beam.apache.org by "Kasia Kucharczyk (JIRA)" <ji...@apache.org> on 2018/11/22 09:32:00 UTC

[jira] [Created] (BEAM-6115) SyntheticSource parameter of bundle size sometimes is casted to invalid type

Kasia Kucharczyk created BEAM-6115:
--------------------------------------

             Summary: SyntheticSource parameter of bundle size sometimes is casted to invalid type
                 Key: BEAM-6115
                 URL: https://issues.apache.org/jira/browse/BEAM-6115
             Project: Beam
          Issue Type: Bug
          Components: testing
            Reporter: Kasia Kucharczyk
            Assignee: Kasia Kucharczyk


In specific situations, the parameter {code}bundle_size_in_elements{code} in the Python SyntheticSource becomes a `float` instead of an `int`, which causes a failure on Dataflow:
{code:python}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 642, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 198, in execute
    self._split_task)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 206, in _perform_source_split_considering_api_limits
    desired_bundle_size)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 243, in _perform_source_split
    for split in source.split(desired_bundle_size):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/testing/synthetic_pipeline.py", line 222, in split
    bundle_size_in_elements):
TypeError: range() integer step argument expected, got float.
{code}
 
Debugging showed that on Dataflow the following expression (lines 213-214) causes the problem, because the division returns a float:
{code:python}max(1, self._num_records / self._initial_splitting_num_bundles){code}
Line 218 contains:
{code:python}math.floor(math.sqrt(self._num_records)){code}
which also returns a float (in Python 2, math.floor returns a float).

On line 222, _bundle_size_in_elements_ is passed as the step argument to _range_, which requires an _int_.
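
A minimal sketch of one possible fix is to coerce the computed value to an int before it reaches range. The helper name to_bundle_size and the sample values below are assumptions made for illustration; they are not taken from synthetic_pipeline.py, and the float operand is chosen only to force a float result from the division.

```python
import math


def to_bundle_size(value):
    """Hypothetical helper: coerce a computed bundle size to an int >= 1."""
    return max(1, int(value))


num_records = 10
initial_splitting_num_bundles = 4.0  # float operand, chosen to force a float result

# A float step is rejected by range() on both Python 2 and Python 3.
float_step = max(1, num_records / initial_splitting_num_bundles)  # 2.5
try:
    list(range(0, num_records, float_step))
    float_step_rejected = False
except TypeError:
    float_step_rejected = True

# Both expressions from the report, wrapped so that they always yield an int.
size_from_division = to_bundle_size(num_records / initial_splitting_num_bundles)
size_from_sqrt = to_bundle_size(math.floor(math.sqrt(num_records)))

# With an int step, range() works as the split logic expects.
bundle_starts = list(range(0, num_records, size_from_division))
```

With these inputs the division yields 2.5, the helper returns 2, and the bundles start at 0, 2, 4, 6 and 8.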


