Posted to github@beam.apache.org by "Abacn (via GitHub)" <gi...@apache.org> on 2023/01/28 01:56:28 UTC

[GitHub] [beam] Abacn opened a new issue, #25211: [Task]: Basic bundle strategy for direct runner

Abacn opened a new issue, #25211:
URL: https://github.com/apache/beam/issues/25211

   ### What needs to happen?
   
   Elements arriving at the Java direct runner are not bundled, i.e., every bundle has size 1. Some flakiness in our unit tests is likely due to this. For example, BigQueryIOWriteTest was found to be flaky because of intermittent timeouts (context: #25207); the issue was traced to the fact that some batch load and file load tests concurrently write thousands of files locally.
   
   e.g. https://ci-beam.apache.org/job/beam_PreCommit_Java_GCP_IO_Direct_Phrase/57/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryIOWriteTest/testWriteDynamicDestinationsBatchWithSchemas_0_/
   
   The complete log (from a local run) shows 2,000 org.apache.beam.sdk.io.gcp.bigquery.BigQueryRowWriter <init> log lines. Since the number of files that can be open concurrently is limited, the test sometimes gets stuck.
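
To illustrate why bundle size matters here, the following is a minimal sketch in plain Java, not direct runner code. It assumes, for illustration only, that one writer (and therefore one open file) is created per bundle. The class `BundleSketch`, the method `bundle`, and the element counts are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Beam code. Models grouping elements into bundles,
// under the ASSUMPTION that each bundle opens exactly one writer/file.
class BundleSketch {

    // Splits elements into consecutive bundles of at most bundleSize elements.
    static <T> List<List<T>> bundle(List<T> elements, int bundleSize) {
        if (bundleSize <= 0) {
            throw new IllegalArgumentException("bundleSize must be positive");
        }
        List<List<T>> bundles = new ArrayList<>();
        for (int i = 0; i < elements.size(); i += bundleSize) {
            int end = Math.min(i + bundleSize, elements.size());
            // Copy the sublist so each bundle is independent of the input list.
            bundles.add(new ArrayList<>(elements.subList(i, end)));
        }
        return bundles;
    }

    public static void main(String[] args) {
        List<Integer> elements = new ArrayList<>();
        for (int i = 0; i < 2000; i++) { // hypothetical element count
            elements.add(i);
        }
        // Bundle size 1: one writer per element.
        System.out.println(bundle(elements, 1).size());
        // Bundle size 100: far fewer writers for the same input.
        System.out.println(bundle(elements, 100).size());
    }
}
```

Under that assumption, the number of files opened scales with the number of bundles rather than the number of elements, which is the effect this issue is asking for.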
   
   ### Issue Priority
   
   Priority: 3 (nice-to-have improvement)
   
   ### Issue Components
   
   - [ ] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on issue #25211: [Task]: Improve bundle strategy for direct runner

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #25211:
URL: https://github.com/apache/beam/issues/25211#issuecomment-1407416392

   Well, actually the bundle size of 1 for `testWriteDynamicDestinationsBatchWithSchemas_0_` is mostly due to the test setting `.withMaxFileSize(10)`, which gives each row its own file. Leaving this open for further improvement (though low priority).
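
The effect of a very small maximum file size can be sketched with a simplified model in plain Java. This is not the BigQueryIO implementation. It assumes a writer that closes its current file once the bytes written reach the limit, and the class `FileRolloverSketch`, the method `countFiles`, the 50-byte row size, and the 64 KiB comparison limit are all hypothetical.

```java
import java.util.Arrays;

// Illustrative only: not Beam code. Models a writer that closes its current
// file as soon as the bytes written to it reach maxFileSize (an ASSUMED rule).
class FileRolloverSketch {

    // Returns how many files are produced for the given row sizes (in bytes).
    static int countFiles(int[] rowSizes, long maxFileSize) {
        int files = 0;
        long bytesInCurrentFile = 0;
        boolean fileOpen = false;
        for (int size : rowSizes) {
            if (!fileOpen) {
                // Open a new file for this row.
                files++;
                fileOpen = true;
                bytesInCurrentFile = 0;
            }
            bytesInCurrentFile += size;
            if (bytesInCurrentFile >= maxFileSize) {
                // Limit reached: the next row starts a new file.
                fileOpen = false;
            }
        }
        return files;
    }

    public static void main(String[] args) {
        int[] rows = new int[2000];
        Arrays.fill(rows, 50); // hypothetical: 2,000 rows of 50 bytes each
        // With a 10-byte limit every row exceeds it, so each row gets a file.
        System.out.println(countFiles(rows, 10));
        // With a hypothetical 64 KiB limit the same rows fit in a few files.
        System.out.println(countFiles(rows, 64L * 1024));
    }
}
```

In this model any row larger than the limit closes its file immediately, so a limit of 10 bytes yields one file per row regardless of how the elements are bundled.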

