Posted to github@beam.apache.org by "akashorabek (via GitHub)" <gi...@apache.org> on 2024/04/02 08:16:33 UTC

Re: [PR] Add SpannerIO Stress test [beam]

akashorabek commented on PR #30800:
URL: https://github.com/apache/beam/pull/30800#issuecomment-2031360719

   > Looks like it has only 50-100 MB/s even for 20 workers,
   > 
   > and there are errors in log:
   > 
   > ```
   > DEADLINE_EXCEEDED writing batch of 25 mutations to Cloud Spanner, retrying after backoff of 9137ms
   > (DEADLINE_EXCEEDED: com.google.api.gax.rpc.DeadlineExceededException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline Exceeded)
   > ```
   > 
   > Looks like the write is throttled on the Spanner side. If we double the number of units of the Spanner instance, would the throughput show a difference?
   > 
   > The stress test is testing the capability of the Beam IO, so we want the Spanner side to have sufficient capacity.
   > 
   > Also need to run spotlessApply to clear the PreCommit failure.
   
   Yeah, you were right about throttling on the Spanner side. I tested different numbers of Spanner instance nodes, and here are the results:
   
   - [5 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_01_31_36-2551455892432405596;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;bottomTab=WORKER_LOGS;bottomStepTab=DATA_SAMPLING;logsSeverity=INFO;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Average throughput is 250-400 MB/s
   - [20 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_23_29_28-17873229330511088299;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Throughput is 500-750 MB/s
   - [30 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-01_12_21_23-12244802819720839592;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Throughput is 600-900 MB/s
   - [40 nodes.](https://console.cloud.google.com/dataflow/jobs/us-central1/2024-04-02_00_15_40-17710127828952119517;step=Write%20to%20Spanner;mainTab=JOB_GRAPH;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))) Basically the same as with 30 nodes.
   
   Since there is not much difference in performance beyond 30 nodes, I decided to use that number.
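   For what it's worth, the diminishing returns are easy to see if you take the midpoints of the reported throughput ranges (the midpoints below are my own rough reading of the numbers above, not separate measurements):

```python
# Midpoints of the reported throughput ranges (MB/s), keyed by Spanner node count.
results = {5: (250 + 400) / 2, 20: (500 + 750) / 2, 30: (600 + 900) / 2, 40: (600 + 900) / 2}

# Marginal throughput gained per node added between consecutive test runs.
pairs = list(results.items())
for (n0, t0), (n1, t1) in zip(pairs, pairs[1:]):
    marginal = (t1 - t0) / (n1 - n0)
    print(f"{n0} -> {n1} nodes: +{marginal:.1f} MB/s per added node")
# 5 -> 20 nodes: +20.0, 20 -> 30 nodes: +12.5, 30 -> 40 nodes: +0.0
```

   The marginal gain per node drops from ~20 MB/s to zero between 30 and 40 nodes, which supports settling on 30.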
   Also fixed the DEADLINE_EXCEEDED warnings. 
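
   As context for readers, the "retrying after backoff of 9137ms" pattern in those warnings is a standard exponential backoff with jitter. A generic sketch of that retry schedule is below; this is illustrative only, not SpannerIO's actual implementation, and the constants are made up:

```python
import random

def backoff_delays(base_ms=1000, multiplier=1.5, max_ms=60000, attempts=5, jitter=0.25):
    """Yield randomized exponential backoff delays in milliseconds."""
    delay = base_ms
    for _ in range(attempts):
        # Apply +/- jitter so retries from many workers do not synchronize.
        yield delay * (1 + random.uniform(-jitter, jitter))
        delay = min(delay * multiplier, max_ms)

for i, d in enumerate(backoff_delays(), start=1):
    print(f"retry {i} after backoff of {d:.0f}ms")
```

   Each delay grows by the multiplier up to a cap, with randomization so a fleet of workers hitting the same deadline does not retry in lockstep.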


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org