Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/04/01 02:54:00 UTC

[jira] [Created] (SPARK-34929) MapStatusesSerDeserBenchmark causes a side effect to other benchmarks with tasks being too big (JDK 11)

Hyukjin Kwon created SPARK-34929:
------------------------------------

             Summary: MapStatusesSerDeserBenchmark causes a side effect to other benchmarks with tasks being too big (JDK 11)
                 Key: SPARK-34929
                 URL: https://issues.apache.org/jira/browse/SPARK-34929
             Project: Spark
          Issue Type: Bug
          Components: SQL, Tests
    Affects Versions: 3.2.0
            Reporter: Hyukjin Kwon


On JDK 11, MapStatusesSerDeserBenchmark (which fails to start) seems to affect other benchmark cases by growing the size of the task binary:

```
2021-03-31T16:46:43.1179145Z 21/03/31 16:46:43 WARN DAGScheduler: Broadcasting large task binary with size 42.2 MiB
2021-03-31T16:46:47.3079315Z 21/03/31 16:46:47 WARN DAGScheduler: Broadcasting large task binary with size 42.2 MiB
2021-03-31T16:46:51.5920733Z 21/03/31 16:46:51 WARN DAGScheduler: Broadcasting large task binary with size 42.2 MiB
2021-03-31T16:46:55.9175194Z 21/03/31 16:46:55 WARN DAGScheduler: Broadcasting large task binary with size 42.2 MiB
2021-03-31T16:46:57.6874541Z   Stopped after 3 iterations, 12928 ms
2021-03-31T16:46:57.6875644Z 
2021-03-31T16:46:57.6877153Z OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1041-azure
2021-03-31T16:46:57.7095280Z Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
2021-03-31T16:46:57.7097654Z from_json as subExpr in Project:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
2021-03-31T16:46:57.7099059Z ------------------------------------------------------------------------------------------------------------------------
2021-03-31T16:46:57.7100274Z subExprElimination false, codegen: true           38880          41246        1389          0.0   388800445.2       1.0X
2021-03-31T16:46:57.7101134Z subExprElimination false, codegen: false          35819          38141        1234          0.0   358188088.6       1.1X
2021-03-31T16:46:57.7106264Z subExprElimination true, codegen: true             3947           4157         364          0.0    39465629.1       9.9X
2021-03-31T16:46:57.7106982Z subExprElimination true, codegen: false            4191           4309         112          0.0    41908945.5       9.3X
2021-03-31T16:46:57.7107595Z 
2021-03-31T16:46:57.7135178Z Preparing data for benchmarking ...
2021-03-31T16:46:58.5630584Z Running benchmark: from_json as subExpr in Filter
2021-03-31T16:46:58.5633083Z   Running case: subExprElimination false, codegen: true
2021-03-31T16:48:25.5619312Z 21/03/31 16:48:25 WARN DAGScheduler: Broadcasting large task binary with size 43.0 MiB
```

This only happens when the benchmarks are run sequentially via Benchmarks.scala.
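
To check whether the task binary keeps growing across benchmark cases, the sizes can be pulled out of a CI log. The following is only a minimal sketch, not part of Spark: the helper name `task_binary_sizes_mib` is hypothetical, and the set of unit suffixes is an assumption based on the `MiB` seen in the warnings quoted above.

```python
import re

# Matches the DAGScheduler warning quoted in the log above and captures
# the numeric size and its unit. The KiB/GiB alternatives are an assumption;
# only MiB appears in the quoted log.
PATTERN = re.compile(
    r"Broadcasting large task binary with size ([\d.]+) (KiB|MiB|GiB)"
)

# Conversion factors to MiB for each assumed unit suffix.
_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def task_binary_sizes_mib(lines):
    """Return the task binary sizes (in MiB) in the order they were logged."""
    sizes = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            sizes.append(float(match.group(1)) * _TO_MIB[match.group(2)])
    return sizes


# Sample lines copied from the log in this report.
sample = [
    "21/03/31 16:46:55 WARN DAGScheduler: Broadcasting large task binary with size 42.2 MiB",
    "  Stopped after 3 iterations, 12928 ms",
    "21/03/31 16:48:25 WARN DAGScheduler: Broadcasting large task binary with size 43.0 MiB",
]
sizes = task_binary_sizes_mib(sample)
print(sizes)
```

Comparing the first and last values in the returned list for a full log would show how much the task binary grew between benchmark cases.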


