Posted to commits@doris.apache.org by "gnehil (via GitHub)" <gi...@apache.org> on 2024/01/25 07:51:36 UTC
[PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
gnehil opened a new pull request, #30368:
URL: https://github.com/apache/doris/pull/30368
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
## Further comments
If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003060751
PR approved by anyone and no changes requested.
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996795170
run buildall
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996878580
run buildall
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1998826454
Could you provide more details on the version requirements? Are there any specific requirements for the Spark version on the user service end?
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996919713
<details>
<summary>TPC-H: <b>Total hot run time: 38360 ms</b></summary>
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7ad6689c8b6b4eb5c12094982deb72fb7f80c4c2, data reload: false
------ Round 1 ----------------------------------
q1 17699 5572 4152 4152
q2 2039 153 149 149
q3 10752 1051 903 903
q4 7878 780 724 724
q5 7467 2577 2623 2577
q6 184 125 123 123
q7 1231 830 793 793
q8 9349 2034 2032 2032
q9 7097 6430 6422 6422
q10 8491 3499 3635 3499
q11 426 218 221 218
q12 573 310 305 305
q13 17806 2815 2874 2815
q14 273 252 245 245
q15 505 460 465 460
q16 488 413 389 389
q17 947 597 519 519
q18 7231 6553 6437 6437
q19 2828 1494 1421 1421
q20 566 285 281 281
q21 6221 3591 3656 3591
q22 363 305 307 305
Total cold run time: 110414 ms
Total hot run time: 38360 ms
----- Round 2, with runtime_filter_mode=off -----
q1 4144 4119 4061 4061
q2 316 224 229 224
q3 2928 2826 2842 2826
q4 1893 1581 1602 1581
q5 5214 5246 5275 5246
q6 195 116 118 116
q7 2262 1817 1874 1817
q8 3154 3268 3283 3268
q9 8616 8564 8538 8538
q10 3753 3668 3688 3668
q11 535 455 439 439
q12 718 544 539 539
q13 16915 2870 2857 2857
q14 297 251 242 242
q15 481 452 454 452
q16 450 437 413 413
q17 1734 1463 1487 1463
q18 7531 7266 7167 7167
q19 1617 1542 1497 1497
q20 1897 1702 1704 1702
q21 4705 4746 4628 4628
q22 533 452 463 452
Total cold run time: 69888 ms
Total hot run time: 53196 ms
```
</details>
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003072049
Please try to add regression tests for Spark 3; this can be done in the next PR.
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs merged PR #30368:
URL: https://github.com/apache/doris/pull/30368
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003060718
PR approved by at least one committer and no changes requested.
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997626924
<details>
<summary>TPC-H: <b>Total hot run time: 38466 ms</b></summary>
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f3fb59651850ac7accccd2356c4cb5c082afcc87, data reload: false
------ Round 1 ----------------------------------
q1 17684 4889 4156 4156
q2 2027 157 167 157
q3 10692 1077 899 899
q4 6744 732 737 732
q5 7474 2580 2725 2580
q6 182 123 122 122
q7 1180 829 817 817
q8 9346 1984 1997 1984
q9 7157 6457 6450 6450
q10 8488 3476 3606 3476
q11 430 231 222 222
q12 667 310 302 302
q13 17796 2868 2893 2868
q14 273 245 251 245
q15 491 452 466 452
q16 493 405 394 394
q17 943 532 554 532
q18 7217 6616 6509 6509
q19 1559 1403 1388 1388
q20 545 296 279 279
q21 6539 3589 3638 3589
q22 381 313 314 313
Total cold run time: 108308 ms
Total hot run time: 38466 ms
----- Round 2, with runtime_filter_mode=off -----
q1 4201 4105 4148 4105
q2 322 229 231 229
q3 2934 2888 2800 2800
q4 1890 1607 1615 1607
q5 5238 5280 5264 5264
q6 209 114 119 114
q7 2235 1834 1846 1834
q8 3149 3302 3276 3276
q9 8573 8565 8600 8565
q10 3715 3696 3680 3680
q11 536 430 437 430
q12 712 553 571 553
q13 16924 2901 2852 2852
q14 274 246 250 246
q15 471 457 448 448
q16 457 419 399 399
q17 1775 1475 1454 1454
q18 7491 7181 7110 7110
q19 1616 1503 1485 1485
q20 1876 1681 1689 1681
q21 4715 4676 4696 4676
q22 517 433 438 433
Total cold run time: 69830 ms
Total hot run time: 53241 ms
```
</details>
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003047727
> Could you provide more details on the version requirements? Are there any specific requirements for the Spark version on the user service end?
In the previous code, when writing the Parquet file, each Row object was converted to an InternalRow object by calling the toRow method of RowEncoder. Spark 2 and Spark 3 implement this part of the RowEncoder class differently, so Spark Load could only run in a Spark 2 environment.
The current change instead uses the apply method of InternalRow to initialize a new InternalRow object from the value array in the Row object. This method is implemented the same way in Spark 2 and Spark 3, so Spark Load can run normally in both Spark environments.
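As a rough illustration of the approach described above (a minimal sketch, not the code from this PR), the conversion could look like the following in Scala. The object name `RowConversionSketch` and the helper `toInternalRow` are hypothetical; only `Row.toSeq` and `InternalRow.apply` are actual Spark APIs, and the sketch assumes Spark is on the classpath.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow

object RowConversionSketch {
  // Hypothetical helper (not from the PR): wraps the Row's values in a new
  // InternalRow without going through RowEncoder.
  // No type conversion happens here, so the values are assumed to already be
  // in the form Catalyst expects (for example UTF8String rather than String).
  def toInternalRow(row: Row): InternalRow =
    InternalRow.apply(row.toSeq: _*)
}
```

Because this path skips the encoder, any conversion from external types to Catalyst's internal types would have to be handled separately by the caller.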
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997033975
<details>
<summary>TPC-H: <b>Total hot run time: 38446 ms</b></summary>
```
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 23b7c0b0ab58d04d4b294206d2b53cf027d4b6fd, data reload: false
------ Round 1 ----------------------------------
q1 17644 4201 4125 4125
q2 2029 150 143 143
q3 10672 1087 903 903
q4 7078 755 731 731
q5 7486 2771 2774 2771
q6 196 123 125 123
q7 1159 835 806 806
q8 9403 2031 1971 1971
q9 7056 6470 6376 6376
q10 8551 3477 3676 3477
q11 423 229 223 223
q12 676 304 296 296
q13 17789 2853 2847 2847
q14 276 250 264 250
q15 496 463 455 455
q16 498 400 398 398
q17 964 519 603 519
q18 7255 6505 6465 6465
q19 1529 1477 1494 1477
q20 540 283 275 275
q21 6358 3515 3561 3515
q22 366 300 307 300
Total cold run time: 108444 ms
Total hot run time: 38446 ms
----- Round 2, with runtime_filter_mode=off -----
q1 4099 4079 4123 4079
q2 320 221 219 219
q3 2982 2824 2849 2824
q4 1892 1568 1596 1568
q5 5203 5244 5248 5244
q6 196 115 125 115
q7 2239 1858 1862 1858
q8 3158 3264 3257 3257
q9 8574 8522 8553 8522
q10 3696 3703 3693 3693
q11 538 435 462 435
q12 708 567 532 532
q13 16910 2843 2884 2843
q14 277 244 254 244
q15 490 446 449 446
q16 482 413 419 413
q17 1729 1479 1464 1464
q18 7485 7298 7038 7038
q19 1631 1473 1548 1473
q20 1901 1722 1715 1715
q21 4887 4704 4769 4704
q22 541 452 467 452
Total cold run time: 69938 ms
Total hot run time: 53138 ms
```
</details>
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997586878
run buildall
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2002831349
run buildall
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996795942
run buildall
Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]
Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997001115
run buildall