Posted to commits@doris.apache.org by "gnehil (via GitHub)" <gi...@apache.org> on 2024/01/25 07:51:36 UTC

[PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

gnehil opened a new pull request, #30368:
URL: https://github.com/apache/doris/pull/30368

   ## Proposed changes
   
   Issue Number: close #xxx
   
   <!--Describe your changes.-->
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003060751

   PR approved by anyone and no changes requested.




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996795170

   run buildall




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996878580

   run buildall




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1998826454

   Could you provide more details on the version requirements? Are there any specific requirements for the Spark version on the user service end?




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996919713

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38360 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 7ad6689c8b6b4eb5c12094982deb72fb7f80c4c2, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17699	5572	4152	4152
   q2	2039	153	149	149
   q3	10752	1051	903	903
   q4	7878	780	724	724
   q5	7467	2577	2623	2577
   q6	184	125	123	123
   q7	1231	830	793	793
   q8	9349	2034	2032	2032
   q9	7097	6430	6422	6422
   q10	8491	3499	3635	3499
   q11	426	218	221	218
   q12	573	310	305	305
   q13	17806	2815	2874	2815
   q14	273	252	245	245
   q15	505	460	465	460
   q16	488	413	389	389
   q17	947	597	519	519
   q18	7231	6553	6437	6437
   q19	2828	1494	1421	1421
   q20	566	285	281	281
   q21	6221	3591	3656	3591
   q22	363	305	307	305
   Total cold run time: 110414 ms
   Total hot run time: 38360 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4144	4119	4061	4061
   q2	316	224	229	224
   q3	2928	2826	2842	2826
   q4	1893	1581	1602	1581
   q5	5214	5246	5275	5246
   q6	195	116	118	116
   q7	2262	1817	1874	1817
   q8	3154	3268	3283	3268
   q9	8616	8564	8538	8538
   q10	3753	3668	3688	3668
   q11	535	455	439	439
   q12	718	544	539	539
   q13	16915	2870	2857	2857
   q14	297	251	242	242
   q15	481	452	454	452
   q16	450	437	413	413
   q17	1734	1463	1487	1463
   q18	7531	7266	7167	7167
   q19	1617	1542	1497	1497
   q20	1897	1702	1704	1702
   q21	4705	4746	4628	4628
   q22	533	452	463	452
   Total cold run time: 69888 ms
   Total hot run time: 53196 ms
   ```
   </details>
   




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003072049

   Please try to add regression tests for Spark 3; this can be done in the next PR.




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "CalvinKirs (via GitHub)" <gi...@apache.org>.
CalvinKirs merged PR #30368:
URL: https://github.com/apache/doris/pull/30368




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003060718

   PR approved by at least one committer and no changes requested.




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997626924

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38466 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit f3fb59651850ac7accccd2356c4cb5c082afcc87, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17684	4889	4156	4156
   q2	2027	157	167	157
   q3	10692	1077	899	899
   q4	6744	732	737	732
   q5	7474	2580	2725	2580
   q6	182	123	122	122
   q7	1180	829	817	817
   q8	9346	1984	1997	1984
   q9	7157	6457	6450	6450
   q10	8488	3476	3606	3476
   q11	430	231	222	222
   q12	667	310	302	302
   q13	17796	2868	2893	2868
   q14	273	245	251	245
   q15	491	452	466	452
   q16	493	405	394	394
   q17	943	532	554	532
   q18	7217	6616	6509	6509
   q19	1559	1403	1388	1388
   q20	545	296	279	279
   q21	6539	3589	3638	3589
   q22	381	313	314	313
   Total cold run time: 108308 ms
   Total hot run time: 38466 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4201	4105	4148	4105
   q2	322	229	231	229
   q3	2934	2888	2800	2800
   q4	1890	1607	1615	1607
   q5	5238	5280	5264	5264
   q6	209	114	119	114
   q7	2235	1834	1846	1834
   q8	3149	3302	3276	3276
   q9	8573	8565	8600	8565
   q10	3715	3696	3680	3680
   q11	536	430	437	430
   q12	712	553	571	553
   q13	16924	2901	2852	2852
   q14	274	246	250	246
   q15	471	457	448	448
   q16	457	419	399	399
   q17	1775	1475	1454	1454
   q18	7491	7181	7110	7110
   q19	1616	1503	1485	1485
   q20	1876	1681	1689	1681
   q21	4715	4676	4696	4676
   q22	517	433	438	433
   Total cold run time: 69830 ms
   Total hot run time: 53241 ms
   ```
   </details>
   




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2003047727

   > Could you provide more details on the version requirements? Are there any specific requirements for the Spark version on the user service end?
   
   In the previous code, when writing the Parquet file, each Row object was converted to an InternalRow object by calling the toRow method of RowEncoder. Spark 2 and Spark 3 implement this behavior of the RowEncoder class differently, so Spark Load could only run in a Spark 2 environment.
   With the current modification, the code uses the apply method of InternalRow to initialize a new InternalRow object from the value array of the Row object. This method is implemented the same way in Spark 2 and Spark 3, so Spark Load can run normally in both Spark environments.
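
   The idea described above can be sketched in plain Java without depending on either Spark version. This is only an illustration under stated assumptions: `SimpleRow`, `SimpleInternalRow`, and `toInternal` are hypothetical stand-ins, not Spark's `Row` or `InternalRow` classes and not the code from this PR. The sketch shows an internal row being built directly from the value array, with no encoder in between.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration only; these are NOT Spark's
// Row / InternalRow classes and not the code changed in the PR.
final class RowConversionSketch {

    /** Stand-in for an external row: an ordered list of field values. */
    static final class SimpleRow {
        private final Object[] values;

        SimpleRow(Object... values) {
            this.values = values.clone();
        }

        /** Returns a copy of the field values in declaration order. */
        Object[] toArray() {
            return values.clone();
        }
    }

    /** Stand-in for an engine-internal row backed by a plain value array. */
    static final class SimpleInternalRow {
        private final Object[] values;

        private SimpleInternalRow(Object[] values) {
            this.values = values;
        }

        /** Factory that wraps a copy of the given values (assumed analogue of an apply-style factory). */
        static SimpleInternalRow apply(Object... values) {
            return new SimpleInternalRow(values.clone());
        }

        int numFields() {
            return values.length;
        }

        Object get(int ordinal) {
            return values[ordinal];
        }

        @Override
        public String toString() {
            return Arrays.toString(values);
        }
    }

    /** Builds the internal row straight from the value array; no encoder is involved. */
    static SimpleInternalRow toInternal(SimpleRow row) {
        return SimpleInternalRow.apply(row.toArray());
    }

    public static void main(String[] args) {
        SimpleInternalRow internal = toInternal(new SimpleRow(1L, "doris", null));
        System.out.println(internal.numFields() + " fields: " + internal);
    }
}
```

   Because the conversion only copies values by position, it does not rely on any encoder API that differs between versions, which is the property the comment above attributes to the new approach.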




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997033975

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38446 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 23b7c0b0ab58d04d4b294206d2b53cf027d4b6fd, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17644	4201	4125	4125
   q2	2029	150	143	143
   q3	10672	1087	903	903
   q4	7078	755	731	731
   q5	7486	2771	2774	2771
   q6	196	123	125	123
   q7	1159	835	806	806
   q8	9403	2031	1971	1971
   q9	7056	6470	6376	6376
   q10	8551	3477	3676	3477
   q11	423	229	223	223
   q12	676	304	296	296
   q13	17789	2853	2847	2847
   q14	276	250	264	250
   q15	496	463	455	455
   q16	498	400	398	398
   q17	964	519	603	519
   q18	7255	6505	6465	6465
   q19	1529	1477	1494	1477
   q20	540	283	275	275
   q21	6358	3515	3561	3515
   q22	366	300	307	300
   Total cold run time: 108444 ms
   Total hot run time: 38446 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4099	4079	4123	4079
   q2	320	221	219	219
   q3	2982	2824	2849	2824
   q4	1892	1568	1596	1568
   q5	5203	5244	5248	5244
   q6	196	115	125	115
   q7	2239	1858	1862	1858
   q8	3158	3264	3257	3257
   q9	8574	8522	8553	8522
   q10	3696	3703	3693	3693
   q11	538	435	462	435
   q12	708	567	532	532
   q13	16910	2843	2884	2843
   q14	277	244	254	244
   q15	490	446	449	446
   q16	482	413	419	413
   q17	1729	1479	1464	1464
   q18	7485	7298	7038	7038
   q19	1631	1473	1548	1473
   q20	1901	1722	1715	1715
   q21	4887	4704	4769	4704
   q22	541	452	467	452
   Total cold run time: 69938 ms
   Total hot run time: 53138 ms
   ```
   </details>
   




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997586878

   run buildall




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-2002831349

   run buildall




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1996795942

   run buildall




Re: [PR] [refactor](spark load) update spark version for spark load to resolve cve problem [doris]

Posted by "gnehil (via GitHub)" <gi...@apache.org>.
gnehil commented on PR #30368:
URL: https://github.com/apache/doris/pull/30368#issuecomment-1997001115

   run buildall

