Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/01/25 19:41:00 UTC
[jira] [Updated] (ARROW-15454) [Python] Try to make CSV cancellation test more robust
[ https://issues.apache.org/jira/browse/ARROW-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-15454:
-----------------------------------
Labels: pull-request-available (was: )
> [Python] Try to make CSV cancellation test more robust
> ------------------------------------------------------
>
> Key: ARROW-15454
> URL: https://issues.apache.org/jira/browse/ARROW-15454
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Antoine Pitrou
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Fix For: 8.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The test can occasionally fail, see symptoms here:
> https://github.com/ursacomputing/crossbow/runs/4920924293?check_suite_focus=true#step:4:12555
> {code}
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> captured stdout >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> workload size: 100000
> workload size: 300000
> workload size: 900000
> workload size: 2700000
> workload size: 8100000
> workload size: 24300000
> workload size: 72900000
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> self = <pyarrow.tests.test_csv.TestSerialCSVTableRead object at 0x1215b9f70>
>
>     def test_cancellation(self):
>         if (threading.current_thread().ident !=
>                 threading.main_thread().ident):
>             pytest.skip("test only works from main Python thread")
>         # Skips test if not available
>         raise_signal = util.get_raise_signal()
>
>         # Make the interruptible workload large enough to not finish
>         # before the interrupt comes, even in release mode on fast machines.
>         last_duration = 0.0
>         workload_size = 100_000
>
>         while last_duration < 1.0:
>             print("workload size:", workload_size)
>             large_csv = b"a,b,c\n" + b"1,2,3\n" * workload_size
>             t1 = time.time()
>             self.read_bytes(large_csv)
>             last_duration = time.time() - t1
>             workload_size = workload_size * 3
>
>         def signal_from_thread():
>             time.sleep(0.2)
>             raise_signal(signal.SIGINT)
>
>         t1 = time.time()
>         try:
>             try:
>                 t = threading.Thread(target=signal_from_thread)
>                 with pytest.raises(KeyboardInterrupt) as exc_info:
>                     t.start()
> >                   self.read_bytes(large_csv)
> E                   Failed: DID NOT RAISE <class 'KeyboardInterrupt'>
>
> pyarrow/tests/test_csv.py:1400: Failed
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> {code}
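
For readers following the traceback, the calibration loop in the test can be sketched in isolation. This is only an illustration, not the pyarrow test or its eventual fix: `calibrate_workload`, `FakeClock`, and the simulated cost of 2e-8 seconds per row are hypothetical, chosen so the loop stops at the same size as the last "workload size" line in the captured stdout. The sketch shows why the test assumes the final read takes at least 1.0 second, which should leave ample room for a signal sent after 0.2 seconds.

```python
import time


class FakeClock:
    """Hypothetical deterministic clock, so the sketch runs without real delays."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def calibrate_workload(run, min_duration=1.0, start_size=100_000, factor=3,
                       clock=time.monotonic):
    """Grow the workload until a single run(size) call takes >= min_duration.

    Mirrors the while-loop in the traceback: start at 100_000 rows and
    multiply by 3 until the measured duration reaches the threshold.
    Returns (size, duration) of the last run.
    """
    size = start_size
    while True:
        t0 = clock()
        run(size)
        duration = clock() - t0
        if duration >= min_duration:
            return size, duration
        size *= factor


if __name__ == "__main__":
    clock = FakeClock()
    # Assumed cost model: each row takes 2e-8 simulated seconds to parse.
    size, duration = calibrate_workload(
        lambda n: clock.advance(n * 2e-8), clock=clock)
    print("calibrated size:", size, "duration:", round(duration, 3))
```

Under these assumptions the interrupt at 0.2 seconds arrives well before the read would finish, so a "DID NOT RAISE" failure suggests either that the second read ran much faster than the calibrated one or that the signal was not turned into a KeyboardInterrupt in time. The log alone does not say which.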
--
This message was sent by Atlassian Jira
(v8.20.1#820001)