Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/01/25 19:41:00 UTC

[jira] [Updated] (ARROW-15454) [Python] Try to make CSV cancellation test more robust

     [ https://issues.apache.org/jira/browse/ARROW-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-15454:
-----------------------------------
    Labels: pull-request-available  (was: )

> [Python] Try to make CSV cancellation test more robust
> ------------------------------------------------------
>
>                 Key: ARROW-15454
>                 URL: https://issues.apache.org/jira/browse/ARROW-15454
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Antoine Pitrou
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 8.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test can occasionally fail, see symptoms here:
> https://github.com/ursacomputing/crossbow/runs/4920924293?check_suite_focus=true#step:4:12555
> {code}
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> captured stdout >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> workload size: 100000
> workload size: 300000
> workload size: 900000
> workload size: 2700000
> workload size: 8100000
> workload size: 24300000
> workload size: 72900000
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> self = <pyarrow.tests.test_csv.TestSerialCSVTableRead object at 0x1215b9f70>
>     def test_cancellation(self):
>         if (threading.current_thread().ident !=
>                 threading.main_thread().ident):
>             pytest.skip("test only works from main Python thread")
>         # Skips test if not available
>         raise_signal = util.get_raise_signal()
>     
>         # Make the interruptible workload large enough to not finish
>         # before the interrupt comes, even in release mode on fast machines.
>         last_duration = 0.0
>         workload_size = 100_000
>     
>         while last_duration < 1.0:
>             print("workload size:", workload_size)
>             large_csv = b"a,b,c\n" + b"1,2,3\n" * workload_size
>             t1 = time.time()
>             self.read_bytes(large_csv)
>             last_duration = time.time() - t1
>             workload_size = workload_size * 3
>     
>         def signal_from_thread():
>             time.sleep(0.2)
>             raise_signal(signal.SIGINT)
>     
>         t1 = time.time()
>         try:
>             try:
>                 t = threading.Thread(target=signal_from_thread)
>                 with pytest.raises(KeyboardInterrupt) as exc_info:
>                     t.start()
> >                   self.read_bytes(large_csv)
> E                   Failed: DID NOT RAISE <class 'KeyboardInterrupt'>
> pyarrow/tests/test_csv.py:1400: Failed
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> {code}
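
For reference, the calibration step in the quoted test (triple the workload until a single read takes at least 1.0 s) can be sketched in isolation. This is only an illustration, not the change proposed for this issue: `calibrate_workload` and its `run` and `clock` parameters are hypothetical names that do not exist in pyarrow, and `run` stands in for `self.read_bytes` on a CSV of the given number of rows. The sketch makes one property of the loop visible: the decision rests on a single timing sample, so nothing guarantees that a later read of the same input takes equally long.

```python
import time
from typing import Callable, Tuple


def calibrate_workload(
    run: Callable[[int], None],
    min_duration: float = 1.0,
    start_size: int = 100_000,
    factor: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[int, float]:
    """Grow the workload until one run takes at least `min_duration` seconds.

    Hypothetical helper mirroring the loop in the quoted test; returns the
    size and measured duration of the last (long enough) run.
    """
    size = start_size
    while True:
        t1 = clock()
        run(size)                      # stand-in for self.read_bytes(large_csv)
        duration = clock() - t1
        if duration >= min_duration:
            # Only this single sample decides the final workload size.
            return size, duration
        size *= factor                 # same growth factor as the test
```

Passing an injectable `clock` keeps the helper deterministic under test; with a real workload the default `time.perf_counter` would be used.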


