Posted to dev@spamassassin.apache.org by bu...@spamassassin.apache.org on 2022/08/14 10:39:15 UTC

[Bug 8027] New: extracttext plugin fails when executable installed in a path containing a space

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8027

            Bug ID: 8027
           Summary: extracttext plugin fails when executable installed in
                    a path containing a space
           Product: Spamassassin
           Version: 4.0.0
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Plugins
          Assignee: dev@spamassassin.apache.org
          Reporter: sidney@sidney.com
  Target Milestone: Undefined

The default installation directory for at least one of the available Windows
installers for Tesseract is a subdirectory of C:\Program Files. The entire
command line is a single config entry that is parsed by splitting on spaces.
That breaks if the path contains an embedded space, and there is no provision
for quoting fields in the value. This could be fixed by making the executable
name a separate config entry from the command line arguments, or by making the
parsing code handle quotes.

Until this is fixed a viable workaround is to install tesseract and pdftotext
only in directory paths with no spaces.
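
To illustrate the failure and the quote-handling option described above, here
is a minimal sketch in Python. It is not the plugin's code (SpamAssassin is
written in Perl), and the function names, the Tesseract path, and the
arguments are all hypothetical. It only contrasts splitting on whitespace with
a splitter that keeps double-quoted spans together.

```python
def split_naive(command_line):
    """Split on whitespace only, as the report describes the current parsing."""
    return command_line.split()


def split_quoted(command_line):
    """Split on whitespace, but keep double-quoted spans together.

    Backslashes are treated as ordinary characters so that Windows paths
    pass through unchanged. This is an illustrative assumption, not the
    behaviour of any existing SpamAssassin code.
    """
    fields = []
    current = []
    in_quotes = False
    started = False  # True once the current field has begun (even if empty)
    for ch in command_line:
        if ch == '"':
            in_quotes = not in_quotes
            started = True
        elif ch.isspace() and not in_quotes:
            if started:
                fields.append("".join(current))
                current = []
                started = False
        else:
            current.append(ch)
            started = True
    if in_quotes:
        raise ValueError("unterminated quote in command line")
    if started:
        fields.append("".join(current))
    return fields


if __name__ == "__main__":
    # Hypothetical config value; the path and arguments are made up.
    cmd = r'"C:\Program Files\Tesseract-OCR\tesseract.exe" input.png stdout'
    print(split_naive(cmd))   # the executable path is broken into two fields
    print(split_quoted(cmd))  # the executable path stays in one field
```

The other option mentioned above, a separate config entry for the executable
name, would avoid the need for any quoting rules at all.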

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 8027] extracttext plugin fails when executable installed in a path containing a space

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8027

--- Comment #2 from Sidney Markowitz <si...@sidney.com> ---
Created attachment 5837
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5837&action=edit
Skip tests in extracttext.t if executable is in a path with a space

The underlying cause is that sub helper_app_pipe_open in Utils.pm fails when
the path of the helper app contains a space. This sub is currently used in DCC,
Pyzor and ExtractText plugins.

Requiring that there be no spaces in the paths for dcc, pyzor, and any
application used in the configuration for extracttext is good enough for the
4.0.0 release. However, tests fail in GitHub Actions, where the Windows runner
has a cat.exe in a subdirectory of C:\Program Files that is in PATH. To avoid
those failures, this patch changes only the t/extracttext.t test file, making
it skip tests if the executable that is found has a space in its path.
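
The skip condition can be sketched roughly as follows. This is Python rather
than the Perl used in t/extracttext.t, it is not the attached patch, and the
function names are hypothetical. It only shows the idea of resolving the
helper through PATH and skipping when the resolved path contains whitespace.

```python
import shutil


def has_space_in_path(path):
    """Return True if the given path contains any whitespace character."""
    return any(ch.isspace() for ch in path)


def skip_reason(executable_name):
    """Return a reason to skip tests for this helper, or None to run them.

    Hypothetical helper: looks the executable up in PATH and checks the
    resolved location for whitespace.
    """
    resolved = shutil.which(executable_name)
    if resolved is None:
        return f"{executable_name} not found in PATH"
    if has_space_in_path(resolved):
        return f"{executable_name} resolves to a path with a space: {resolved}"
    return None


if __name__ == "__main__":
    reason = skip_reason("cat")
    if reason is None:
        print("running tests that use cat")
    else:
        print("skipping:", reason)
```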

As this patch is only for the test, it can be committed for the 4.0.0 release
without an RTC vote.

I'll open a new enhancement issue for after 4.0.0 for supporting executables
with a space in the path in sub helper_app_pipe_open.


[Bug 8027] extracttext plugin fails when executable installed in a path containing a space

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8027

Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |4.0.0
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Sidney Markowitz <si...@sidney.com> ---
trunk % svn ci -m "bug 8027 - skip extracttext tests if executable found in
path with space to avoid test failure" t/extracttext.t
Sending        t/extracttext.t
Transmitting file data .done
Committing transaction...
Committed revision 1904466.


[Bug 8027] extracttext plugin fails when executable installed in a path containing a space

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8027

--- Comment #1 from Sidney Markowitz <si...@sidney.com> ---
This bug is showing up in tests run on GitHub Actions Windows runners since we
added "cat" to the test, as apparently the Windows runners have a "cat"
program in PATH in a directory under C:\Program Files.


[Bug 8027] extracttext plugin fails when executable installed in a path containing a space

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8027

Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sidney@sidney.com
           Severity|normal                      |minor
