Posted to dev@tika.apache.org by "Jerome Lacoste (Created) (JIRA)" <ji...@apache.org> on 2011/12/23 14:36:37 UTC

[jira] [Created] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

ForkParser is unfriendly to code that prints things to its output
-----------------------------------------------------------------

                 Key: TIKA-832
                 URL: https://issues.apache.org/jira/browse/TIKA-832
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.0
            Reporter: Jerome Lacoste
            Priority: Minor
         Attachments: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch, TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch

When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.

I attach 2 patches that solve the issue in different ways. Both use the same unit test.

But I don't know if this is worth the complexity. At least it can start a discussion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jerome Lacoste (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175589#comment-13175589 ] 

Jerome Lacoste commented on TIKA-832:
-------------------------------------

OK for me. I can write it if you want.
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch, TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Updated] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jerome Lacoste (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerome Lacoste updated TIKA-832:
--------------------------------

    Attachment: 0008-TIKA-832-add-a-start-signal-to-make-sure-output-crea.patch

Here you go.
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: 0008-TIKA-832-add-a-start-signal-to-make-sure-output-crea.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Updated] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jerome Lacoste (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerome Lacoste updated TIKA-832:
--------------------------------

    Attachment: TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch
                TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch

Two versions of the patch; I like neither of them :)

At least the unit test exposes the problem.
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch, TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Updated] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jerome Lacoste (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerome Lacoste updated TIKA-832:
--------------------------------

    Attachment:     (was: TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch)
    
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: 0008-TIKA-832-add-a-start-signal-to-make-sure-output-crea.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Resolved] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-832.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 1.2
         Assignee: Jukka Zitting

Sorry for the delay on this. I committed the patch now in revision 1355759. Thanks!
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0008-TIKA-832-add-a-start-signal-to-make-sure-output-crea.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Commented] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jukka Zitting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175591#comment-13175591 ] 

Jukka Zitting commented on TIKA-832:
------------------------------------

> I can write it if you want.

That would be great, thanks!
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch, TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Updated] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jerome Lacoste (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerome Lacoste updated TIKA-832:
--------------------------------

    Attachment:     (was: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch)
    
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: 0008-TIKA-832-add-a-start-signal-to-make-sure-output-crea.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.


        

[jira] [Updated] (TIKA-832) ForkParser is unfriendly to code that prints things to its output

Posted by "Jukka Zitting (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-832:
-------------------------------

    Issue Type: Improvement  (was: Bug)

> java command that causes java to write something to the output

The ForkParser expects to be given a normal java command, i.e. one that simply executes the given code without doing anything extra. Thus I wouldn't call this a bug, but rather an improvement request that extends the capability of ForkParser to previously unsupported use cases (like the mentioned debug statements).

Instead of the proposed alternatives (both of which have downsides), how about using a simple handshake protocol to make sure that the forked process is ready for use? For example, the parent process could start by sending some unique byte sequence down the stream, and then ignore all output from the child process until it responds by echoing that same byte sequence. At that point we can safely assume that the forked process is properly up and running.
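
To make the handshake idea concrete, here is a minimal sketch of the parent side. It is not the attached patch and not actual ForkClient code: the class name HandshakeSketch, the method awaitEcho and the MARKER bytes are all made up for illustration, and the child is simulated with in-memory streams rather than a real forked JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Tika.
class HandshakeSketch {

    // Assumed marker. Its first byte ('T') occurs nowhere else in the
    // sequence, which is what keeps the simple matcher below correct.
    static final byte[] MARKER =
            "TIKA-READY-7f3a".getBytes(StandardCharsets.US_ASCII);

    // Sends the marker to the child, then discards everything the child
    // writes until the same marker has been echoed back in full.
    static void awaitEcho(OutputStream toChild, InputStream fromChild)
            throws IOException {
        toChild.write(MARKER);
        toChild.flush();

        int matched = 0; // number of marker bytes matched so far
        while (matched < MARKER.length) {
            int b = fromChild.read();
            if (b == -1) {
                throw new EOFException(
                        "child closed its output before echoing the marker");
            }
            if (b == (MARKER[matched] & 0xff)) {
                matched++;
            } else {
                // Mismatch: restart, but let this byte begin a new match.
                matched = (b == (MARKER[0] & 0xff)) ? 1 : 0;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated child output: a line of startup noise, the echoed
        // marker, then the first byte of the real protocol stream.
        ByteArrayOutputStream simulated = new ByteArrayOutputStream();
        simulated.write("Listening for transport dt_socket at address: 8000\n"
                .getBytes(StandardCharsets.US_ASCII));
        simulated.write(MARKER);
        simulated.write(42);

        InputStream fromChild =
                new ByteArrayInputStream(simulated.toByteArray());
        awaitEcho(new ByteArrayOutputStream(), fromChild);

        // The noise and the marker are consumed; the protocol byte is next.
        System.out.println("first protocol byte: " + fromChild.read());
    }
}
```

In a real setup the two streams would be the forked process's stdin and stdout, and the child would have to echo the marker as the very first thing it writes once it is ready. A timeout around the read loop would probably be needed as well, since a child that never answers would otherwise block the parent forever.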
                
> ForkParser is unfriendly to code that prints things to its output
> -----------------------------------------------------------------
>
>                 Key: TIKA-832
>                 URL: https://issues.apache.org/jira/browse/TIKA-832
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Jerome Lacoste
>            Priority: Minor
>         Attachments: TIKA-832_ForkClient_wait_a_bit_and_empty_the_initial_buffers.patch, TIKA-832_ForkClient_wait_a_bit_when_asked_to_empty_the_initial_buffers.patch
>
>
> When ForkParser is given a java command that makes the JVM write something to its output (for example, a debugging option), Tika fails.
> I attach 2 patches that solve the issue in different ways. Both use the same unit test.
> But I don't know if this is worth the complexity. At least it can start a discussion.
