Posted to droids-dev@incubator.apache.org by "Tobias Rübner (JIRA)" <ji...@apache.org> on 2011/07/14 13:46:59 UTC

[jira] [Created] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

MultiThreadedTaskMaster stops on HTTP error code 404
----------------------------------------------------

                 Key: DROIDS-146
                 URL: https://issues.apache.org/jira/browse/DROIDS-146
             Project: Droids
          Issue Type: Bug
          Components: core
    Affects Versions: 0.0.2
            Reporter: Tobias Rübner
             Fix For: 0.0.2


When crawling a site, any HTTP error with a status code >= 400 causes the MultiThreadedTaskMaster to stop the whole process.
{code} 
1:12:27.312 [pool-1-thread-1] ERROR org.apache.droids.AbstractDroid -
org.apache.http.client.HttpResponseException: Not Found
        at org.apache.droids.protocol.http.HttpProtocol.load(HttpProtocol.java:71) ~[droids-core-0.2-incubating-SNAPSHOT.jar:0.2-incubating-SNAPSHOT 1146608 - truebner]
        at org.apache.droids.robot.crawler.CrawlingWorker.execute(CrawlingWorker.java:72) ~[droids-core-0.2-incubating-SNAPSHOT.jar:0.2-incubating-SNAPSHOT 1146608 - truebner]
        at org.apache.droids.robot.crawler.CrawlingWorker.execute(CrawlingWorker.java:39) ~[droids-core-0.2-incubating-SNAPSHOT.jar:0.2-incubating-SNAPSHOT 1146608 - truebner]
        at org.apache.droids.impl.MultiThreadedTaskMaster$TaskExecutor.run(MultiThreadedTaskMaster.java:335) ~[droids-core-0.2-incubating-SNAPSHOT.jar:0.2-incubating-SNAPSHOT 1146608 - truebner]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_24]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_24]
        at java.lang.Thread.run(Thread.java:662) [na:1.6.0_24]
{code} 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

Posted by "Tobias Rübner (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DROIDS-146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tobias Rübner updated DROIDS-146:
---------------------------------

    Attachment: DROIDS-146.patch


[jira] [Commented] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

Posted by "Bertil Chapuis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DROIDS-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065372#comment-13065372 ] 

Bertil Chapuis commented on DROIDS-146:
---------------------------------------

Thanks for the patch. I may have missed something, but what about handling this case with a custom ExceptionHandler that does not return FATAL when a 404 error is encountered?


[jira] [Resolved] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

Posted by "Tobias Rübner (Resolved JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DROIDS-146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tobias Rübner resolved DROIDS-146.
----------------------------------

    Resolution: Fixed

Duplicate of DROIDS-152.
                

[jira] [Commented] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

Posted by "Bertil Chapuis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DROIDS-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066417#comment-13066417 ] 

Bertil Chapuis commented on DROIDS-146:
---------------------------------------

Personally, I often implement a custom TaskExceptionHandler. One advantage of handling the different cases in the TaskExceptionHandler is that exceptions may also occur during parsing, not only while loading. In my opinion, covering all of these cases in the TaskMaster itself would not be manageable. However, providing more TaskExceptionHandler implementations that cover common use cases could be a really interesting way to solve your issue.
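
A rough sketch of what such a handler might look like follows. It is not code from Droids or from the attached patch: the TaskExceptionResult enum and TaskExceptionHandler interface are local stand-ins that assume the handler contract is a single handleException method returning IGNORE, WARN or FATAL, and StatusException is a hypothetical placeholder for an HTTP exception that carries a status code.

```java
import java.io.IOException;

public class HandlerSketch {

    // Local stand-in for the handler's result type (assumption, not the Droids class).
    public enum TaskExceptionResult { IGNORE, WARN, FATAL }

    // Local stand-in for the handler contract (assumption, not the Droids interface).
    public interface TaskExceptionHandler {
        TaskExceptionResult handleException(Exception ex);
    }

    // Hypothetical exception that carries the HTTP status code of a failed request.
    public static class StatusException extends IOException {
        private final int statusCode;

        public StatusException(int statusCode, String message) {
            super(message);
            this.statusCode = statusCode;
        }

        public int getStatusCode() {
            return statusCode;
        }
    }

    // Classifies exceptions so that a single bad page does not end the crawl.
    public static class LenientHandler implements TaskExceptionHandler {
        @Override
        public TaskExceptionResult handleException(Exception ex) {
            if (ex instanceof StatusException) {
                int code = ((StatusException) ex).getStatusCode();
                // 4xx: the page is missing or forbidden, so skip it quietly.
                // Any other status is reported as a warning.
                return (code >= 400 && code < 500)
                        ? TaskExceptionResult.IGNORE
                        : TaskExceptionResult.WARN;
            }
            if (ex instanceof IOException) {
                // Other I/O problems, for example while reading or parsing a page.
                return TaskExceptionResult.WARN;
            }
            // Anything unexpected is treated as fatal.
            return TaskExceptionResult.FATAL;
        }
    }

    public static void main(String[] args) {
        TaskExceptionHandler handler = new LenientHandler();
        System.out.println(handler.handleException(new StatusException(404, "Not Found")));           // IGNORE
        System.out.println(handler.handleException(new StatusException(503, "Service Unavailable"))); // WARN
        System.out.println(handler.handleException(new IllegalStateException("bug")));                // FATAL
    }
}
```

Keeping the classification in one handler means loading and parsing failures go through the same decision point, which is the advantage described above.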


[jira] [Commented] (DROIDS-146) MultiThreadedTaskMaster stops on HTTP error code 404

Posted by "Tobias Rübner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DROIDS-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065959#comment-13065959 ] 

Tobias Rübner commented on DROIDS-146:
--------------------------------------

Currently, the exception handler org.apache.droids.impl.DefaultTaskExceptionHandler only returns a warning.
But in the current code of the MultiThreadedTaskMaster, every exception stops the process, regardless of what the handler returns.
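
To illustrate the behaviour being asked for, here is a minimal single-threaded sketch in which the loop consults the handler and stops only on FATAL. It is not the MultiThreadedTaskMaster code and not the content of the patch: TaskLoopSketch, runAll and the enum and interface below are hypothetical stand-ins defined only for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class TaskLoopSketch {

    // Local stand-in for the handler's result type (assumption).
    public enum TaskExceptionResult { IGNORE, WARN, FATAL }

    // Local stand-in for the handler contract (assumption).
    public interface TaskExceptionHandler {
        TaskExceptionResult handleException(Exception ex);
    }

    // Runs the tasks in order and returns the results of those that succeeded.
    // A failing task ends the run only if the handler classifies it as FATAL.
    public static List<String> runAll(List<Callable<String>> tasks,
                                      TaskExceptionHandler handler) {
        List<String> completed = new ArrayList<>();
        for (Callable<String> task : tasks) {
            try {
                completed.add(task.call());
            } catch (Exception ex) {
                TaskExceptionResult result = handler.handleException(ex);
                if (result == TaskExceptionResult.FATAL) {
                    break; // give up on the remaining tasks
                }
                if (result == TaskExceptionResult.WARN) {
                    System.err.println("WARN: " + ex.getMessage());
                }
                // IGNORE: skip the failed task silently and continue.
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "page-1");
        tasks.add(() -> { throw new IOException("Not Found"); });
        tasks.add(() -> "page-3");

        // A handler that only warns, so the 404 does not end the crawl.
        TaskExceptionHandler warnOnly = ex -> TaskExceptionResult.WARN;
        System.out.println(runAll(tasks, warnOnly)); // prints [page-1, page-3]
    }
}
```

With a handler that returns FATAL for every exception, the same run would stop after page-1, which matches the behaviour reported in this issue.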

