Posted to droids-dev@incubator.apache.org by "Javier Puerto (JIRA)" <ji...@apache.org> on 2009/11/12 16:19:39 UTC

[jira] Created: (DROIDS-68) HandlerFactory fails with multithreaded implementation

HandlerFactory fails with multithreaded implementation
------------------------------------------------------

                 Key: DROIDS-68
                 URL: https://issues.apache.org/jira/browse/DROIDS-68
             Project: Droids
          Issue Type: Bug
          Components: core
         Environment: Ubuntu 9.04 i386
Java Runtime 1.6_16
            Reporter: Javier Puerto
            Priority: Critical


Hi, I'm working with Droids and have written some URL crawlers that save a large number of web pages to disk. In a JUnit test, I run a small HTTP server and crawl 20 pages. Most of the time everything works, but in rare cases I get an error. I traced the problem to the HandlerFactory implementation. In the example, the handlers are called like this:

protected void handle(ContentEntity entity, Link link)
    throws DroidsException, IOException
{
  droid.getHandlerFactory().handle(link.getURI(), entity);
}


If two or more workers try to handle content at the same time, the HandlerFactory serves all of them with the same instance. There are two possible solutions: one saves memory, the other improves performance.

The first solution is to add "synchronized" to HandlerFactory.handle, like this:

public synchronized boolean handle(URI uri, ContentEntity entity)
    throws DroidsException, IOException {
  for (Handler handler : getMap().values()) {
    handler.handle(uri, entity);
  }
  return true;
}

This solution works, but it is a workaround.

The real solution, discussed on the dev list, is to make the Droid and GenericFactory abstractions cloneable and to invoke the clone method in the Worker's constructor.
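
To illustrate the clone-per-worker idea, here is a minimal sketch. It does not use the real Droids classes: CountingHandler, SketchHandlerFactory and SketchWorker are hypothetical stand-ins, and the String URI parameter is a simplification. The only point it tries to show is that each worker takes its own deep copy of the factory in its constructor, so no handler instance is shared between threads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stateful handler; not the Droids Handler interface.
class CountingHandler implements Cloneable {
    private int handled = 0; // mutable state that is unsafe to share across threads

    void handle(String uri) { handled++; }

    int getHandled() { return handled; }

    @Override
    public CountingHandler clone() {
        try {
            return (CountingHandler) super.clone(); // only a primitive field, so a field copy is enough
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class implements Cloneable
        }
    }
}

// Hypothetical factory that can produce an independent deep copy of itself.
class SketchHandlerFactory implements Cloneable {
    private Map<String, CountingHandler> handlers = new LinkedHashMap<>();

    void register(String name, CountingHandler handler) { handlers.put(name, handler); }

    CountingHandler get(String name) { return handlers.get(name); }

    void handle(String uri) {
        for (CountingHandler handler : handlers.values()) {
            handler.handle(uri);
        }
    }

    @Override
    public SketchHandlerFactory clone() {
        try {
            SketchHandlerFactory copy = (SketchHandlerFactory) super.clone();
            // Replace the shared map with a new one holding cloned handlers.
            copy.handlers = new LinkedHashMap<>();
            for (Map.Entry<String, CountingHandler> e : handlers.entrySet()) {
                copy.handlers.put(e.getKey(), e.getValue().clone());
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical worker: clones the factory once, in its constructor.
class SketchWorker implements Runnable {
    private final SketchHandlerFactory factory;
    private final String uri;

    SketchWorker(SketchHandlerFactory shared, String uri) {
        this.factory = shared.clone();
        this.uri = uri;
    }

    @Override
    public void run() { factory.handle(uri); }

    SketchHandlerFactory getFactory() { return factory; }
}
```

The trade-off is the one described above: every worker holds its own handler instances, which costs memory, but no lock is needed around handle.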

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (DROIDS-68) HandlerFactory fails with multithreaded implementation

Posted by "Thorsten Scherler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DROIDS-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thorsten Scherler closed DROIDS-68.
-----------------------------------

    Resolution: Fixed

Committed revision 892828.

Thanks, Javier.




[jira] Updated: (DROIDS-68) HandlerFactory fails with multithreaded implementation

Posted by "Javier Puerto (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DROIDS-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Javier Puerto updated DROIDS-68:
--------------------------------

    Attachment: droidsTest_concurrency.patch
                droids_concurrency.patch

There are two files. The first one changes the default implementation of the crawler and makes it abstract. You must implement the getWorker method to specify the handlers that the worker can use.

The HandlerFactory now lives with the workers.

The other patch fixes the existing test cases and removes the indexer, which was not used.
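
The patches themselves are not reproduced in this message, so the following is only a rough sketch of the shape described above, not the committed code. All names except getWorker are hypothetical (RecordingHandler, OwningWorker, AbstractCrawlerSketch, PageSavingCrawler), and the real getWorker signature in Droids may differ. It shows an abstract crawler whose subclass builds each worker with its own handler instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler with per-instance state; not the Droids Handler interface.
class RecordingHandler {
    private final List<String> seen = new ArrayList<>();

    void handle(String uri) { seen.add(uri); }

    List<String> getSeen() { return seen; }
}

// Hypothetical worker that owns its handlers instead of asking a shared factory.
class OwningWorker {
    private final List<RecordingHandler> handlers;

    OwningWorker(List<RecordingHandler> handlers) { this.handlers = handlers; }

    void execute(String uri) {
        for (RecordingHandler handler : handlers) {
            handler.handle(uri);
        }
    }

    List<RecordingHandler> getHandlers() { return handlers; }
}

// Hypothetical abstract crawler: subclasses decide which handlers a worker gets.
abstract class AbstractCrawlerSketch {
    protected abstract OwningWorker getWorker();
}

// Example subclass: every call to getWorker builds fresh handler instances.
class PageSavingCrawler extends AbstractCrawlerSketch {
    @Override
    protected OwningWorker getWorker() {
        List<RecordingHandler> handlers = new ArrayList<>();
        handlers.add(new RecordingHandler());
        return new OwningWorker(handlers);
    }
}
```

Because each worker is handed its own handlers, two workers running at the same time never touch the same handler state.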

