Posted to dev@lucene.apache.org by "Fuad Efendi (JIRA)" <ji...@apache.org> on 2011/06/01 00:44:47 UTC

[jira] [Commented] (SOLR-2233) DataImportHandler - JdbcDataSource is not thread safe

    [ https://issues.apache.org/jira/browse/SOLR-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041884#comment-13041884 ] 

Fuad Efendi commented on SOLR-2233:
-----------------------------------

Hi Frank, thanks for the patch; unfortunately it is not thread safe... If you don't mind, let me continue working on this. I want to use an internal connection pool (if a JNDI data source is not available)...

My initial patch already contains *too much*; the new one will remove ResultSetIterator and make the code much simpler to understand (and multithreaded). The code shouldn't have any dependency on rare *optionally supported* patterns such as ResultSet.TYPE_FORWARD_ONLY; READ_ONLY should be managed differently (and it is hard to manage if the data size is huge and the data is concurrently updated while we are importing it).
A possible solution could be connection.close() after reading each single record (with the initial query returning the PKs of the records), but that would be the next step... I wrote the initial patch for a production system where complex 10-query-based documents (about 500k docs) took many hours to import (now it takes only about 40 minutes). And what happens if we have a network problem while we are in the middle of the Iterator?
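
To make the internal connection pool idea concrete, here is a minimal sketch; it is not the code from any attached patch. The class name SimpleConnectionPool and its methods are hypothetical, and the type parameter C only stands in for java.sql.Connection so that the example runs without a JDBC driver. The point it illustrates: each thread borrows a connection exclusively and hands it back when done, so no thread can close a connection that another thread is still reading from.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size pool (an illustration, not part of Solr).
 * C stands in for java.sql.Connection.
 */
final class SimpleConnectionPool<C> {
    private final BlockingQueue<C> idle;

    SimpleConnectionPool(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // open all connections up front
        }
    }

    /** Borrows one connection for the duration of work, then returns it to the pool. */
    <R> R withConnection(Function<C, R> work) {
        final C conn;
        try {
            conn = idle.take(); // blocks until a connection is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        try {
            return work.apply(conn); // only the borrowing thread touches conn here
        } finally {
            idle.add(conn); // give it back instead of closing it
        }
    }

    /** Number of connections currently not borrowed. */
    int idleCount() {
        return idle.size();
    }
}
```

With a real driver the factory would open a java.sql.Connection, and the pool would also need validation and reconnect logic for the network-failure case mentioned above; both are left out of this sketch.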

Thanks

> DataImportHandler - JdbcDataSource is not thread safe
> -----------------------------------------------------
>
>                 Key: SOLR-2233
>                 URL: https://issues.apache.org/jira/browse/SOLR-2233
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.5
>            Reporter: Fuad Efendi
>         Attachments: FE-patch.txt, SOLR-2233-JdbcDataSource.patch, SOLR-2233-JdbcDataSource.patch, SOLR-2233.patch
>
>
> Whenever Thread A spends more than 10 seconds on a Connection (by retrieving records in a batch), Thread B will close the connection.
> Related exceptions happen when we use the "threads=" attribute for an entity; usually the exception stack contains the message "connection already closed".
> It shouldn't happen with a JNDI data source, where Connection.close() simply returns the Connection to a pool of available connections, but we might get different errors.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org