Posted to dev@lucene.apache.org by "Shawn Heisey (JIRA)" <ji...@apache.org> on 2016/03/16 20:32:33 UTC
[jira] [Commented] (SOLR-2613) DIH Cache backed w/bdb-je
[ https://issues.apache.org/jira/browse/SOLR-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197981#comment-15197981 ]
Shawn Heisey commented on SOLR-2613:
------------------------------------
This came up in a discussion on IRC today, talking about nested entity situations where the inner entities have a very large number of rows, so memory-based caches would require far more memory than the machine can hold.
The Oracle Berkeley DB implementation was specifically mentioned, which is why I'm commenting here instead of opening a new issue. It is licensed under the AGPL, so we can't distribute it, but I wonder if we could implement enough of an API layer that a user could provide the jar themselves, tell Solr which class to load, and be in business. Is this what the patch on this issue does? I haven't looked deeply.
Other ideas, which might need a separate issue for disk-based caching implementations:
I had the idea of using SQLite for caching in a single-file database. SQLite is public domain, and there are ways to access it from Java.
Even just a simple implementation that writes little files to the disk would work. To avoid tons of files in a single directory, the implementation could take a 32-bit hash of the key and write to a four-level directory structure where each level is named with two hex characters of the hash, for example df/8c/12/b5
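To make the directory idea concrete, here is a rough sketch of how a key might be mapped to such a path. Everything in it is an assumption for illustration: the class name HashedCachePath, the choice of CRC32 as the 32-bit hash, and the "dih-cache" root directory are all hypothetical, and none of it comes from the patch on this issue or from DIH itself.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Solr or DIH.
class HashedCachePath {

    /** Formats a 32-bit hash as four two-hex-character segments, e.g. df/8c/12/b5. */
    public static String relativePath(int hash) {
        // %08x prints an int as unsigned, zero-padded, lowercase hex.
        String hex = String.format("%08x", hash);
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/"
             + hex.substring(4, 6) + "/" + hex.substring(6, 8);
    }

    /** Assumption: CRC32 stands in for "a 32-bit hash"; any stable 32-bit hash would do. */
    public static int hashKey(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue(); // keep the low 32 bits
    }

    /** Resolves the hashed location for a key under a cache root directory. */
    public static Path pathFor(Path cacheRoot, String key) {
        return cacheRoot.resolve(relativePath(hashKey(key)));
    }

    public static void main(String[] args) {
        System.out.println(relativePath(0xdf8c12b5)); // prints df/8c/12/b5
        System.out.println(pathFor(Paths.get("dih-cache"), "product-42"));
    }
}
```

Different keys can share a 32-bit hash, so a real implementation would probably treat the last level as a directory, or store the full key inside the file, and compare keys on read.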
A disk-based solution would not be as fast as the memory-based solution already available, but as long as it was on a local physical disk, it would probably be faster than making N+1 queries to a remote database.
> DIH Cache backed w/bdb-je
> -------------------------
>
> Key: SOLR-2613
> URL: https://issues.apache.org/jira/browse/SOLR-2613
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 4.0-ALPHA
> Reporter: James Dyer
> Priority: Minor
> Attachments: SOLR-2613.patch, SOLR-2613.patch, SOLR-2613.patch, SOLR-2613.patch, SOLR-2613.patch, SOLR-2613.patch, SOLR-2613.patch
>
>
> This is spun out of SOLR-2382, which provides a framework for multiple caching implementations with DIH. This cache implementation is fast & flexible, supporting persistence and delta updates. However, it depends on Berkeley Database Java Edition, so in order to evaluate and use it you must download bdb-je from Oracle and accept the license requirements.