You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Jason Rutherglen (JIRA)" <ji...@apache.org> on 2008/10/01 20:09:44 UTC

[jira] Updated: (LUCENE-1313) Ocean Realtime Search

     [ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated LUCENE-1313:
-------------------------------------

    Attachment: LUCENE-1313.patch

LUCENE-1313.patch

Added javadocs.  Still needs the LUCENE-1314 completed which will be divided into multiple patches.

> Ocean Realtime Search
> ---------------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>            Reporter: Jason Rutherglen
>         Attachments: LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
>
>
> Provides realtime search using Lucene.  Conceptually, updates are divided into discrete transactions.  The transaction is recorded to a transaction log which is similar to the mysql bin log.  Deletes from the transaction are made to the existing indexes.  Document additions are made to an in memory InstantiatedIndex.  The transaction is then complete.  After each transaction TransactionSystem.getSearcher() may be called which allows searching over the index including the latest transaction.
> TransactionSystem is the main class.  Methods similar to IndexWriter are provided for updating.  getSearcher returns a Searcher class. 
> - getSearcher()
> - addDocument(Document document)
> - addDocument(Document document, Analyzer analyzer)
> - updateDocument(Term term, Document document)
> - updateDocument(Term term, Document document, Analyzer analyzer)
> - deleteDocument(Term term)
> - deleteDocument(Query query)
> - commitTransaction(List<Document> documents, Analyzer analyzer, List<Term> deleteByTerms, List<Query> deleteByQueries)
> Sample code:
> {code}
> // setup
> FSDirectoryMap directoryMap = new FSDirectoryMap(new File("/testocean"), "log");
> LogDirectory logDirectory = directoryMap.getLogDirectory();
> TransactionLog transactionLog = new TransactionLog(logDirectory);
> TransactionSystem system = new TransactionSystem(transactionLog, new SimpleAnalyzer(), directoryMap);
> // transaction
> Document d = new Document();
> d.add(new Field("contents", "hello world", Field.Store.YES, Field.Index.TOKENIZED));
> system.addDocument(d);
> // search
> OceanSearcher searcher = system.getSearcher();
> ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
> System.out.println(hits.length + " total results");
> for (int i = 0; i < hits.length && i < 10; i++) {
>   Document d = searcher.doc(hits[i].doc);
>   System.out.println(i + " " + hits[i].score+ " " + d.get("contents");
> }
> {code}
> There is a test class org.apache.lucene.ocean.TestSearch that was used for basic testing.  
> A sample disk directory structure is as follows:
> |/snapshot_105_00.xml | XML file containing which indexes and their generation numbers correspond to a snapshot.  Each transaction creates a new snapshot file.  In this file the 105 is the snapshotid, also known as the transactionid.  The 00 is the minor version of the snapshot corresponding to a merge.  A merge is a minor snapshot version because the data does not change, only the underlying structure of the index|
> |/3 | Directory containing an on disk Lucene index|
> |/log | Directory containing log files|
> |/log/log00000001.bin | Log file.  As new log files are created the suffix number is incremented|

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org