Posted to dev@lenya.apache.org by Andreas Hartmann <an...@apache.org> on 2006/07/27 10:48:20 UTC

[1.4] Lucene: How to check index from Java?

Hi Lenya devs,

I'm currently writing a test which checks if the index has been
updated when a document was changed.

How can I do this from Java (do a search, or check the index for
a certain document)?

TIA!

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org


Re: [1.4] Lucene: How to check index from Java?

Posted by so...@apache.org.
On 8/1/06, Andreas Hartmann <an...@wyona.org> wrote:
> > On 7/27/06, Andreas Hartmann <an...@apache.org> wrote:
> >> Andreas Hartmann wrote:
> >> > I'm currently writing a test which checks if the index has been
> >> > updated when a document was changed.
> >> >
> >> > How can I do this from Java (do a search, or check the index for
> >> > a certain document)?
> >>
> >> I guess the easiest way is to issue a search:
> >>
> >>    cocoon://modules/lucene/search.xml?queryString=...
> >
> > How about checking the last modified date of:
> >    {pub}/work/search/lucene/index/live/index/segments
> > ?  Check if it changes with an incremental update.
>
> Thanks for the hint, but we shouldn't base our code on Lucene internals.
> The code will break e.g. when a DB is used to store the index, or if it
> is stored on a different machine. We would at least have to encapsulate
> this and make it configurable, but I think there should be a more direct
> way.

Isn't that why we use APIs and implementation classes?
   ...search.Search
   ...search.lucene.Lucene implements Search
   getLastModifiedDate()

And publication.xconf would configure which Search implementation the
publication uses?

For the Lucene implementation, you could use IndexReader's static method:
   IndexReader.lastModified(Directory|File|String indexDirectory)
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#lastModified(java.lang.String)

Again, that method is probably implementation-specific, so it should
be wrapped.

solprovider
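
A minimal sketch of the wrapping idea above, so that a test depends on an interface rather than on the index files. The names Search and getLastModifiedDate() follow the outline in this message; the signature, the FileSystemSearch class, and the "newest file in the directory" rule are assumptions made for illustration and are not Lenya code. A Lucene-backed implementation would presumably delegate to IndexReader.lastModified(...) instead.

```java
import java.io.File;
import java.io.IOException;
import java.util.Date;

/** Assumed shape of the search abstraction; only the names come from the thread. */
interface Search {
    /** Returns the time at which the index was last changed. */
    Date getLastModifiedDate() throws IOException;
}

/**
 * Hypothetical file-system stand-in for a Lucene-backed implementation.
 * It reports the newest modification time among the files in the index directory.
 */
final class FileSystemSearch implements Search {
    private final File indexDirectory;

    FileSystemSearch(File indexDirectory) {
        this.indexDirectory = indexDirectory;
    }

    @Override
    public Date getLastModifiedDate() throws IOException {
        File[] files = indexDirectory.listFiles();
        if (files == null) {
            // listFiles() returns null if the path is not a readable directory.
            throw new IOException("Not a readable directory: " + indexDirectory);
        }
        long newest = 0L;
        for (File file : files) {
            newest = Math.max(newest, file.lastModified());
        }
        return new Date(newest);
    }
}
```

A test would then only talk to Search, and an index stored in a database or on another machine could supply its own implementation without changing the test.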



Re: [1.4] Lucene: How to check index from Java?

Posted by Andreas Hartmann <an...@wyona.org>.
> On 7/27/06, Andreas Hartmann <an...@apache.org> wrote:
>> Andreas Hartmann wrote:
>> > Hi Lenya devs,
>> >
>> > I'm currently writing a test which checks if the index has been
>> > updated when a document was changed.
>> >
>> > How can I do this from Java (do a search, or check the index for
>> > a certain document)?
>>
>> I guess the easiest way is to issue a search:
>>
>>    cocoon://modules/lucene/search.xml?queryString=...
>>
>> I'll try this and ask again if it doesn't work.
>>
>> -- Andreas
>
> How about checking the last modified date of:
>    {pub}/work/search/lucene/index/live/index/segments
> ?  Check if it changes with an incremental update.

Thanks for the hint, but we shouldn't base our code on Lucene internals.
The code will break e.g. when a DB is used to store the index, or if it
is stored on a different machine. We would at least have to encapsulate
this and make it configurable, but I think there should be a more direct
way.

-- Andreas


--------------------------------------------------------------
Andreas Hartmann     andreas.hartmann@wyona.com +41 1 272 9161
                     Wyona AG, Hardstrasse 219, CH-8005 Zurich
Open Source CMS      http://www.wyona.org http://www.wyona.com
--------------------------------------------------------------



Re: [1.4] Lucene: How to check index from Java?

Posted by Renaud Richardet <re...@wyona.com>.
solprovider@apache.org wrote:
> On 7/27/06, Andreas Hartmann <an...@apache.org> wrote:
>> Andreas Hartmann wrote:
>> > Hi Lenya devs,
>> >
>> > I'm currently writing a test which checks if the index has been
>> > updated when a document was changed.
>> >
>> > How can I do this from Java (do a search, or check the index for
>> > a certain document)?
>>
>> I guess the easiest way is to issue a search:
>>
>>    cocoon://modules/lucene/search.xml?queryString=...
>>
>> I'll try this and ask again if it doesn't work.
>>
>> -- Andreas
>
> How about checking the last modified date of:
>   {pub}/work/search/lucene/index/live/index/segments
> ?  Check if it changes with an incremental update.
>
You could also use something similar to the code in
https://issues.apache.org/jira/browse/NUTCH-330:

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;

  // Uses the Lucene API of that time (IndexSearcher(String), Hits).
  public static void main(String[] args) throws Exception {
    String indexPath = args[0];
    String queryString = args[1];

    // Sanity check: the path must be a directory containing a "segments" file.
    File indexDir = new File(indexPath);
    File segmentFile = new File(indexDir, "segments");
    if (!indexDir.isDirectory() || !segmentFile.isFile()) {
      throw new Exception("The index directory you provided ("
          + indexDir.getAbsolutePath()
          + ") does not exist or is not an index directory. Hint: there "
          + "should be a file called \"segments\" inside of your index directory");
    }

    Searcher searcher = new IndexSearcher(indexPath);
    try {
      // Parse the query against the "content" field and run the search.
      QueryParser parser = new QueryParser("content", new StandardAnalyzer());
      Query q = parser.parse(queryString);
      Hits hits = searcher.search(q);

      System.out.println("Search term \"" + queryString + "\" found in document(s):");
      for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i).get("url") + "; Score: " + hits.score(i));
      }
      System.out.println("\n" + hits.length() + " document(s) found");
    } finally {
      // Release the index even if parsing or searching fails.
      searcher.close();
    }
  }

HTH,
Renaud



Re: [1.4] Lucene: How to check index from Java?

Posted by so...@apache.org.
On 7/27/06, Andreas Hartmann <an...@apache.org> wrote:
> Andreas Hartmann wrote:
> > Hi Lenya devs,
> >
> > I'm currently writing a test which checks if the index has been
> > updated when a document was changed.
> >
> > How can I do this from Java (do a search, or check the index for
> > a certain document)?
>
> I guess the easiest way is to issue a search:
>
>    cocoon://modules/lucene/search.xml?queryString=...
>
> I'll try this and ask again if it doesn't work.
>
> -- Andreas

How about checking the last modified date of:
   {pub}/work/search/lucene/index/live/index/segments
?  Check if it changes with an incremental update.

solprovider
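
A rough sketch of the timestamp check suggested above: record the modification time of a marker file before the document is changed, then compare afterwards. The class name IndexChangeProbe is hypothetical; the only API used is java.io.File.lastModified(), which returns 0 if the file does not exist. Note that file-system timestamps can be coarse (a second or more on some systems), so an update that happens right after the probe is created may go unnoticed.

```java
import java.io.File;

/** Hypothetical helper: detects a change to a marker file such as the "segments" file. */
final class IndexChangeProbe {
    private final File marker;
    private final long timestampBefore;

    /** Remembers the marker file's current modification time. */
    IndexChangeProbe(File marker) {
        this.marker = marker;
        this.timestampBefore = marker.lastModified();
    }

    /** Returns true if the marker file's modification time differs from the recorded one. */
    boolean hasChanged() {
        return marker.lastModified() != timestampBefore;
    }
}
```

In a test, the probe would be created before editing the document and hasChanged() asserted after the incremental index update has run.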



Re: [1.4] Lucene: How to check index from Java?

Posted by Andreas Hartmann <an...@apache.org>.
Andreas Hartmann wrote:
> Hi Lenya devs,
> 
> I'm currently writing a test which checks if the index has been
> updated when a document was changed.
> 
> How can I do this from Java (do a search, or check the index for
> a certain document)?

I guess the easiest way is to issue a search:

   cocoon://modules/lucene/search.xml?queryString=...

I'll try this and ask again if it doesn't work.

-- Andreas


