Posted to c-dev@lucene.apache.org by anton feldmann <an...@uni-bielefeld.de> on 2006/03/20 21:13:48 UTC

lucene searching in pdf

I am writing a program to search in PDF documents. I have problems
generating an index file from a large number of PDF documents. I want to
store more than one PDF file in the index file, and I want the program to
return 1. the file (absolute path), 2. the word and lexeme, 3. the score,
and 4. the line. How do I get n PDF documents stored in one index file
with items 1, 2, and 4?

I wrote a program that builds an index of my filesystem, and I can search
the filesystem to find files. But I cannot read PDF files and parse them
with Lucene.
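
A minimal sketch of the index layout asked about above, written in plain Java without Lucene or a PDF library. It assumes the text of each PDF has already been extracted into a list of lines by some other tool. The names LineIndex, Hit, addDocument and search are hypothetical, and the "score" is just the number of times the word occurs on a line, a stand-in and not Lucene's scoring. No stemming is done, so the lexeme is simply the lowercased word. The point is only that one index can hold postings from many files when every posting carries its own path and line number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical in-memory index shared by many files; one posting per (file, line, word). */
final class LineIndex {

    /** One search hit: file, 1-based line number, word, and a simple score. */
    static final class Hit {
        final String path;
        final int line;
        final String word;
        final int score; // occurrences of the word on this line (stand-in score)

        Hit(String path, int line, String word, int score) {
            this.path = path;
            this.line = line;
            this.word = word;
            this.score = score;
        }
    }

    // word -> postings collected from every file added so far
    private final Map<String, List<Hit>> postings = new HashMap<>();

    /** Adds the already extracted text of one file; call once per file. */
    void addDocument(String absolutePath, List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            // Count each word on this line; split on anything that is not a letter or digit.
            Map<String, Integer> counts = new HashMap<>();
            String lowered = lines.get(i).toLowerCase(Locale.ROOT);
            for (String token : lowered.split("[^\\p{L}\\p{N}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
            final int lineNo = i + 1;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                postings.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                        .add(new Hit(absolutePath, lineNo, e.getKey(), e.getValue()));
            }
        }
    }

    /** Returns all hits for a word across all files, highest score first. */
    List<Hit> search(String word) {
        List<Hit> found = postings.getOrDefault(
                word.toLowerCase(Locale.ROOT), Collections.<Hit>emptyList());
        List<Hit> hits = new ArrayList<>(found);
        hits.sort((a, b) -> Integer.compare(b.score, a.score)); // stable sort keeps insertion order on ties
        return hits;
    }
}
```

With Lucene, the analogous layout would presumably be one Document per line (or per file) with stored fields for the path and line number next to the indexed text. That mapping is an assumption and is not shown or tested here.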

I want to have an analyzer for every language Lucene works with.

       IndexWriter write = new IndexWriter(index, new GermanAnalyzer(), true);

So far I use only the GermanAnalyzer.

cheers

anton feldmann