Posted to java-user@lucene.apache.org by James Chirinos Pinedo <ja...@gmail.com> on 2014/09/16 09:41:21 UTC

Doubt Lucene

Hello,

I have a problem when I search with Lucene, but I don't know whether the
problem is in my indexing or in my search.
When indexing, I receive as a parameter a list of data about an individual.
The list has values like "Name:PeterRooney", "Age:25", "From:NY" if the
individual is a person,
but if the individual is an Event it has "Date:20/09/2014".
As a result, the Docs have different keys depending on the type of individual.

public void indexDocs(IndexWriter writer, List<String> listInd)
    throws IOException {
        try {
          Document doc = new Document();
            String aux;
            int pos;
            for(int i=0;i<listInd.size();i++){
                aux = listInd.get(i);
                pos = aux.indexOf(":");
                doc.add(new StringField(aux.substring(0, pos),
                        aux.substring(pos + 1), Field.Store.YES));
            }
            if (writer.getConfig().getOpenMode() == OpenMode.CREATE) {
                writer.addDocument(doc);
            } else {
            // Existing index (an old copy of this document may have been
            // indexed), so we use updateDocument instead to replace the old
            // one matching the exact path, if present:
//            System.out.println("updating " + file);
//            writer.updateDocument(new StringBuffer("","",doc);
            }

        } finally {
//          fis.close();
        }
      }
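
The key/value split in indexDocs above assumes every entry contains a colon; if one does not, indexOf returns -1 and substring(0, -1) throws. A minimal sketch of a more defensive split, in plain Java with no Lucene dependency, might look like the following. The class name FieldEntries and the choice to return null for unusable entries are assumptions made for illustration, not part of the original code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper (not from the original post): splits "Key:Value" entries.
final class FieldEntries {

    /** Splits at the first colon; returns null when there is no usable key. */
    static Map.Entry<String, String> split(String entry) {
        int pos = entry.indexOf(':');
        if (pos <= 0) {
            return null; // no colon at all, or an empty key: nothing to index
        }
        // Everything after the first colon is the value, so "Time:10:30"
        // keeps "10:30" intact.
        return new SimpleEntry<>(entry.substring(0, pos), entry.substring(pos + 1));
    }

    public static void main(String[] args) {
        System.out.println(split("Name:PeterRooney"));
        System.out.println(split("Date:20/09/2014"));
        System.out.println(split("no separator here"));
    }
}
```

In the indexing loop, a null result could simply be skipped instead of being passed to StringField.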

When I run the search, I use this code:

public class SearchEngine {
//  private IndexSearcher searcher = null;
//  private QueryParser parser = null;

  public SearchEngine(String line) throws IOException, ParseException {
    String index = "index";
    String field = "Name";
    String queries = null;
    int repeat = 1;
    boolean raw = false;
    String queryString = null;
    int hitsPerPage = 10;

    IndexReader reader = DirectoryReader.open(
        FSDirectory.open(new File(index)));
    IndexSearcher searcher = new IndexSearcher(reader);
    // :Post-Release-Update-Version.LUCENE_XY:
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_10_0);
    QueryParser parser = new QueryParser(Version.LUCENE_4_10_0, field,
        analyzer);
    Query query = parser.parse(line);
    if (repeat > 0) {                           // repeat & time as benchmark
        Date start = new Date();
        for (int i = 0; i < repeat; i++) {
          searcher.search(query, null, 100);
        }
    }


    doPagingSearch(searcher, query, hitsPerPage, raw,
        queries == null && queryString == null);
    reader.close();
  }

  public List<String> doPagingSearch(IndexSearcher searcher, Query query,
                                     int hitsPerPage, boolean raw,
                                     boolean interactive) throws IOException {

    // Collect enough docs to show 1 page
    TopDocs results = searcher.search(query, 1 * hitsPerPage);
    ScoreDoc[] hits = results.scoreDocs;
    List<String> result = new ArrayList<String>();
    int numTotalHits = results.totalHits;
    int start = 0;
    int end = Math.min(numTotalHits, hitsPerPage);

    while (true) {
      end = Math.min(hits.length, start + hitsPerPage);

      for (int i = start; i < end; i++) {
        if (raw) {                              // output raw format

          continue;
        }

        Document doc = searcher.doc(hits[i].doc);
        String path = doc.get("Name");
        result.add(path);
        System.out.println(path);
      }

      if (!interactive || end == 0) {
        break;
      }


        end = Math.min(numTotalHits, start + hitsPerPage);
    }
    return result;
  }
}

Re: Doubt Lucene

Posted by atawfik <co...@gmail.com>.

I tried to replicate your search scenario using the code below:

        Indexer ind = new Indexer();
        IndexWriter indW;
        List<String> listData = new LinkedList<>();
        listData.add("Name:PeterRooney");
        indW = ind.CreateIndexDir(listData);
        ind.indexDocs(indW, listData);
        ind.closeWriter(indW);

        SearchEngine sEng = new SearchEngine("PeterRooney");

And the console prints PeterRooney. Therefore, your problem is coming from
the *for* loop logic. There are three possible reasons. First,
listaIndividuos's size is 0, so the loop is never executed. Second,
ontM.getInfo(listaIndividuos.get(i)) returns empty lists. Third, you have
many objects that get indexed, but because you have set *create=true* in the
*CreateIndexDir* method, every iteration deletes the old index and builds a
new one, removing the previously indexed data. Therefore, your query was
getting no results.
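
The third reason above can be illustrated with a toy model in plain Java. This is only a sketch: OpenModeToy, ToyIndex and openAndAdd are hypothetical stand-ins, not Lucene classes, and the list merely imitates the behaviour that opening a writer in CREATE mode discards whatever was indexed before.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for the indexing loop; none of these types are Lucene APIs.
final class OpenModeToy {
    enum Mode { CREATE, CREATE_OR_APPEND }

    /** Stand-in for the on-disk index directory. */
    static final class ToyIndex {
        final List<String> docs = new ArrayList<>();
    }

    /** Imitates opening a writer, adding one document, and closing it. */
    static void openAndAdd(ToyIndex index, Mode mode, String doc) {
        if (mode == Mode.CREATE) {
            index.docs.clear(); // CREATE starts from an empty index
        }
        index.docs.add(doc);
    }

    /** One "writer" per document, mirroring the loop under discussion. */
    static List<String> indexAll(List<String> input, Mode mode) {
        ToyIndex index = new ToyIndex();
        for (String doc : input) {
            openAndAdd(index, mode, doc);
        }
        return index.docs;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Name:PeterRooney", "Date:20/09/2014");
        // With CREATE only the last document survives the loop.
        System.out.println(indexAll(input, Mode.CREATE));
        // With CREATE_OR_APPEND every document is kept.
        System.out.println(indexAll(input, Mode.CREATE_OR_APPEND));
    }
}
```

In the real code, the equivalent remedies would be to open a single IndexWriter before the loop and close it afterwards, or to open it with OpenMode.CREATE_OR_APPEND.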

Would you mind sharing the content of listaIndividuos and
ontM.getInfo(listaIndividuos.get(i))? It would be good to have a sample of
your documents.

Regards
Ameer



--
View this message in context: http://lucene.472066.n3.nabble.com/Doubt-Lucene-tp4159068p4159521.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Doubt Lucene

Posted by James Chirinos Pinedo <ja...@gmail.com>.

Hello,

I have a problem when I search with Lucene, but I don't know whether the
problem is in my indexing or in my search.
When indexing, I receive as a parameter a list of data about an individual
(the list holds data from my ontology individuals).
The list has values like "Name:PeterRooney", "Age:25", "From:NY" if the
individual is a person,
but if the individual is an Event it has "Date:20/09/2014", and there are
more types of individuals.
As a result, the Docs have different keys depending on the type of
individual, and the data is of type String.

This is all my code for indexing the data:


public class Indexer {
    public Indexer() {}

    /** Opens an IndexWriter on the "index" directory. */
    public IndexWriter CreateIndexDir(List<String> list) {
    String indexPath = "index";
    boolean create = true;

    try {
      Directory dir = FSDirectory.open(new File(indexPath));
      Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_10_0);
      IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_10_0,
          analyzer);

      if (create) {
        // Create a new index in the directory, removing any
        // previously indexed documents:
        iwc.setOpenMode(OpenMode.CREATE);
      } else {
        // Add new documents to an existing index:
        iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
      }

      // Optional: for better indexing performance, if you
      // are indexing many documents, increase the RAM
      // buffer.  But if you do this, increase the max heap
      // size to the JVM (eg add -Xmx512m or -Xmx1g):
      //
//       iwc.setRAMBufferSizeMB(256.0);

      IndexWriter writer = new IndexWriter(dir, iwc);
      return writer;
    } catch (IOException e) {
      System.out.println(" caught a " + e.getClass() +
       "\n with message: " + e.getMessage());
      return null;
    }
    }

  public void indexDocs(IndexWriter writer, List<String> listInd)
    throws IOException {
        try {
          Document doc = new Document();
            String aux;
            int pos;
            for(int i=0;i<listInd.size();i++){
                aux = listInd.get(i);
                pos = aux.indexOf(":");
                doc.add(new StringField(aux.substring(0, pos),
                        aux.substring(pos + 1), Field.Store.YES));
            }
            if (writer.getConfig().getOpenMode() == OpenMode.CREATE) {
                writer.addDocument(doc);
            } else {
            // Existing index (an old copy of this document may have been
            // indexed), so we use updateDocument instead to replace the old
            // one matching the exact path, if present:
//            System.out.println("updating " + file);
//            writer.updateDocument(new StringBuffer("","",doc);
            }

        } finally {
//          fis.close();
        }
      }

     public void closeWriter(IndexWriter writer) throws IOException{
      writer.close();
    }

}


When I run the search, I use this code:

public class SearchEngine {
//  private IndexSearcher searcher = null;
//  private QueryParser parser = null;

  public SearchEngine(String line) throws IOException, ParseException {
    String index = "index";
    String field = "Name";
    String queries = null;
    int repeat = 1;
    boolean raw = false;
    String queryString = null;
    int hitsPerPage = 10;

    IndexReader reader = DirectoryReader.open(
        FSDirectory.open(new File(index)));
    IndexSearcher searcher = new IndexSearcher(reader);
    // :Post-Release-Update-Version.LUCENE_XY:
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_10_0);
    QueryParser parser = new QueryParser(Version.LUCENE_4_10_0, field,
        analyzer);
    Query query = parser.parse(line);
    if (repeat > 0) {                           // repeat & time as benchmark
        Date start = new Date();
        for (int i = 0; i < repeat; i++) {
          searcher.search(query, null, 100);
        }
    }


    doPagingSearch(searcher, query, hitsPerPage, raw,
        queries == null && queryString == null);
    reader.close();
  }

  public List<String> doPagingSearch(IndexSearcher searcher, Query query,
                                     int hitsPerPage, boolean raw,
                                     boolean interactive) throws IOException {

    // Collect enough docs to show 1 page
    TopDocs results = searcher.search(query, 1 * hitsPerPage);
    ScoreDoc[] hits = results.scoreDocs;
    List<String> result = new ArrayList<String>();
    int numTotalHits = results.totalHits;
    int start = 0;
    int end = Math.min(numTotalHits, hitsPerPage);

    while (true) {
      end = Math.min(hits.length, start + hitsPerPage);

      for (int i = start; i < end; i++) {
        if (raw) {                              // output raw format

          continue;
        }

        Document doc = searcher.doc(hits[i].doc);
        String path = doc.get("Name");
        result.add(path);
        System.out.println(path);
      }

      if (!interactive || end == 0) {
        break;
      }


        end = Math.min(numTotalHits, start + hitsPerPage);
    }
    return result;
  }
}

I call these objects like this:

Indexer ind = new Indexer();
        IndexWriter indW;
        for(int i=0;i<listaIndividuos.size();i++){
            List<String> listData = ontM.getInfo(listaIndividuos.get(i));
            indW = ind.CreateIndexDir(listData);
            ind.indexDocs(indW, listData);
            ind.closeWriter(indW);
        }

        SearchEngine sEng  = new SearchEngine("PeterRooney");




Re: Doubt Lucene

Posted by atawfik <co...@gmail.com>.

Hi,

Can you elaborate on the confusion or doubt you have? Can you provide
a sample of your document and the query that gives you trouble?

I was not able to deduce what the problem is.

Regards
Ameer



--
View this message in context: http://lucene.472066.n3.nabble.com/Doubt-Lucene-tp4159068p4159100.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
