Posted to java-user@lucene.apache.org by Claes Holmerson <cl...@polopoly.com> on 2005/02/03 18:02:07 UTC

Lock failure recovery

Hello

A commit.lock can be left behind by a process that dies in the middle of
reading the index, for example because of an OutOfMemoryError. How can I
handle such a stale lock gracefully the next time the process runs?
Checking whether there is a lock is straightforward, but how can I be sure
that it is not a live lock held by another thread? The only
methods I can find for dealing with the lock are IndexReader.isLocked() and
IndexReader.unlock(). I would like to know the lock's age: if it is older
than a certain threshold, I can remove it. How do other people deal with
leftover locks?

Claes
-- 

Claes Holmerson
Polopoly - Cultivating the information garden
Kungsgatan 88, SE-112 27 Stockholm, SWEDEN
Direct: +46 8 506 782 59
Mobile: +46 704 47 82 59
Fax:  +46 8 506 782 51
claes.holmerson@polopoly.com, http://www.polopoly.com


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: Lock failure recovery

Posted by Luke Shannon <ls...@futurebrand.com>.

The indexing process is totally synchronized in our system. Thus if an
indexing thread starts up and the index exists but is locked, I know this
to be the only indexing process running, so the lock must be from a
process that got stopped before it could finish.

So right before I begin writing to the index I have this check:

// If we have gotten this far, this is the only indexing process running,
// so the index should not be locked. If it is, the lock is "stale"
// and must be released before we can continue.
        try {
            if (index.exists() && IndexReader.isLocked(indexFileLocation)) {
                Trace.ERROR("INDEX INFO: Had to clear a stale index lock");
                IndexReader.unlock(FSDirectory.getDirectory(index, false));
            }
        } catch (IOException e3) {
            Trace.ERROR("INDEX ERROR: Was unable to clear a stale index lock: " + e3);
        }
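
On the lock-age idea from the original question: as noted there, IndexReader only offers isLocked() and unlock(). If the lock is an ordinary file that you can locate on disk, its last-modified time gives a rough age. What follows is a minimal sketch under that assumption. StaleLockCheck and its method names are hypothetical, not part of Lucene, and finding the lock file's path is left out because it depends on how your Lucene version names and places lock files.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Lucene.
final class StaleLockCheck {

    private StaleLockCheck() {}

    /** Pure age comparison: true if the timestamp is more than maxAgeMillis old. */
    static boolean isStale(long lastModifiedMillis, long nowMillis, long maxAgeMillis) {
        return nowMillis - lastModifiedMillis > maxAgeMillis;
    }

    /** True if lockFile exists and its last-modified time is older than maxAgeMillis. */
    static boolean isStale(File lockFile, long maxAgeMillis) {
        if (!lockFile.exists()) {
            return false; // no lock file, so there is nothing to clear
        }
        return isStale(lockFile.lastModified(), System.currentTimeMillis(), maxAgeMillis);
    }
}
```

A caller could guard the unlock() call above with isStale(lockFile, threshold). An age threshold is only a heuristic: a healthy but long-running process may hold a lock for longer than the threshold.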

Luke


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Thanks to everyone who has been posting possible solutions. I am making
great progress and learning a lot.

This works, but the results include files that don't even contain a
"kcfileupload" field (not good):

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 = QueryParser.parse("stillhere", "olfaithfull", new StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, false, true);  // prohibited clause
typeNegativeSearch.add(query2, true, false);  // required clause
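
Why the extra documents appear can be seen with a toy model of the two clauses. This is only a sketch with java.util.BitSet, not Lucene code: the class ClauseModel and its method names are made up, and doc ids 0 to 5 are assumed to mirror the six documents in the test case further down (two jpg, one pdf, two ppt, and one link with no kcfileupload field). A prohibited clause only subtracts from what the required clause matched, so a document with no kcfileupload field at all survives; intersecting with the set of documents that have the field removes it.

```java
import java.util.BitSet;

// Toy model of required/prohibited clauses; hypothetical, not Lucene API.
final class ClauseModel {

    private ClauseModel() {}

    /** Documents matching the required clause, minus those matching the prohibited one. */
    static BitSet requiredMinusProhibited(BitSet required, BitSet prohibited) {
        BitSet result = (BitSet) required.clone(); // leave the inputs untouched
        result.andNot(prohibited);
        return result;
    }

    /** Keeps only the hits that also appear in hasField. */
    static BitSet restrictTo(BitSet hits, BitSet hasField) {
        BitSet result = (BitSet) hits.clone();
        result.and(hasField);
        return result;
    }

    /** Builds a BitSet from explicit doc ids. */
    static BitSet docs(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }
}
```

With stillHere = docs(0, 1, 2, 3, 4, 5), jpg = docs(0, 1) and hasUpload = docs(0, 1, 2, 3, 4), requiredMinusProhibited(stillHere, jpg) gives {2, 3, 4, 5}, and restricting that to hasUpload gives {2, 3, 4}.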

Someone mentioned a filter, so I have been playing with the test below.

The problem I have is this line:

Query query2 = QueryParser.parse("*", "kcfileupload", new StandardAnalyzer());

Results in the following error:

org.apache.lucene.queryParser.ParseException: Lexical error at line 1,
column 2.  Encountered: <EOF> after : ""

I was hoping it would create a wildcard search on kcfileupload. I feel like
I am getting close to a good solution. Any tips would help.

Thanks,

Luke

import junit.framework.TestCase;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;

public class IsNotTypeTest extends TestCase {

    private RAMDirectory directory;

    protected void setUp() throws Exception {
        directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory,
            new StandardAnalyzer(), true);

        //jpg should show up in first query
        Document document = new Document();
        document.add(Field.Text("kcfileupload", "picture.jpg"));
        document.add(Field.Text("name", "pic one"));
        writer.addDocument(document);

        //jpg should show up in first query
        document = new Document();
        document.add(Field.Text("kcfileupload", "picture2.jpg"));
        document.add(Field.Text("name", "pic two"));
        writer.addDocument(document);

        //pdf should show up in second query
        document = new Document();
        document.add(Field.Text("kcfileupload", "file.pdf"));
        document.add(Field.Text("name", "pdf one"));
        writer.addDocument(document);

        //ppt should show up in second query
        document = new Document();
        document.add(Field.Text("kcfileupload", "file.ppt"));
        document.add(Field.Text("name", "power point one"));
        writer.addDocument(document);

        //ppt should show up in second query
        document = new Document();
        document.add(Field.Text("kcfileupload", "file2.ppt"));
        document.add(Field.Text("name", "power point two"));
        writer.addDocument(document);

        //other should not show in this test
        document = new Document();
        document.add(Field.Text("name", "link"));
        document.add(Field.Text("address", "www.cbc.ca"));
        writer.addDocument(document);

        writer.close();

    }

    public void testIsNotType() throws Exception {
        IndexSearcher searcher = new IndexSearcher(directory);
        Query query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
        Query query2 = QueryParser.parse("*", "kcfileupload", new StandardAnalyzer());
        QueryFilter jpgFilter = new QueryFilter(new TermQuery(new Term("kcfileupload", "jpg")));
        Hits hits = searcher.search(query1);
        assertEquals(2, hits.length());
        int totalHits = hits.length();
        int count = 0;
        while (count < totalHits) {
            Document current = (Document)hits.doc(count);
            System.out.println("The upload is " + count + " is " + current.getField("kcfileupload"));
            count++;
        }
        hits = searcher.search(query2, jpgFilter);
        assertEquals(3, hits.length());
        totalHits = hits.length();
        count = 0;
        while (count < totalHits) {
            Document current = (Document)hits.doc(count);
            System.out.println("The upload is " + count + " is " + current.getField("kcfileupload"));
            count++;
        }
    }

}


----- Original Message ----- 
From: "张瑾" <pr...@gmail.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Friday, February 04, 2005 2:12 AM
Subject: Re: Parsing The Query: Every document that doesn't have a field
containing x



Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Very Nice. Thanks!

Luke


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by 张瑾 <pr...@gmail.com>.

I think you can use a filter to get the right result.
See the example below:
package lia.advsearching;

import junit.framework.TestCase;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;

public class SecurityFilterTest extends TestCase {
  private RAMDirectory directory;

  protected void setUp() throws Exception {
    directory = new RAMDirectory();
    IndexWriter writer = new IndexWriter(directory,
        new WhitespaceAnalyzer(), true);

    // Elwood
    Document document = new Document();
    document.add(Field.Keyword("owner", "elwood"));
    document.add(Field.Text("keywords", "elwoods sensitive info"));
    writer.addDocument(document);

    // Jake
    document = new Document();
    document.add(Field.Keyword("owner", "jake"));
    document.add(Field.Text("keywords", "jakes sensitive info"));
    writer.addDocument(document);

    writer.close();
  }

  public void testSecurityFilter() throws Exception {
    TermQuery query = new TermQuery(new Term("keywords", "info"));

    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.search(query);
    assertEquals("Both documents match", 2, hits.length());

    QueryFilter jakeFilter = new QueryFilter(
        new TermQuery(new Term("owner", "jake")));

    hits = searcher.search(query, jakeFilter);
    assertEquals(1, hits.length());
    assertEquals("elwood is safe",
        "jakes sensitive info", hits.doc(0).get("keywords"));
  }

}


On Thu, 3 Feb 2005 13:04:50 -0500, Luke Shannon
<ls...@futurebrand.com> wrote:
> Hello;
> 
> I have a query that finds documents that contain fields with a specific
> value.
> 
> query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
> 
> This works well.
> 
> I would like a query that finds documents that have a kcfileupload field
> whose value does not contain jpg.
> 
> The example I found in the book that seems to relate shows me how to find
> documents without a specific term:
> 
> QueryParser parser = new QueryParser("contents", analyzer);
> parser.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
> 
> But then it says:
> 
> Negating a term must be combined with at least one nonnegated term to return
> documents; in other words, it isn't possible to use a query like NOT term to
> find all documents that don't contain a term.
> 
> So does that mean the above example wouldn't work?
> 
> The API says:
> 
>  a plus (+) or a minus (-) sign, indicating that the clause is required or
> prohibited respectively;
> 
> I have been playing around with using the minus character without much luck.
> 
> Can someone point me in the right direction to figure this out?
> 
> Thanks,
> 
> Luke
> 


-- 
May you be happy every day


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

This works:

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 = QueryParser.parse("stillHere", "olFaithFull", new StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, false, false);
typeNegativeSearch.add(query2, false, false);

It returns 9 results, and in string form is: kcfileupload:jpg olFaithFull:stillhere

But this:

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 = QueryParser.parse("stillHere", "olFaithFull", new StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, true, false);
typeNegativeSearch.add(query2, true, false);

returns 0 results, and in string form is: +kcfileupload:jpg +olFaithFull:stillhere

If I run the query kcfileupload:jpg in Luke I get 9 docs, each doc containing
an olFaithFull:stillHere. Why would +kcfileupload:jpg +olFaithFull:stillhere
return no results?
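
One thing worth checking, though the thread does not confirm it as the cause: the field is spelled olFaithfull where it is added to the documents and in the Luke query that returns 119, but olFaithFull in the code above. Lucene looks terms up by exact field name and exact term text, so a difference in case in either part means no match. That would also be consistent with the first query returning exactly the 9 jpg documents. The sketch below is a toy term dictionary in plain Java (the class TermLookup is hypothetical, not Lucene code) showing that kind of exact-match lookup:

```java
import java.util.HashMap;
import java.util.Map;

// Toy exact-match term dictionary; hypothetical, not Lucene API.
final class TermLookup {

    private final Map<String, Integer> docFreq = new HashMap<>();

    /** Records that `count` documents contain `text` in `field`. */
    void index(String field, String text, int count) {
        docFreq.put(field + ":" + text, count);
    }

    /** Returns the number of matching documents, or 0 if the exact term is unknown. */
    int lookup(String field, String text) {
        return docFreq.getOrDefault(field + ":" + text, 0);
    }
}
```

After index("olFaithfull", "stillhere", 119), looking up olFaithfull:stillhere finds the term, while olFaithFull:stillhere and olFaithfull:stillHere both come back empty.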

Thanks,

Luke

----- Original Message ----- 
From: "Maik Schreiber" <bl...@blizzy.de>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 03, 2005 4:55 PM
Subject: Re: Parsing The Query: Every document that doesn't have a field
containing x



Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

I did. I have run both queries in Luke.

kcfileupload:ppt

returns 1

olFaithfull:stillhere

returns 119

Luke


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Maik Schreiber <bl...@blizzy.de>.

> Yes. There should be 119 with stillHere,

You have double-checked that, haven't you? :)

> and if I run a query in Luke on
> kcfileupload = ppt, it returns one result. I am thinking I should at least
> get this result back with: -kcfileupload:jpg +olFaithFull:stillhere?

You really should.

-- 
Maik Schreiber   *   http://www.blizzy.de <-- Get GMail invites here!

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Yes. There should be 119 with stillHere, and if I run a query in Luke on
kcfileupload = ppt, it returns one result. I am thinking I should at least
get this result back with: -kcfileupload:jpg +olFaithFull:stillhere?

Luke

----- Original Message ----- 
From: "Maik Schreiber" <bl...@blizzy.de>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 03, 2005 4:27 PM
Subject: Re: Parsing The Query: Every document that doesn't have a field
containing x



Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Maik Schreiber <bl...@blizzy.de>.

> -kcfileupload:jpg +olFaithFull:stillhere
> 
> This looks right to me. Why the 0 results?

Looks good to me, too. You sure all your documents have 
olFaithFull:stillhere and there is at least a document with kcfileupload not 
being "jpg"?

-- 
Maik Schreiber   *   http://www.blizzy.de <-- Get GMail invites here!

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Hello,

Still working on the same query, here is the code I am currently working
with.

I am thinking this should bring up all the documents that have
olFaithFull=stillHere and kcfileupload!=jpg (so anything else)

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 = QueryParser.parse("stillHere", "olFaithFull", new StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, false, true);
typeNegativeSearch.add(query2, true, false);

The toString() of the query is:

-kcfileupload:jpg +olFaithFull:stillhere

This looks right to me. Why the 0 results?

Thanks,

Luke

----- Original Message ----- 
From: "Maik Schreiber" <bl...@blizzy.de>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, February 03, 2005 1:19 PM
Subject: Re: Parsing The Query: Every document that doesn't have a field
containing x


> > Negating a term must be combined with at least one nonnegated term to return
> > documents; in other words, it isn't possible to use a query like NOT term to
> > find all documents that don't contain a term.
> >
> > So does that mean the above example wouldn't work?
>
> Exactly. You cannot search for "-kcfileupload:jpg", you need at least one
> clause that actually _includes_ documents.
>
> Do you by chance have a field with known contents? If so, you could misuse
> that one and include it in your query (perhaps by doing range or
> wildcard/prefix search). If not, try IndexReader.terms() for building a
> Query yourself, then use that one for search.
>

Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Ok.

I have added the following to every document:

doc.add(Field.UnIndexed("olFaithfull", "stillHere"));

The plan is a query that says: olFaithfull = stillHere and kcfileupload != jpg.

I have been experimenting with the MultiFieldQueryParser, but this is not
working out for me. Syntactically, how is this done? Does someone have an
example of a query similar to the one I am trying?

Thanks,

Luke


Re: Parsing The Query: Every document that doesn't have a field containing x (but still has the field)

Posted by Luke Shannon <ls...@futurebrand.com>.

Hello;

I think Chris's approach might be helpful, but I can't seem to get it to
work.

So since I am running out of time and I still need to figure out "starts with"
and "ends with" queries, I have implemented a hacky solution for getting all
documents with a kcfileupload field present that does not contain jpg:

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
// each document contains this
query2 = QueryParser.parse("stillhere", "olfaithfull", new StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, false, true);
typeNegativeSearch.add(query2, true, false);

What gets returned are all the documents whose kcfileupload field does not
contain jpg. This includes documents that don't even have a kcfileupload
field.

So when I go through the results before displaying them, I check to make
sure there is a "kcfileupload" field.

This is not a good solution, and I hope to replace it soon. If anyone has
ideas please let me know.
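
One possible replacement for the post-filtering step, sketched with plain
java.util.BitSet and no Lucene calls: if one bit set marks the documents that
have the field at all and another marks the documents where the field contains
the term, the wanted set is the first minus the second. The class and method
names below are made up for this sketch, and how the two inputs are obtained
from the index is left open.

```java
import java.util.BitSet;

final class FieldWithoutTerm {
    /**
     * Returns the doc ids that have the field but whose field does not
     * contain the term. Neither input is modified.
     */
    static BitSet hasFieldButNotTerm(BitSet hasField, BitSet containsTerm) {
        BitSet result = (BitSet) hasField.clone(); // work on a copy
        result.andNot(containsTerm);               // clear every doc that contains the term
        return result;
    }

    public static void main(String[] args) {
        BitSet hasField = new BitSet();
        hasField.set(0);
        hasField.set(1);
        hasField.set(3);
        BitSet containsJpg = new BitSet();
        containsJpg.set(1);
        System.out.println(hasFieldButNotTerm(hasField, containsJpg));
    }
}
```

Documents without the field never appear in the result because they are not
in the first bit set to begin with, so no check is needed afterwards.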

Luke


Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Hi Chris;

So the result would contain all documents that don't have field f containing
x?

What I need to figure out is how to return all documents that have a
field f whose value does not contain x.

Thanks for your post.

Luke



Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Chris Hostetter <ho...@fucit.org>.

Another approach...

You can make a Filter that is the inverse of the output from another
filter, which means you can make a QueryFilter on the search, then wrap it
in your inverse Filter.

You can't execute a search with a Filter without also having a Query object,
but you can just apply the Filter directly to an IndexReader yourself and get
back a BitSet containing the doc ids of every document that does not contain
your term.

Something like this should work...

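   import java.io.IOException;
   import java.util.BitSet;
   import org.apache.lucene.index.IndexReader;
   import org.apache.lucene.index.Term;
   import org.apache.lucene.search.Filter;
   import org.apache.lucene.search.QueryFilter;
   import org.apache.lucene.search.TermQuery;

   class NotFilter extends Filter {
      private Filter wrapped;
      public NotFilter(Filter w) {
        wrapped = w;
      }
      public BitSet bits(IndexReader r) throws IOException {
        // copy first: QueryFilter caches the BitSet it returns
        BitSet b = (BitSet) wrapped.bits(r).clone();
        // flip only real doc ids; BitSet.size() can exceed maxDoc()
        // (note that deleted docs end up set too)
        b.flip(0, r.maxDoc());
        return b;
      }
   }
   ...
   BitSet results = (new NotFilter
                     (new QueryFilter
                      (new TermQuery(new Term("f","x"))))).bits(reader);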
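
A small stand-alone check of the bit-flipping step, using only
java.util.BitSet (no Lucene). BitSet.size() reports allocated capacity rather
than the number of documents, so this sketch flips up to an explicit document
count instead, and it works on a copy so the caller's bit set is left alone.
The names BitSetInversion, invert and maxDoc are just for this illustration.

```java
import java.util.BitSet;

final class BitSetInversion {
    /** Returns a new BitSet with bits 0..maxDoc-1 inverted; the input is untouched. */
    static BitSet invert(BitSet matches, int maxDoc) {
        BitSet inverted = (BitSet) matches.clone();
        inverted.flip(0, maxDoc); // toIndex is exclusive, so bit maxDoc itself stays clear
        return inverted;
    }

    public static void main(String[] args) {
        int maxDoc = 10;
        BitSet matches = new BitSet(maxDoc);
        matches.set(2);
        matches.set(5);
        System.out.println("matching:     " + matches);
        System.out.println("not matching: " + invert(matches, maxDoc));
        // capacity, typically larger than maxDoc
        System.out.println("size():       " + matches.size());
    }
}
```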




-Hoss



Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Kelvin Tan <ke...@relevanz.com>.

Alternatively, add a dummy field-value to all documents, like doc.add(Field.Keyword("foo", "bar"))

Waste of space, but allows you to perform negated queries.

On Thu, 03 Feb 2005 19:19:15 +0100, Maik Schreiber wrote:
>> Negating a term must be combined with at least one nonnegated
>> term to return documents; in other words, it isn't possible to
>> use a query like NOT term to find all documents that don't
>> contain a term.
>>
>> So does that mean the above example wouldn't work?
>>
> Exactly. You cannot search for "-kcfileupload:jpg", you need at
> least one clause that actually _includes_ documents.
>
> Do you by chance have a field with known contents? If so, you could
> misuse that one and include it in your query (perhaps by doing
> range or wildcard/prefix search). If not, try IndexReader.terms()
> for building a Query yourself, then use that one for search.




Re: Parsing The Query: Every document that doesn't have a field containing x

Posted by Maik Schreiber <bl...@blizzy.de>.

> Negating a term must be combined with at least one nonnegated term to return
> documents; in other words, it isn't possible to use a query like NOT term to
> find all documents that don't contain a term.
> 
> So does that mean the above example wouldn't work?

Exactly. You cannot search for "-kcfileupload:jpg", you need at least one 
clause that actually _includes_ documents.

Do you by chance have a field with known contents? If so, you could misuse 
that one and include it in your query (perhaps by doing range or 
wildcard/prefix search). If not, try IndexReader.terms() for building a 
Query yourself, then use that one for search.
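
A toy model of the point above, in plain Java with no Lucene classes: a
prohibited clause can only remove documents from what a required clause has
already matched, so a query made of nothing but prohibited clauses has nothing
to remove from. ClauseModel and search are invented names, and this is only a
sketch of the behaviour described here, not how Lucene implements it.

```java
import java.util.Set;
import java.util.TreeSet;

final class ClauseModel {
    /**
     * Starts from the doc ids matched by the required clause (none if
     * required is null), then drops those matched by the prohibited clause.
     */
    static Set<Integer> search(Set<Integer> required, Set<Integer> prohibited) {
        Set<Integer> hits = new TreeSet<>();
        if (required != null) {
            hits.addAll(required);
        }
        hits.removeAll(prohibited);
        return hits;
    }
}
```

With this model, search(null, jpgDocs) is always empty, while passing the
documents matched by a field with known contents as the required set returns
every one of those documents that is not in jpgDocs.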

-- 
Maik Schreiber   *   http://www.blizzy.de

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Parsing The Query: Every document that doesn't have a field containing x

Posted by Luke Shannon <ls...@futurebrand.com>.

Hello;

I have a query that finds documents that contain fields with a specific
value.

query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());

This works well.

I would like a query that finds all documents whose kcfileupload fields
don't contain jpg.

The example I found in the book that seems to relate shows me how to find
documents without a specific term:

QueryParser parser = new QueryParser("contents", analyzer);
parser.setOperator(QueryParser.DEFAULT_OPERATOR_AND);

But then it says:

Negating a term must be combined with at least one nonnegated term to return
documents; in other words, it isn't possible to use a query like NOT term to
find all documents that don't contain a term.

So does that mean the above example wouldn't work?

The API says:

 a plus (+) or a minus (-) sign, indicating that the clause is required or
prohibited respectively;

I have been playing around with using the minus character without much luck.

Can someone point me in the right direction to figure this out?

Thanks,

Luke



