Posted to solr-user@lucene.apache.org by "Lohrenz, Steven" <St...@hmhpub.com> on 2010/11/30 12:57:17 UTC

Return Lucene DocId in Solr Results

Hi,

I was wondering how I would go about getting the lucene docid included in the results from a solr query?

I've built a QueryParser to query another solr instance and join the results of the two instances through the use of a Filter. The Filter needs the lucene docid to work. This is the only bit I'm missing right now.

Thanks,
Steve


Re: Return Lucene DocId in Solr Results

Posted by Sasank Mudunuri <sa...@gmail.com>.
Take this with a sizeable grain of salt as I haven't actually tried doing
this. But you might try using an IndexReader which it looks like you can get
from this class:

http://lucene.apache.org/solr/api/org/apache/solr/core/StandardIndexReaderFactory.html

sasank


RE: Return Lucene DocId in Solr Results

Posted by "Lohrenz, Steven" <St...@hmhpub.com>.
Hmm, I found some similar queries on stackoverflow and they did not recommend exposing the lucene docId. 

So, I guess my question becomes: What is the best way, from within my custom QParser, to take a list of solr primary keys (that were retrieved from elsewhere) and turn them into docIds? I also saw something about caching them using a Field Cache - how would I do that?

Thanks,
Steve
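
One way to picture the caching idea is as an inverted lookup table. The sketch below is only an illustration under stated assumptions, not Lucene or Solr code: it assumes you can obtain, for the current reader, an array of unique-key values indexed by doc ID (the shape a string field cache is usually described as having). The class name KeyToDocIdCache and the sample keys are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key-to-docId cache; the input array stands in for a field cache.
class KeyToDocIdCache {
    private final Map<String, Integer> docIdByKey = new HashMap<>();

    // keysByDocId[d] is the unique key of document d, or null if it has none.
    KeyToDocIdCache(String[] keysByDocId) {
        for (int docId = 0; docId < keysByDocId.length; docId++) {
            if (keysByDocId[docId] != null) {
                docIdByKey.put(keysByDocId[docId], docId);
            }
        }
    }

    // Returns the doc ID for the key, or -1 when the key is unknown.
    int docId(String key) {
        Integer id = docIdByKey.get(key);
        return id == null ? -1 : id;
    }

    public static void main(String[] args) {
        KeyToDocIdCache cache =
                new KeyToDocIdCache(new String[] {"res-9", null, "res-4"});
        System.out.println(cache.docId("res-4"));
    }
}
```

A table like this would only be valid for the reader it was built from; since doc IDs can change when the index changes, it would have to be rebuilt whenever a new reader is opened.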


Re: Return Lucene DocId in Solr Results

Posted by Erick Erickson <er...@gmail.com>.
You have to call termDocs.next() after termDocs.seek(). Something like:

termDocs.seek(term);
if (termDocs.next()) {
   // There was a term/doc match, so termDocs.doc() is valid here.
}
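
The seek-then-next pattern can be illustrated without Lucene. The sketch below is a minimal stand-in, not the Lucene API: PostingsCursor, its index map, the lookup helper, and the sample keys are all hypothetical. It mimics a cursor that sits before the first match after seek(), so doc() is only meaningful once next() has returned true.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a TermDocs-like cursor; this is NOT Lucene code.
class PostingsCursor {
    private final Map<String, int[]> index; // term text -> sorted doc IDs
    private int[] current;                  // postings of the term last sought
    private int pos;                        // -1 means "before the first doc"

    PostingsCursor(Map<String, int[]> index) {
        this.index = index;
    }

    // Positions the cursor before the first document of the term.
    void seek(String term) {
        current = index.get(term); // null when the term is not indexed
        pos = -1;
    }

    // Advances to the next document; returns false when there is none.
    boolean next() {
        if (current == null || pos + 1 >= current.length) {
            return false;
        }
        pos++;
        return true;
    }

    // Only valid after next() has returned true.
    int doc() {
        if (current == null || pos < 0) {
            throw new IllegalStateException("call next() before doc()");
        }
        return current[pos];
    }

    // Returns the first doc ID for the key, or -1 when the key is absent.
    static int lookup(PostingsCursor cursor, String key) {
        cursor.seek(key);
        return cursor.next() ? cursor.doc() : -1;
    }

    public static void main(String[] args) {
        Map<String, int[]> index = new HashMap<>();
        index.put("res-7", new int[] {42});
        PostingsCursor cursor = new PostingsCursor(index);
        System.out.println(lookup(cursor, "res-7"));   // key present
        System.out.println(lookup(cursor, "missing")); // key absent
    }
}
```

Checking the result of next() also covers the case where a primary key is simply not in the index, which the original loop does not handle.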


RE: Return Lucene DocId in Solr Results

Posted by "Lohrenz, Steven" <St...@hmhpub.com>.
I must be missing something, as I'm getting an NPE on the line: docIds[i] = termDocs.doc();
Here's what I came up with:

private int[] getDocIdsFromPrimaryKey(SolrQueryRequest req, List<Favorites> favsBeans) throws ParseException {
        // open the core & get data directory
        String indexDir = req.getCore().getIndexDir();

        FSDirectory indexDirectory = null;
        try {
            indexDirectory = FSDirectory.open(new File(indexDir));
        } catch (IOException e) {
            throw new ParseException("IOException, cannot open the index at: " + indexDir + " " + e.getMessage());
        }

        //String pkQueryString = "resourceId:" + favBean.getResourceId();
        //Query pkQuery = new QueryParser(Version.LUCENE_CURRENT, "resourceId", new StandardAnalyzer()).parse(pkQueryString);

        IndexSearcher searcher = null;
        TopScoreDocCollector collector = null;
        IndexReader indexReader = null;
        TermDocs termDocs = null;

        try {
            searcher = new IndexSearcher(indexDirectory, true);
            indexReader = new FilterIndexReader(searcher.getIndexReader());
            termDocs = indexReader.termDocs();
        } catch (IOException e) {
            throw new ParseException("IOException, cannot open the index at: " + indexDir + " " + e.getMessage());
        }
        
        int[] docIds = new int[favsBeans.size()];
        int i = 0;
        for(Favorites favBean: favsBeans) {
            Term term = new Term("resourceId", favBean.getResourceId());
            try {
                termDocs.seek(term);
                docIds[i] = termDocs.doc();
            } catch (IOException e) {
                throw new ParseException("IOException, cannot seek to the primary key " + favBean.getResourceId() + " in : " + indexDir + " " + e.getMessage());
            }
            //ScoreDoc[] hits = collector.topDocs().scoreDocs;
            //if(hits != null && hits[0] != null) {

            i++;
            //}
        }
        
        Arrays.sort(docIds);
        return docIds;
    }

Thanks,
Steve

Re: Return Lucene DocId in Solr Results

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Re: Return Lucene DocId in Solr Results
: 
: Ahhh, you're already down in Lucene. That makes things easier...
: 
: See TermDocs. Particularly seek(Term). That'll directly access the indexed
: unique key rather than having to form a bunch of queries.

You should also sort your "keys" lexicographically before you loop
over them - that will let you reuse the same Term enumerator and always
seek forward (single pass).


-Hoss
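
The single-pass idea can be sketched with plain arrays. This is a hypothetical illustration, not Lucene code: the sorted terms array stands in for the term dictionary of the unique-key field, docIds is a parallel array invented for the example, and the sketch assumes the term order matches String.compareTo. Because the keys are sorted in the same order, the cursor t never has to move backwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a single forward pass; this is NOT Lucene code.
class ForwardSeek {
    // terms must be sorted ascending; docIds[i] is the doc holding terms[i].
    static int[] resolve(String[] terms, int[] docIds, List<String> keys) {
        String[] sortedKeys = keys.toArray(new String[0]);
        Arrays.sort(sortedKeys); // same String.compareTo order as the terms

        List<Integer> found = new ArrayList<>();
        int t = 0; // cursor into terms; it only ever moves forward
        for (String key : sortedKeys) {
            while (t < terms.length && terms[t].compareTo(key) < 0) {
                t++;
            }
            if (t < terms.length && terms[t].equals(key)) {
                found.add(docIds[t]);
            }
        }

        int[] result = new int[found.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = found.get(i);
        }
        Arrays.sort(result); // return the doc IDs in ascending order
        return result;
    }

    public static void main(String[] args) {
        String[] terms = {"a1", "b2", "c3", "d4"};
        int[] docIds = {7, 2, 9, 4};
        int[] ids = resolve(terms, docIds, Arrays.asList("d4", "zz", "b2"));
        System.out.println(Arrays.toString(ids));
    }
}
```

Keys that are not found are skipped rather than left as zeros, so the returned array contains only real doc IDs.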

Re: Return Lucene DocId in Solr Results

Posted by Erick Erickson <er...@gmail.com>.
Ahhh, you're already down in Lucene. That makes things easier...

See TermDocs. Particularly seek(Term). That'll directly access the indexed
unique key rather than having to form a bunch of queries.

Best
Erick



RE: Return Lucene DocId in Solr Results

Posted by "Lohrenz, Steven" <St...@hmhpub.com>.
I would be interested in hearing about some ways to improve the algorithm. I have done a very straightforward Lucene query within a loop to get the docIds.

Here's what I did to get it working, where favsBeans are objects returned from a query of the second core, but there is probably a better way to do it:

private int[] getDocIdsFromPrimaryKey(SolrQueryRequest req, List<Favorites> favsBeans) throws ParseException {
        // open the core & get data directory
        String indexDir = req.getCore().getIndexDir();
        FSDirectory index = null;
        try {
            index = FSDirectory.open(new File(indexDir));
        } catch (IOException e) {
            throw new ParseException("IOException, cannot open the index at: " + indexDir + " " + e.getMessage());
        }
        
        int[] docIds = new int[favsBeans.size()];
        int i = 0;
        for(Favorites favBean: favsBeans) {
            String pkQueryString = "resourceId:" + favBean.getResourceId();
            Query pkQuery = new QueryParser(Version.LUCENE_CURRENT, "resourceId", new StandardAnalyzer()).parse(pkQueryString);

            IndexSearcher searcher = null;
            TopScoreDocCollector collector = null;
            try {
                searcher = new IndexSearcher(index, true);
                collector = TopScoreDocCollector.create(1, true);
                searcher.search(pkQuery, collector);
            } catch (IOException e) {
                throw new ParseException("IOException, cannot search the index at: " + indexDir + " " + e.getMessage());
            }

            ScoreDoc[] hits = collector.topDocs().scoreDocs;
            if(hits != null && hits.length > 0) {
                docIds[i] = hits[0].doc;
                i++;
            }
        }
        
        Arrays.sort(docIds);
        return docIds;
    }


Re: Return Lucene DocId in Solr Results

Posted by Erick Erickson <er...@gmail.com>.
Sounds good, especially because your old scenario was fragile. The doc IDs in
your first core could change as a result of a single doc deletion and optimize.
So the doc IDs stored in the second core would then be wrong...

Your user-defined unique key is definitely a better way to go. There are some
tricks you could try if there are performance issues....

Best
Erick
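
The fragility described above can be shown with a toy model. This is a hypothetical sketch, not how Lucene stores an index: it simply treats a document's ID as its position in a list and models "delete plus optimize" as squeezing the deleted slot out. The class DocIdShift and the sample keys are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of positional doc IDs; NOT how Lucene stores an index.
class DocIdShift {
    // In this model a document's ID is simply its position in the list.
    static int docIdOf(List<String> segment, String key) {
        return segment.indexOf(key);
    }

    // Models delete + optimize: the deleted slot is squeezed out.
    static List<String> deleteAndCompact(List<String> segment, String key) {
        List<String> compacted = new ArrayList<>(segment);
        compacted.remove(key);
        return compacted;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("res-1", "res-2", "res-3");
        List<String> after = deleteAndCompact(before, "res-2");
        System.out.println("res-3 before: " + docIdOf(before, "res-3"));
        System.out.println("res-3 after:  " + docIdOf(after, "res-3"));
    }
}
```

In this model the user-defined key "res-3" still identifies the same document after the compaction, while its positional ID does not, which is why storing the key in the second core is the safer reference.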


RE: Return Lucene DocId in Solr Results

Posted by "Lohrenz, Steven" <St...@hmhpub.com>.
I know the doc ids from one core have nothing to do with the other. I was going to take the docId returned from the first core in the solr results and store it in the second core, so that the second core knows about the doc ids from the first core. That way, when you query the second core from the Filter in the first core, you get back a set of data that includes the docId of the first-core document each entry relates to.

I have backed off from this approach and now have a user-defined primary key in the firstCore, which is stored as the reference in the secondCore. When the filter performs the search, it queries the firstCore for each primary key and gets the lucene docId from the returned doc.

Thanks,
Steve


Re: Return Lucene DocId in Solr Results

Posted by Erick Erickson <er...@gmail.com>.
On the face of it, this doesn't make sense, so perhaps you can explain a bit.
The doc IDs from one Solr instance have no relation to the doc IDs from another
Solr instance. So anything that uses doc IDs from one Solr instance to create a
filter on another instance doesn't seem to be something you'd want to do...

Which may just mean I don't understand what you're trying to do. Can you back
up a bit and describe the higher-level problem? This seems like it may be an XY
problem, see:
http://people.apache.org/~hossman/#xyproblem

Best
Erick
