Posted to solr-user@lucene.apache.org by Marc Sturlese <ma...@gmail.com> on 2009/07/28 12:49:07 UTC

Re: update some index documents after indexing process is done with DIH

Ok, but if I handle it in a newSearcher listener it will be executed every
time I reload a core, won't it? The thing is that I want to use an
IndexReader to load some doc fields of the index into a HashMap and, depending
on the values of those fields, modify other docs. It's very memory
consuming (I have tested it with a simple Lucene script). That's why I wanted
to do it just after the indexing process.

My ideal case would be to do it in the commit function of
DirectUpdateHandler2.java just before
writer.optimize(cmd.maxOptimizeSegments); is executed. But I don't want to
mess with that code... so I am trying to find the best way to do it as a plugin
rather than as a hack, if possible.
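
To make the two-pass idea concrete, here is a minimal sketch in plain Java. It does not use Lucene or Solr: documents are modelled as field maps, and the field names ("id", "group", "price", "cheapest") and the rule "flag the cheapest doc of each group" are invented purely for illustration. Pass 1 keeps only one value per group in the HashMap, which is what keeps memory use below holding every document at once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for stored index documents: each doc is a field map.
// Nothing here is a Lucene or Solr class.
class TwoPassSketch {

    /** Pass 1: load one value per group into a HashMap (lowest price seen). */
    static Map<String, Integer> collectMinPrice(List<Map<String, Object>> docs) {
        Map<String, Integer> minByGroup = new HashMap<>();
        for (Map<String, Object> doc : docs) {
            String group = (String) doc.get("group");
            Integer price = (Integer) doc.get("price");
            minByGroup.merge(group, price, Integer::min);
        }
        return minByGroup;
    }

    /** Pass 2: build modified copies of the docs whose new field depends on other docs. */
    static List<Map<String, Object>> buildUpdates(List<Map<String, Object>> docs,
                                                  Map<String, Integer> minByGroup) {
        List<Map<String, Object>> updates = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            int price = (Integer) doc.get("price");
            int groupMin = minByGroup.get((String) doc.get("group"));
            if (price == groupMin) {
                Map<String, Object> copy = new HashMap<>(doc);
                copy.put("cheapest", Boolean.TRUE); // the field added after indexing
                updates.add(copy);
            }
        }
        return updates;
    }

    /** Small helper that builds one illustrative document. */
    static Map<String, Object> doc(String id, String group, int price) {
        Map<String, Object> d = new HashMap<>();
        d.put("id", id);
        d.put("group", group);
        d.put("price", price);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc("1", "a", 10));
        docs.add(doc("2", "a", 7));
        docs.add(doc("3", "b", 5));
        List<Map<String, Object>> updates = buildUpdates(docs, collectMinPrice(docs));
        System.out.println(updates.size() + " docs would be re-added");
    }
}
```

In a real index the second pass would hand each modified copy back to Solr as a replacement document; how to do that from a plugin is what the rest of the thread discusses.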

Thanks in advance


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> It is best handled as a 'newSearcher' listener in solrconfig.xml.
> onImportEnd is invoked before committing
> 
> On Tue, Jul 28, 2009 at 3:13 PM, Marc Sturlese<ma...@gmail.com>
> wrote:
>>
>> Hey there,
>> I would like to be able to do something like this: after the indexing
>> process is done with DIH, I would like to open an IndexReader, iterate over
>> all docs, modify some of them depending on others and delete some others. I
>> can easily do this by coding directly against Lucene, but I would like to know
>> if there's a way to do it with Solr using the SolrDocument or
>> SolrInputDocument classes.
>> I have thought of using SolrJ or the DIH listener onImportEnd, but I am not
>> sure if I can get an IndexReader in there.
>> Any advice?
>> Thanks in advance
>> --
>> View this message in context:
>> http://www.nabble.com/update-some-index-documents-after-indexing-process-is-done-with-DIH-tp24695947p24695947.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 



Re: update some index documents after indexing process is done with DIH

Posted by Chris Hostetter <ho...@fucit.org>.
: What is confusing me now is that I have to implement my logic in

you're certainly in a fuzzy grey area here ... none of this stuff was
designed for the kind of thing you're doing.

: But in processCommit, having access to the core I can get the IndexReader
: but I still don't know how to get the IndexWriter and SolrInputDocuments in

you don't get direct access to the IndexWriter ... instead your
UpdateProcessor uses the SolrCore to get an UpdateRequestProcessorChain to
add (ie: replace) the SolrInputDocuments you made based on what you saw in
the original SolrInputDocuments.

for a second i was thinking that you'd have to worry about checking some
threadlocal variable to keep yourself from going into an infinite loop,
but then i remembered that you can configure named
UpdateRequestProcessorChains ... so your default chain can use your custom
component, and you can create a simple chain (that bypasses your custom
component) for your component to call processAdd()/processCommit() on.
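
The two-chain arrangement can be sketched in plain Java as below. These are stand-ins, not the Solr classes: `Processor` plays the role of an UpdateRequestProcessor, the map plays the role of the named chains, and the chain names "default" and "simple" as well as the "-derived" suffix are assumptions made up for the example. The point it shows is only that the custom step sends its own adds to a chain that does not contain it, so it never re-enters itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-ins only: none of these types are Solr classes.
class NamedChainSketch {

    interface Processor {
        void processAdd(String docId);
    }

    /** Terminal step: pretends to write the document to the index. */
    static class IndexingProcessor implements Processor {
        final List<String> indexed = new ArrayList<>();

        public void processAdd(String docId) {
            indexed.add(docId);
        }
    }

    /** Custom step: for every doc it sees, it adds a derived doc via the bypass chain. */
    static class CustomProcessor implements Processor {
        private final Processor next;
        private final Map<String, Processor> chains;

        CustomProcessor(Processor next, Map<String, Processor> chains) {
            this.next = next;
            this.chains = chains;
        }

        public void processAdd(String docId) {
            // Using chains.get("default") here would call this method again, forever.
            chains.get("simple").processAdd(docId + "-derived");
            next.processAdd(docId);
        }
    }

    /** Wires up both chains and pushes the given ids through the default one. */
    static List<String> run(String... ids) {
        IndexingProcessor index = new IndexingProcessor();
        Map<String, Processor> chains = new HashMap<>();
        chains.put("simple", index);                               // bypasses the custom step
        chains.put("default", new CustomProcessor(index, chains)); // custom step, then index
        for (String id : ids) {
            chains.get("default").processAdd(id);
        }
        return index.indexed;
    }

    public static void main(String[] args) {
        System.out.println(run("doc1", "doc2"));
    }
}
```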



-Hoss


Re: update some index documents after indexing process is done with DIH

Posted by Marc Sturlese <ma...@gmail.com>.
Hoss, I see what you mean. I am trying to implement a CustomUpdateProcessor,
following the page here:
http://wiki.apache.org/solr/UpdateRequestProcessor
What is confusing me now is that I have to implement my logic in
processCommit, as you said:

>>you'll still need the "double commit" (once so you can see the 
>>main changes, and once so the rest of the world can see your 
>>modifications) but you can execute them both directly in your 
>>processCommit(CommitUpdateCommand)

I have noticed that in processAdd you have access to the concrete
SolrInputDocument you are going to add:
SolrInputDocument doc = cmd.getSolrInputDocument();

But in processCommit, having access to the core, I can get the IndexReader,
but I still don't know how to get the IndexWriter and SolrInputDocuments in
there.
My idea is to do something like:

    @Override
    public void processCommit(CommitUpdateCommand cmd) throws IOException {
      // first commit, so that I can see the changes made by the import
      // open the reader, iterate over it and create a list of SolrDocuments
      // close the reader
      // open a writer and update the docs in the list
      // close the writer; second commit, which shows my changes to the world

      if (next != null)
        next.processCommit(cmd);

    }

As I understand the process, the commit command will be sent to
DirectUpdateHandler2, which will properly perform the commit via the
UpdateRequestProcessor.
Am I on the right track? I haven't dealt with a CustomUpdateProcessor for
doing something after a commit is executed, so I am a bit confused...
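
The order of operations in that outline can be checked with a toy model. The sketch below is plain Java with no Solr dependency: `ToyIndex` is a hypothetical stand-in that only separates "pending" from "visible" documents and logs each call, and upper-casing the value is an invented placeholder for the real modification. It is meant to show the sequence commit, read, re-add, commit, not how Solr implements any of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DoubleCommitSketch {

    /** Hypothetical index: adds stay pending until commit() makes them visible. */
    static class ToyIndex {
        final Map<String, String> pending = new LinkedHashMap<>();
        final Map<String, String> visible = new LinkedHashMap<>();
        final List<String> log = new ArrayList<>();

        void add(String id, String value) {
            pending.put(id, value); // same id replaces the earlier document on commit
            log.add("add " + id);
        }

        void commit() {
            visible.putAll(pending);
            pending.clear();
            log.add("commit");
        }

        Map<String, String> snapshot() {
            log.add("read");
            return new LinkedHashMap<>(visible);
        }
    }

    /** Mirrors the outline: commit, read, re-add the modified docs, commit again. */
    static void processCommit(ToyIndex index) {
        index.commit();                              // 1st commit: imported docs become readable
        Map<String, String> seen = index.snapshot(); // iterate over what is now visible
        for (Map.Entry<String, String> e : seen.entrySet()) {
            index.add(e.getKey(), e.getValue().toUpperCase()); // placeholder modification
        }
        index.commit();                              // 2nd commit: publish the modifications
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("1", "imported");
        processCommit(index);
        System.out.println(index.visible + " " + index.log);
    }
}
```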

Thanks in advance.







Re: update some index documents after indexing process is done with DIH

Posted by Chris Hostetter <ho...@fucit.org>.
This thread all sounds really kludgy ... among other things the
newSearcher listener is going to need to somehow keep track of when it
was called as a result of a "real" commit, vs when it was called as the
result of a commit it itself triggered to make changes.

wouldn't an easier place to implement this logic be in an UpdateProcessor?  
you'll still need the "double commit" (once so you can see the 
main changes, and once so the rest of the world can see your 
modifications) but you can execute them both directly in your 
processCommit(CommitUpdateCommand) method (so you don't have to worry 
about being able to tell them apart)
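
For comparison, this is roughly the bookkeeping the listener approach would need. The sketch is plain Java and `GuardedListener` is a hypothetical stand-in, not the Solr listener interface. It uses an AtomicBoolean rather than a thread-local, on the assumption that the callback may arrive on a different thread from the one that issued the commit. One limitation is visible in the code: a "real" commit that lands while the listener's own commit is in flight would be skipped as well.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ReentrancyGuardSketch {

    /** Hypothetical stand-in for a newSearcher listener. */
    static class GuardedListener {
        private final AtomicBoolean ownCommitInFlight = new AtomicBoolean(false);
        int postProcessRuns = 0;
        int skipped = 0;

        /** Called after every commit, including the one this listener triggers itself. */
        void onNewSearcher(Runnable commit) {
            if (!ownCommitInFlight.compareAndSet(false, true)) {
                skipped++; // this event was caused by our own commit: do nothing
                return;
            }
            try {
                postProcessRuns++; // the document modifications would happen here
                commit.run();      // this commit fires onNewSearcher again
            } finally {
                ownCommitInFlight.set(false);
            }
        }
    }

    /** Simulates one real commit whose listener issues a second commit. */
    static int[] demo() {
        GuardedListener listener = new GuardedListener();
        Runnable[] commit = new Runnable[1];
        commit[0] = () -> listener.onNewSearcher(commit[0]);
        listener.onNewSearcher(commit[0]);
        return new int[] { listener.postProcessRuns, listener.skipped };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("runs=" + counts[0] + " skipped=" + counts[1]);
    }
}
```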

: Date: Thu, 30 Jul 2009 10:14:16 +0530
: From: Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>
: Reply-To: solr-user@lucene.apache.org, noble.paul@gmail.com
: To: solr-user@lucene.apache.org
: Subject: Re: update some index documents after indexing process is done with 
:     DIH
: 
: If you make your EventListener implement SolrCoreAware you can get
: hold of the core on inform. Use that to get hold of the
: SolrIndexWriter
: 



-Hoss

Re: update some index documents after indexing process is done with DIH

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>.
If you make your EventListener implement SolrCoreAware you can get
hold of the core on inform. Use that to get hold of the
SolrIndexWriter
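
The callback pattern looks roughly like the sketch below. It is plain Java with stand-in types: `Core` and `CoreAware` are hypothetical and only mimic the shape of the idea (the container calls inform once, the plugin keeps the reference for later events). They are not the Solr classes, and how to reach a writer from the real core is not shown.

```java
class CoreAwareSketch {

    /** Stand-in for the core object handed to the plugin. */
    static class Core {
        final String name;

        Core(String name) {
            this.name = name;
        }
    }

    /** Stand-in for the "tell me about the core" callback described above. */
    interface CoreAware {
        void inform(Core core);
    }

    /** A listener that remembers the core so later event callbacks can use it. */
    static class MyListener implements CoreAware {
        private Core core;

        public void inform(Core core) {
            this.core = core;
        }

        String onNewSearcher() {
            if (core == null) {
                throw new IllegalStateException("inform() has not been called yet");
            }
            return "post-processing core " + core.name;
        }
    }

    public static void main(String[] args) {
        MyListener listener = new MyListener();
        listener.inform(new Core("core0")); // done once by the container in this model
        System.out.println(listener.onNewSearcher());
    }
}
```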


-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com

Re: update some index documents after indexing process is done with DIH

Posted by Marc Sturlese <ma...@gmail.com>.
From the newSearcher(..) of a CustomEventListener that extends
AbstractSolrEventListener I can access the SolrIndexSearcher and all core
properties, but I can't get a SolrIndexWriter. Do you know how I can get a
SolrIndexWriter from there? That way I would be able to modify the documents (I
need to modify them depending on the values of other documents, which is why I
can't do it with DIH delta-import).
Thanks in advance


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> On Tue, Jul 28, 2009 at 5:17 PM, Marc Sturlese<ma...@gmail.com>
> wrote:
>>
>> That really sounds like the best way to reach my goal. How could I invoke a
>> listener from the newSearcher event? Would it be something like:
>>    <listener event="newSearcher" class="solr.QuerySenderListener">
>>      <arr name="queries">
>>        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
>>        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
>>        <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
>>      </arr>
>>    </listener>
>>    <listener event="newSearcher" class="solr.MyCustomListener"/>
>>
>> And MyCustomListener would be the class that opens the reader:
>>
>>        RefCounted<SolrIndexSearcher> searchHolder = null;
>>        try {
>>          searchHolder = dataImporter.getCore().getSearcher();
>>          IndexReader reader = searchHolder.get().getReader();
>>
>>          // Here I iterate over the reader doing document modifications
>>
>>        } catch (Exception ex) {
>>          LOG.info("error");
>>        } finally {
>>          // always release the searcher reference, even after an error
>>          if (searchHolder != null) searchHolder.decref();
>>        }
> 
> You may not be able to access the DIH API from a newSearcher event.
> But the API would give you the searcher directly as a method
> parameter.
>>
>> Finally, to access documents and add fields to some of them, I have
>> thought of using the SolrDocument classes. Can you please point me to where
>> something similar is done in the Solr source (I mean the creation of
>> SolrDocuments and their conversion to proper Lucene documents)?
>>
>> Does this way of reaching the goal make sense?
>>
>> Thanks in advance
>>
>>
>>
>> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>>
>>> when a core is reloaded the event fired is firstSearcher. newSearcher
>>> is fired when a commit happens
>>>
>>>

-- 
View this message in context: http://www.nabble.com/update-some-index-documents-after-indexing-process-is-done-with-DIH-tp24695947p24722111.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: update some index documents after indexing process is done with DIH

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>.
On Tue, Jul 28, 2009 at 5:17 PM, Marc Sturlese<ma...@gmail.com> wrote:
>
> That really sounds like the best way to reach my goal. How could I invoke a
> listener from the newSearcher? Would it be something like:
>    <listener event="newSearcher" class="solr.QuerySenderListener">
>      <arr name="queries">
>        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
>        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
>        <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
>      </arr>
>    </listener>
>    <listener event="newSearcher" class="solr.MyCustomListener"/>
>
> And MyCustomListener would be the class that opens the reader:
>
>        RefCounted<SolrIndexSearcher> searchHolder = null;
>        try {
>          searchHolder = dataImporter.getCore().getSearcher();
>          IndexReader reader = searchHolder.get().getReader();
>
>          // Here I iterate over the reader doing document modifications
>
>        } catch (Exception ex) {
>          LOG.info("error");
>        } finally {
>          if (searchHolder != null) searchHolder.decref();
>        }

You may not be able to access the DIH API from a newSearcher event,
but the listener API gives you the searcher directly as a method
parameter.
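
A minimal sketch of what "the searcher as a method parameter" could look like. The Searcher and SearcherListener types below are stand-ins written for this example so that it compiles without Solr on the classpath; in Solr the corresponding types should be SolrIndexSearcher and SolrEventListener, whose newSearcher(newSearcher, currentSearcher) callback is what a custom listener would implement. The class FieldCollectingListener and the field name "group" are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for SolrIndexSearcher (assumption): stored fields by internal doc id.
class Searcher {
    private final List<Map<String, String>> docs;

    Searcher(List<Map<String, String>> docs) {
        this.docs = docs;
    }

    int maxDoc() {
        return docs.size();
    }

    Map<String, String> doc(int id) {
        return docs.get(id);
    }
}

// Stand-in for the callback shape of SolrEventListener (assumption).
interface SearcherListener {
    // currentSearcher is the searcher being replaced; it may be null.
    void newSearcher(Searcher newSearcher, Searcher currentSearcher);
}

// Hypothetical listener: reads one field from every document of the
// searcher it is handed, without asking the core for a searcher itself.
class FieldCollectingListener implements SearcherListener {
    final Map<Integer, String> groupByDoc = new HashMap<>();

    @Override
    public void newSearcher(Searcher newSearcher, Searcher currentSearcher) {
        groupByDoc.clear();
        for (int id = 0; id < newSearcher.maxDoc(); id++) {
            String group = newSearcher.doc(id).get("group");
            if (group != null) {
                groupByDoc.put(id, group);
            }
        }
    }
}

class ListenerDemo {
    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        Map<String, String> first = new HashMap<>();
        first.put("group", "a");
        docs.add(first);
        docs.add(new HashMap<String, String>()); // document without a "group" field

        FieldCollectingListener listener = new FieldCollectingListener();
        listener.newSearcher(new Searcher(docs), null);
        System.out.println(listener.groupByDoc);
    }
}
```

Because the searcher arrives as an argument, the listener in this sketch needs neither dataImporter nor the RefCounted getSearcher()/decref() pair from the quoted snippet.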
>
> Finally, to access documents and add fields to some of them, I have
> thought of using the SolrDocument classes. Can you please point me to where
> something similar is done in the Solr source (I mean the creation of
> SolrDocuments and their conversion to proper Lucene documents)?
>
> Does this way of reaching the goal make sense?
>
> Thanks in advance
>
>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com

Re: update some index documents after indexing process is done with DIH

Posted by Marc Sturlese <ma...@gmail.com>.
That really sounds like the best way to reach my goal. How could I invoke a
listener from the newSearcher? Would it be something like:
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
      </arr>
    </listener>
    <listener event="newSearcher" class="solr.MyCustomListener"/>

And MyCustomListener would be the class that opens the reader:

        RefCounted<SolrIndexSearcher> searchHolder = null;
        try {
          searchHolder = dataImporter.getCore().getSearcher();
          IndexReader reader = searchHolder.get().getReader();

          // Here I iterate over the reader doing document modifications

        } catch (Exception ex) {
          LOG.info("error");
        } finally {
          // Always release the searcher reference, even on failure
          if (searchHolder != null) searchHolder.decref();
        }

Finally, to access documents and add fields to some of them, I have
thought of using the SolrDocument classes. Can you please point me to where
something similar is done in the Solr source (I mean the creation of
SolrDocuments and their conversion to proper Lucene documents)?

Does this way of reaching the goal make sense?

Thanks in advance
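
The "load some fields into a HashMap, then modify or delete documents depending on the others" idea can be sketched independently of Solr. This is only an illustration under assumptions: documents are plain maps, and the field names "id" and "group", the class PostIndexPlanner, and the rule itself (keep the first document of each group and record the group size on it, delete the rest) are all hypothetical. The sketch only produces a plan. Since a Lucene document cannot be changed in place, applying an "update" entry would presumably mean re-adding the whole document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PostIndexPlanner {
    // What to do once the scan over the index is finished.
    static final class Plan {
        final List<String> deleteIds = new ArrayList<>();
        final Map<String, Integer> groupSizeById = new HashMap<>();
    }

    static Plan plan(List<Map<String, String>> docs) {
        // Pass 1: load the field of interest into a HashMap (group -> count).
        Map<String, Integer> sizeByGroup = new HashMap<>();
        for (Map<String, String> doc : docs) {
            String group = doc.get("group");
            if (group != null) {
                sizeByGroup.merge(group, 1, Integer::sum);
            }
        }

        // Pass 2: decide per document, based on what the other documents hold.
        Plan plan = new Plan();
        Set<String> seen = new HashSet<>();
        for (Map<String, String> doc : docs) {
            String group = doc.get("group");
            if (group == null) {
                continue; // documents without a group are left untouched
            }
            if (seen.add(group)) {
                // First document of its group: keep it and annotate it.
                plan.groupSizeById.put(doc.get("id"), sizeByGroup.get(group));
            } else {
                // Later documents of the same group are marked for deletion.
                plan.deleteIds.add(doc.get("id"));
            }
        }
        return plan;
    }

    // Small helper to build a document for the demo.
    static Map<String, String> doc(String id, String group) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("group", group);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("1", "a"));
        docs.add(doc("2", "a"));
        docs.add(doc("3", "b"));
        Plan p = plan(docs);
        System.out.println("delete: " + p.deleteIds + ", update: " + p.groupSizeById);
    }
}
```

Only the group counts and the set of seen groups stay in memory during the scan, which may help with the memory concern mentioned earlier in the thread.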
    


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> when a core is reloaded the event fired is firstSearcher. newSearcher
> is fired when a commit happens
> 

-- 
View this message in context: http://www.nabble.com/update-some-index-documents-after-indexing-process-is-done-with-DIH-tp24695947p24697751.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: update some index documents after indexing process is done with DIH

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>.
When a core is reloaded, the event fired is firstSearcher. newSearcher
is fired when a commit happens.
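
A toy model of that distinction, not Solr code: ModelCore and WarmListener are stand-ins invented for this sketch. It assumes that a (re)loaded core starts without a current searcher, so the first searcher it opens notifies the firstSearcher listeners, while every searcher opened by a later commit notifies the newSearcher listeners.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the listener callback (assumption, not Solr's real type).
interface WarmListener {
    void newSearcher(String newSearcher, String currentSearcher);
}

// Toy model of a core: it only tracks which searcher is current and
// which listener list to notify when a new one is opened.
class ModelCore {
    private final List<WarmListener> firstSearcherListeners = new ArrayList<>();
    private final List<WarmListener> newSearcherListeners = new ArrayList<>();
    private String current; // null until the first searcher is opened
    private int generation;

    void onFirstSearcher(WarmListener l) {
        firstSearcherListeners.add(l);
    }

    void onNewSearcher(WarmListener l) {
        newSearcherListeners.add(l);
    }

    // Startup or reload: the core starts over with no current searcher.
    void load() {
        current = null;
        openSearcher();
    }

    // Commit: a new searcher replaces the current one.
    void commit() {
        openSearcher();
    }

    private void openSearcher() {
        String next = "searcher-" + (++generation);
        List<WarmListener> toFire =
            (current == null) ? firstSearcherListeners : newSearcherListeners;
        for (WarmListener l : toFire) {
            l.newSearcher(next, current);
        }
        current = next;
    }
}

class EventDemo {
    public static void main(String[] args) {
        ModelCore core = new ModelCore();
        core.onFirstSearcher((n, c) -> System.out.println("firstSearcher: " + n));
        core.onNewSearcher((n, c) -> System.out.println("newSearcher: " + n + " replaces " + c));
        core.load();   // startup
        core.commit(); // end of an import that commits
        core.load();   // reload
    }
}
```

In this model, a listener registered only for newSearcher runs after each commit and is skipped on reload, which is the behaviour described above.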


On Tue, Jul 28, 2009 at 4:19 PM, Marc Sturlese<ma...@gmail.com> wrote:
>
> Ok, but if I handle it in a newSearcher listener it will be executed every
> time I reload a core, won't it? The thing is that I want to use an
> IndexReader to load some doc fields of the index into a HashMap and,
> depending on the values of some fields, modify other docs. It's very memory
> consuming (I have tested it with a simple Lucene script). That's why I wanted
> to do it just after the indexing process.
>
> My ideal case would be to do it in the commit function of
> DirectUpdateHandler2.java, just before
> writer.optimize(cmd.maxOptimizeSegments); is executed. But I don't want to
> mess with that code, so I am trying to find the best way to do it as a plugin
> rather than a hack, if possible.
>
> Thanks in advance
>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com