Posted to solr-user@lucene.apache.org by Jozef Vilcek <jo...@gmail.com> on 2012/08/29 11:27:15 UTC

Re: Custom close to index metadata / pass commit data to writer.commit

Hi,

I just wanted to check whether anyone knows what the plans are for this issue:
https://issues.apache.org/jira/browse/SOLR-2701

It is marked for 4.0-Alpha, and the Beta is already out.
Can anyone tell whether it is planned to be part of the 4.0 release?

Best,
Jozef

On Sun, Jun 24, 2012 at 1:18 AM, Erick Erickson <er...@gmail.com> wrote:
> see: https://issues.apache.org/jira/browse/SOLR-2701.
>
> But there's an easier alternative. Just have a _very special_ document
> with a known <uniqueKey> that you index at the end of the run and that
> 1> has no fields in common with any other document (except <uniqueKey>)
> 2> contains whatever data you want to carry around in whatever format you want.
>
> Now whenever you query for that document by ID, you get your info. And
> since you can't search the doc until after it's been committed, you know
> that the preceding documents have all been persisted....
>
> Of course, whenever you send a new version of the doc it will overwrite the
> previous one, since it has the same <uniqueKey>.
>
> Best
> Erick
>
> On Fri, Jun 22, 2012 at 5:34 AM, Jozef Vilcek <jo...@gmail.com> wrote:
>> Hi everyone,
>>
>> I am looking for a way to store some custom data very close to /
>> within the index. I have found that it is possible to pass commit
>> "user" data to IndexWriter:
>> http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexWriter.html#commit(java.util.Map)
>> which, from what I understand, is stored somewhere close to the segments
>> "metadata" such as index version, generation, ...
>>
>> Now, I see no easy way to accumulate and pass along such data with
>> Solr 3.6. DirectUpdateHandler2 commits implicitly via close rather
>> than invoking the commit API. I can extend DirectUpdateHandler2 and
>> alter the closeWriter method, but I am still not clear how to pass
>> along request-level params, which are not available at the
>> DirectUpdateHandler2 level. It seems that passing commitData is not
>> supported (maybe not wanted, by design) and is not going to be: when
>> I look at Solr trunk, I see the implicit commit removed and
>> writer.commit called with commitData, but no easy way to pass
>> custom commit data or to hook in.
>>
>> Any recommendations for how to store some data close to the index?
>>
>> To shed some light on why I want this: basically, I want to store
>> some kind of timestamp there, which defines what is already in the
>> index with respect to updates fed from the external world. My
>> index is replicated to another index instance in a different data center
>> (serving traffic as well). When the default document feed in DC1 goes south
>> for some reason, the backup in DC2 kicks in to keep updates alive, but
>> it has to know where the feed should start from. That would be the
>> timestamp stored and replicated with the index.
>>
>> Many thanks in advance.
>>
>> Best,
>> Jozef
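
The "very special document" alternative quoted above could be sketched roughly as follows. This is only an illustration, not code from the thread: it builds a Solr XML <add> update message for a single marker document carrying the feed timestamp. The marker id `__feed_marker__`, the field name `feed_marker_timestamp`, and the class and method names are all hypothetical; the field would have to exist in the schema (or match a dynamic field), and the uniqueKey field is assumed here to be called `id`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MarkerDocument {

    // Hypothetical uniqueKey value reserved for the marker document.
    public static final String MARKER_ID = "__feed_marker__";

    /** Escapes the characters that are special in XML text and attribute values. */
    public static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a Solr XML add message containing one document with the given fields. */
    public static String buildAddMessage(String uniqueKeyField, String id,
                                         Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        appendField(xml, uniqueKeyField, id);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            appendField(xml, e.getKey(), e.getValue());
        }
        xml.append("</doc></add>");
        return xml.toString();
    }

    private static void appendField(StringBuilder xml, String name, String value) {
        xml.append("<field name=\"").append(escapeXml(name)).append("\">")
           .append(escapeXml(value)).append("</field>");
    }

    public static void main(String[] args) {
        // Assumed field name and example timestamp; adjust to the real schema.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("feed_marker_timestamp", "2012-06-22T05:34:00Z");
        System.out.println(buildAddMessage("id", MARKER_ID, fields));
    }
}
```

The resulting message would be posted to the update handler as the last document of a feed run, followed by a commit. Because the marker always uses the same uniqueKey value, each run overwrites the previous marker, and the backup feed in the other data center can query for that id on its replicated index to find out where to resume.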

Re: Custom close to index metadata / pass commit data to writer.commit

Posted by Erick Erickson <er...@gmail.com>.
You have to look at the "Resolution" entry. It's currently "unresolved", so
it hasn't been committed.

Best
Erick
