Posted to java-user@lucene.apache.org by Robert Koberg <ro...@koberg.com> on 2010/01/31 17:51:35 UTC

best way to compare Documents

Hi,

Just coming back to Lucene after a few years.

Is there some convenient way to compare Lucene Documents?

I want to check if I should update a document based on whether field values have changed and whether fields have been added or removed. 

Is it as simple as:

newDoc.equals(oldDoc)

?

I don't need to create the newDoc first, so I could compare by field. I am creating Fields like so:

new Field("modified", modified, Field.Store.YES, Field.Index.NOT_ANALYZED)

So, would it be better to:
* check oldDoc's field count against the desired field count of the document to be created, and
* loop through the fields and compare values

?

Is there a better way to create Fields and/or Documents for this type of thing?

thanks,
-Rob



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: best way to compare Documents

Posted by Ian Lea <ia...@gmail.com>.
> ...
> Is there some convenient way to compare Lucene Documents?

Not that I know of.

> I want to check if I should update a document based on whether field values have changed and whether fields have been added or removed.
>
> Is it as simple as:
>
> newDoc.equals(oldDoc)

No!

> I don't need to create the newDoc first, so I could compare by field. I am creating Fields like so:
>
> new Field("modified", modified, Field.Store.YES, Field.Index.NOT_ANALYZED)
>
> So, would it be better to:
> * check oldDoc's field count against the desired field count of the document to be created, and
> * loop through the fields and compare values

Yes.  Of course you won't be able to compare unstored fields.
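
A minimal sketch of that field-by-field check, in plain Java with no Lucene dependency. It assumes the stored values of the old document and the values about to be indexed have each been collected into a map of field name to list of values. The class StoredFieldComparer and its methods are hypothetical names, not part of Lucene. With Lucene 3.x the old map could presumably be filled by walking oldDoc.getFields() and reading each field's name() and stringValue(). Comparing the sets of field names covers the field-count check and also catches a field that was swapped for another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: decide whether a document needs re-indexing by comparing stored field values. */
final class StoredFieldComparer {

    /** Adds one value under a field name; a field may hold several values. */
    static void add(Map<String, List<String>> fields, String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Returns true when field names or values differ between the two maps. */
    static boolean needsUpdate(Map<String, List<String>> oldFields,
                               Map<String, List<String>> newFields) {
        // Step 1: differing name sets mean a field was added or removed.
        if (!oldFields.keySet().equals(newFields.keySet())) {
            return true;
        }
        // Step 2: same names, so compare each field's values (order-sensitive).
        for (Map.Entry<String, List<String>> entry : oldFields.entrySet()) {
            if (!entry.getValue().equals(newFields.get(entry.getKey()))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical values; only stored fields can be read back from the index.
        Map<String, List<String>> oldFields = new TreeMap<>();
        add(oldFields, "modified", "2010-01-30");

        Map<String, List<String>> newFields = new TreeMap<>();
        add(newFields, "modified", "2010-01-31");

        System.out.println("needs update: " + needsUpdate(oldFields, newFields));
    }
}
```

Only stored fields show up in the old map, so a change to an unstored field goes unnoticed by this check.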

> Is there a better way to create Fields and/or Documents for this type of thing?

Hash or CRC as Shashi suggested, but you'll still need to compare old and
new hash values.
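
A rough sketch of the CRC variant, using java.util.zip.CRC32 from the JDK. It assumes the checksum of the old content was kept somewhere it can be read back, for example in an extra stored field. FieldChecksum, crcOf and isDirty are hypothetical names, and the map of field name to values is an assumption about how the content is held before the Document is built.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Sketch: CRC32 over field names and values, compared against a stored checksum. */
final class FieldChecksum {

    /** CRC32 over all field names and values, visited in sorted field-name order. */
    static long crcOf(Map<String, List<String>> fields) {
        CRC32 crc = new CRC32();
        // Copy into a TreeMap so the result does not depend on map iteration order.
        for (Map.Entry<String, List<String>> entry : new TreeMap<>(fields).entrySet()) {
            feed(crc, entry.getKey(), 1);
            for (String value : entry.getValue()) {
                feed(crc, value, 0);
            }
        }
        return crc.getValue();
    }

    /** Feeds one string plus a terminator byte, so "ab" + "c" differs from "a" + "bc". */
    private static void feed(CRC32 crc, String text, int terminator) {
        crc.update(text.getBytes(StandardCharsets.UTF_8));
        crc.update(terminator);
    }

    /** Compares the checksum kept with the old document against the new content. */
    static boolean isDirty(String storedChecksum, Map<String, List<String>> newFields) {
        return !Long.toString(crcOf(newFields)).equals(storedChecksum);
    }
}
```

A 32-bit CRC can collide, so two different documents may occasionally produce the same value. If a missed update matters, a longer hash is the safer choice.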


--
Ian.



Re: best way to compare Documents

Posted by Shashi Kant <sk...@sloan.mit.edu>.
or a CRC

On Sun, Jan 31, 2010 at 11:58 AM, Shashi Kant <sk...@sloan.mit.edu> wrote:
> If all you want is to flag a document "dirty" you could hash the
> fields in the document and check for an update.


Re: best way to compare Documents

Posted by Shashi Kant <sk...@sloan.mit.edu>.
If all you want is to flag a document "dirty" you could hash the
fields in the document and check for an update.
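
One way the hashing could look, sketched with java.security.MessageDigest from the JDK. SHA-256 is my choice here, not something named in this thread. DocumentFingerprint and sha256Of are hypothetical names, and the field-name-to-values map is an assumed stand-in for the content that goes into the Document. The resulting hex string could be kept in a stored field and compared with the fingerprint of the incoming content.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: a SHA-256 fingerprint of a document's field names and values. */
final class DocumentFingerprint {

    /** Returns a 64-character hex digest that changes when any name or value changes. */
    static String sha256Of(Map<String, List<String>> fields) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        // Sorted field order keeps the fingerprint independent of map iteration order.
        for (Map.Entry<String, List<String>> entry : new TreeMap<>(fields).entrySet()) {
            update(digest, entry.getKey());
            digest.update(intBytes(entry.getValue().size()));
            for (String value : entry.getValue()) {
                update(digest, value);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Feeds a length prefix and then the UTF-8 bytes, so item boundaries stay unambiguous. */
    private static void update(MessageDigest digest, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        digest.update(intBytes(bytes.length));
        digest.update(bytes);
    }

    /** Encodes an int as 4 big-endian bytes. */
    private static byte[] intBytes(int n) {
        return ByteBuffer.allocate(4).putInt(n).array();
    }
}
```

The document is dirty when the stored fingerprint and the freshly computed one are not equal as strings.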


