Posted to dev@lucene.apache.org by Roy <ro...@gmail.com> on 2004/07/20 02:55:40 UTC
Re: Binary stored fields (was Re: suggestions for a student project)
Drew,
I am very interested in the binary field feature. Do you have any
updates on that?
Thanks.
Roy
On Thu, 27 May 2004 21:26:04 -0400, Drew Farris <al...@prodigy.net> wrote:
> On Thu, 2004-05-27 at 17:34, Dmitry Serebrennikov wrote:
> > Drew Farris wrote:
> > >
> > >I sent out a patch for this soon after the original discussion...
> > >
> >
> > Drew, I'm sorry I missed this. Definitely didn't mean to ignore your
> > work! I just assumed that this is still undone. Oops....
>
> No worries, it might have helped for me to include [PATCH] in the
> subject line as the Jakarta guidelines suggest :) Thanks for the good
> words and taking the time to look at it now -- I will be glad if I can
> contribute something of value. Comments follow inline.
>
> > I just took a look at your patch. Looks great! Simple, and does the trick.
> > One comment:
> > - I understand why getBinaryValues(String fieldName) has to return
> > byte[][]. This is to deal with multiple fields with the same name,
> > right?
>
> Yep.
>
> > I think this is a relatively rare case, so perhaps it would be
> > good to have a method getBinaryValue(String fieldName) that just returns
> > the first byte[] for a given name. I think you can avoid allocating an
> > array and most applications would find it more convenient.
>
> Sounds great, I'll implement byte[] getBinaryValue(String fieldName)
> which returns the first binary field of the given name encountered in
> the results from getFields(fieldName);
>
> > As far as testing, I think unit tests are enough. It would be good to
> > have a few tests though. One test to have is with a large amount of data.
>
> Ok. I've updated DocHelper, TestFieldInfos, TestSegmentReader in
> addition to TestDocument which I had already implemented in the last
> patch. I will implement a few more as well including one with a ton of
> data.
>
> > What happens if there are two fields with the same name but one binary
> > and one not? Will there be an error? If so, when will it be given?
>
> The code in the patch doesn't address this very well at all, stuffing
> nulls into arrays, which requires the API user to do some extra
> checking. If binary and non-binary fields with the same name can
> co-exist in a document, I could either:
>
> 1) Do away with the call to getFields(fieldName) in getValues and
> getBinaryValue/s entirely and walk the fields member in these methods
> checking for name/binary-ness.
>
> 2) Change Field[] getFields(String fieldName) so it only returns
> non-binary fields and implement Field[] getBinaryFields(String
> fieldName) that only returns binary fields
>
> Does anyone have a preference? I'm on the fence because I'm not sure if
> it's ok to change getFields to not return all of the fields of a given
> name.
>
> > Is it possible to set conflicting flags on a field if one does not use
> > the new Field.Binary method?
>
> I don't think so. The only way to set the isBinary flag in Field is by
> using Field.Binary, or new Field(String name, byte[] value). Each of the
> flag member variables is private and there are no public set methods
> for any of them.
>
> > I'm +1 for including this (provided it works and all :), I have not
> > actually tested it myself).
>
> Excellent! I'll work on some more extensive unit tests to really
> exercise this addition and hopefully that plus your suggestions will
> whip it into shape.
>
> > Again, Drew, great work! Thanks for your contribution to Lucene.
> > Sorry to have missed your original message.
>
> No problem! Thanks for taking the time to review it and offering the
> great suggestions.
>
> Drew
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
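The first alternative Drew lists above (walking the document's fields and checking both the name and binary-ness) can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: the Field and Document classes below are simplified stand-ins written for this example, not the Lucene classes and not the code from the patch, and the factory names text() and binary() are hypothetical. It shows how getBinaryValue and getBinaryValues could skip text fields of the same name instead of returning nulls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Stand-alone model of the idea discussed in the thread; NOT Lucene code. */
public class BinaryFieldSketch {

    /** Hypothetical stored field: holds either a text value or a binary value. */
    static final class Field {
        private final String name;
        private final String stringValue; // null for binary fields
        private final byte[] binaryValue; // null for text fields

        private Field(String name, String stringValue, byte[] binaryValue) {
            this.name = name;
            this.stringValue = stringValue;
            this.binaryValue = binaryValue;
        }

        // The private constructor plus these factories means the binary
        // flag can never disagree with the value that was stored.
        static Field text(String name, String value) {
            return new Field(name, Objects.requireNonNull(value), null);
        }

        static Field binary(String name, byte[] value) {
            return new Field(name, null, Objects.requireNonNull(value));
        }

        String name() { return name; }
        boolean isBinary() { return binaryValue != null; }
        String stringValue() { return stringValue; }
        byte[] binaryValue() { return binaryValue; }
    }

    /** Hypothetical document: an ordered list of fields, names may repeat. */
    static final class Document {
        private final List<Field> fields = new ArrayList<>();

        void add(Field field) { fields.add(field); }

        /** First binary value stored under the name, or null if there is none. */
        byte[] getBinaryValue(String name) {
            for (Field f : fields) {
                if (f.name().equals(name) && f.isBinary()) {
                    return f.binaryValue();
                }
            }
            return null;
        }

        /** All binary values under the name, in insertion order; text fields are skipped. */
        byte[][] getBinaryValues(String name) {
            List<byte[]> result = new ArrayList<>();
            for (Field f : fields) {
                if (f.name().equals(name) && f.isBinary()) {
                    result.add(f.binaryValue());
                }
            }
            return result.toArray(new byte[0][]);
        }

        /** All text values under the name, in insertion order; binary fields are skipped. */
        String[] getValues(String name) {
            List<String> result = new ArrayList<>();
            for (Field f : fields) {
                if (f.name().equals(name) && !f.isBinary()) {
                    result.add(f.stringValue());
                }
            }
            return result.toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        Document doc = new Document();
        // A text field and two binary fields sharing one name.
        doc.add(Field.text("payload", "plain text"));
        doc.add(Field.binary("payload", new byte[] {1, 2, 3}));
        doc.add(Field.binary("payload", new byte[] {4}));

        System.out.println(doc.getBinaryValue("payload").length);  // 3
        System.out.println(doc.getBinaryValues("payload").length); // 2
        System.out.println(doc.getValues("payload").length);       // 1
    }
}
```

With this shape the caller never has to check array entries for null, at the cost of each accessor scanning the field list itself rather than reusing a shared getFields(fieldName) call.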
Re: Binary stored fields (was Re: suggestions for a student project)
Posted by Roy <ro...@gmail.com>.
Drew,
Thanks for the pointer! I will test it out sometime this week.
Roy
On Tue, 20 Jul 2004 10:10:10 -0400, Drew Farris <dr...@gmail.com> wrote:
> Hi Roy,
>
> The binary field feature has not been rolled into a release of Lucene,
> but there is a patch available as an attachment to the bugzilla entry:
> http://issues.apache.org/bugzilla/show_bug.cgi?id=29370
>
> The patch can be accessed directly here:
> http://issues.apache.org/bugzilla/showattachment.cgi?attach_id=11751
>
> You will need to check out the lucene source code from CVS, apply the
> patch and recompile. Although the patch was generated against a CVS
> snapshot prior to the final 1.4 release, I just tested it against a
> checkout using the lucene_1_4_final tag and it seems to be ok.
>
> I haven't tested this patch extensively, but the JUnit tests work. Let
> me know if you run into any problems.
>
> Drew
>
>
>
> On Mon, 19 Jul 2004 17:55:40 -0700, Roy <ro...@gmail.com> wrote:
> > Drew,
> >
> > I am very interested in the binary field feature. Do you have any
> > updates on that?
> >
> > Thanks.
> >
> > Roy
> >
>
Re: Binary stored fields (was Re: suggestions for a student project)
Posted by Drew Farris <dr...@gmail.com>.
Hi Roy,
The binary field feature has not been rolled into a release of Lucene,
but there is a patch available as an attachment to the bugzilla entry:
http://issues.apache.org/bugzilla/show_bug.cgi?id=29370
The patch can be accessed directly here:
http://issues.apache.org/bugzilla/showattachment.cgi?attach_id=11751
You will need to check out the lucene source code from CVS, apply the
patch and recompile. Although the patch was generated against a CVS
snapshot prior to the final 1.4 release, I just tested it against a
checkout using the lucene_1_4_final tag and it seems to be ok.
I haven't tested this patch extensively, but the JUnit tests work. Let
me know if you run into any problems.
Drew
On Mon, 19 Jul 2004 17:55:40 -0700, Roy <ro...@gmail.com> wrote:
> Drew,
>
> I am very interested in the binary field feature. Do you have any
> updates on that?
>
> Thanks.
>
> Roy
>