Posted to dev@lucene.apache.org by Roy <ro...@gmail.com> on 2004/07/20 02:55:40 UTC

Re: Binary stored fields (was Re: suggestions for a student project)

Drew,

I am very interested in the binary field feature. Do you have any
updates on that?

Thanks.

Roy

On Thu, 27 May 2004 21:26:04 -0400, Drew Farris <al...@prodigy.net> wrote:
> On Thu, 2004-05-27 at 17:34, Dmitry Serebrennikov wrote:
> > Drew Farris wrote:
> > >
> > >I sent out a patch for this soon after the original discussion...
> > >
> >
> > Drew, I'm sorry I missed this. Definitely didn't mean to ignore your
> > work! I just assumed that this was still undone. Oops....
> 
> No worries, it might have helped for me to include [PATCH] in the
> subject line as the Jakarta guidelines suggest :) Thanks for the good
> words and taking the time to look at it now -- I will be glad if I can
> contribute something of value. Comments follow inline.
> 
> > I just took a look at your patch. Looks great! Simple, and does the trick.
> > One comment:
> > - I understand why getBinaryValues(String fieldName) has to return
> > byte[][]. This is to deal with multiple fields with the same name,
> > right?
> 
> Yep.
> 
> > I think this is a relatively rare case, so perhaps it would be
> > good to have a method getBinaryValue(String fieldName) that just returns
> > the first byte[] for a given name. I think you can avoid allocating an
> > array and most applications would find it more convenient.
> 
> Sounds great, I'll implement byte[] getBinaryValue(String fieldName)
> which returns the first binary field of the given name encountered in
> the results from getFields(fieldName);
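A minimal sketch of the "first binary value for a name" lookup described above. SketchField and SketchDocument are hypothetical stand-ins written for this illustration; they are not Lucene's Field and Document classes and are not taken from the patch. Returning null when nothing matches is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stored field; NOT Lucene's Field class.
final class SketchField {
    private final String name;
    private final String stringValue; // null for binary fields
    private final byte[] binaryValue; // null for text fields

    private SketchField(String name, String stringValue, byte[] binaryValue) {
        this.name = name;
        this.stringValue = stringValue;
        this.binaryValue = binaryValue;
    }

    static SketchField text(String name, String value) {
        return new SketchField(name, value, null);
    }

    static SketchField binary(String name, byte[] value) {
        return new SketchField(name, null, value);
    }

    String name() { return name; }
    boolean isBinary() { return binaryValue != null; }
    String stringValue() { return stringValue; }
    byte[] binaryValue() { return binaryValue; }
}

// Hypothetical stand-in for a document; NOT Lucene's Document class.
final class SketchDocument {
    private final List<SketchField> fields = new ArrayList<>();

    void add(SketchField field) { fields.add(field); }

    // Returns the first binary value stored under fieldName,
    // or null if no binary field has that name (an assumption of this sketch).
    byte[] getBinaryValue(String fieldName) {
        for (SketchField field : fields) {
            if (field.name().equals(fieldName) && field.isBinary()) {
                return field.binaryValue();
            }
        }
        return null;
    }
}
```

Because the loop stops at the first match, no intermediate array is allocated, which is the convenience being asked for.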
> 
> > As far as testing, I think unit tests are enough. It would be good to
> > have a few tests though. One test to have is one with a large amount of data.
> 
> Ok. I've updated DocHelper, TestFieldInfos, TestSegmentReader in
> addition to TestDocument which I had already implemented in the last
> patch. I will implement a few more as well including one with a ton of
> data.
> 
> > What happens if there are two fields with the same name but one binary
> > and one not? Will there be an error? If so, when will it be given?
> 
> The code in the patch doesn't address this very well at all: it stuffs
> nulls into the arrays, which requires the API user to do some extra
> checking. If binary and non-binary fields with the same name can co-exist
> in a document, I could either:
> 
> 1) Do away with the call to getFields(fieldName) in getValues and
> getBinaryValue/s entirely and walk the fields member in these methods
> checking for name/binary-ness.
> 
> 2) Change Field[] getFields(String fieldName) so it only returns
> non-binary fields and implement Field[] getBinaryFields(String
> fieldName) that only returns binary fields
> 
> Does anyone have a preference? I'm on the fence because I'm not sure if
> it's ok to change getFields to not return all of the fields of a given
> name.
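A rough sketch of option 1 above: each accessor walks the field list itself and checks both the name and the binary-ness, so neither result array ever contains nulls. MixedField and MixedDocument are hypothetical stand-ins for illustration only, not Lucene's classes and not the code in the patch. Returning empty arrays when nothing matches is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stored field; NOT Lucene's Field class.
final class MixedField {
    final String name;
    final String text;  // null when the field is binary
    final byte[] bytes; // null when the field is text

    MixedField(String name, String text) {
        this.name = name;
        this.text = text;
        this.bytes = null;
    }

    MixedField(String name, byte[] bytes) {
        this.name = name;
        this.text = null;
        this.bytes = bytes;
    }

    boolean isBinary() { return bytes != null; }
}

// Hypothetical stand-in for a document; NOT Lucene's Document class.
final class MixedDocument {
    private final List<MixedField> fields = new ArrayList<>();

    void add(MixedField field) { fields.add(field); }

    // Text values only, in insertion order; binary fields with the same name are skipped.
    String[] getValues(String fieldName) {
        List<String> out = new ArrayList<>();
        for (MixedField field : fields) {
            if (field.name.equals(fieldName) && !field.isBinary()) {
                out.add(field.text);
            }
        }
        return out.toArray(new String[0]);
    }

    // Binary values only, in insertion order; text fields with the same name are skipped.
    byte[][] getBinaryValues(String fieldName) {
        List<byte[]> out = new ArrayList<>();
        for (MixedField field : fields) {
            if (field.name.equals(fieldName) && field.isBinary()) {
                out.add(field.bytes);
            }
        }
        return out.toArray(new byte[0][]);
    }
}
```

Option 2 would move the same name and binary-ness check into getFields and a separate getBinaryFields, which changes what existing callers of getFields see; the sketch leaves that question open.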
> 
> > Is it possible to set conflicting flags on a field if one does not use
> > the new Field.Binary method?
> 
> I don't think so. The only way to set the isBinary flag in Field is by
> using Field.Binary, or new Field(String name, byte[] value). Each of the
> flag member variables is private and there are no public set methods
> for any of them.
> 
> > I'm +1 for including this (provided it works and all :), I have not
> > actually tested it myself).
> 
> Excellent! I'll work on some more extensive unit tests to really
> exercise this addition and hopefully that plus your suggestions will
> whip it into shape.
> 
> > Again, Drew, great work! Thanks for your contribution to Lucene.
> > Sorry to have missed your original message.
> 
> No problem! Thanks for taking the time to review it and for offering the
> great suggestions.
> 
> Drew
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
> 
>


Re: Binary stored fields (was Re: suggestions for a student project)

Posted by Roy <ro...@gmail.com>.
Drew,

Thanks for the pointer! I will test it out sometime this week.

Roy

On Tue, 20 Jul 2004 10:10:10 -0400, Drew Farris <dr...@gmail.com> wrote:
> Hi Roy,
> 
> The binary field feature has not been rolled into a release of Lucene,
> but there is a patch available as an attachment to the bugzilla entry:
> http://issues.apache.org/bugzilla/show_bug.cgi?id=29370
> 
> The patch can be accessed directly here:
> http://issues.apache.org/bugzilla/showattachment.cgi?attach_id=11751
> 
> You will need to check out the lucene source code from CVS, apply the
> patch and recompile. Although the patch was generated against a CVS
> snapshot prior to the final 1.4 release, I just tested it against a
> checkout using the lucene_1_4_final tag and it seems to be ok.
> 
> I haven't tested this patch extensively, but the JUnit tests work. Let
> me know if you run into any problems.
> 
> Drew
> 
> 
> 
> On Mon, 19 Jul 2004 17:55:40 -0700, Roy <ro...@gmail.com> wrote:
> > Drew,
> >
> > I am very interested in the binary field feature. Do you have any
> > updates on that?
> >
> > Thanks.
> >
> > Roy
> >


Re: Binary stored fields (was Re: suggestions for a student project)

Posted by Drew Farris <dr...@gmail.com>.
Hi Roy,

The binary field feature has not been rolled into a release of Lucene,
but there is a patch available as an attachment to the bugzilla entry:
http://issues.apache.org/bugzilla/show_bug.cgi?id=29370

The patch can be accessed directly here:
http://issues.apache.org/bugzilla/showattachment.cgi?attach_id=11751

You will need to check out the lucene source code from CVS, apply the
patch and recompile. Although the patch was generated against a CVS
snapshot prior to the final 1.4 release, I just tested it against a
checkout using the lucene_1_4_final tag and it seems to be ok.

I haven't tested this patch extensively, but the JUnit tests work. Let
me know if you run into any problems.

Drew

On Mon, 19 Jul 2004 17:55:40 -0700, Roy <ro...@gmail.com> wrote:
> Drew,
> 
> I am very interested in the binary field feature. Do you have any
> updates on that?
> 
> Thanks.
> 
> Roy
>