Posted to dev@lucene.apache.org by Frank Nestel <in...@doris-frank.de> on 2001/10/12 12:52:41 UTC

feature suggestion

Maybe I shouldn't, but the Token retrieval question seems to have
triggered great thoughts in your guru brains.

Here is another thing I could use. Having browsed through the
FieldWriter and FieldReader code, I do not see any fundamental problem
with it. To me it even seems simpler than the Token retrieval problem.

Lucene is really a kind of special-purpose database, and I realized
that some parts of my application would be even easier if I could store
additional information on documents that does not fit into the current
realm of Fields. Lucene would give me a thread-safe, efficient repository
for my data, especially since I use it anyway for the pure searching.

For a short moment I considered marshalling this extra info into XML or
at least text. But this means considerable overhead and one more
inconvenience for developers. It should be possible to have a different
kind of Field which holds a Serializable Object that is just stored,
untokenized and unindexed, purely for retrieval together with the other
document data. There may be other applications where even unstored
Fields of that kind would make sense, but tokenizing should be
impossible. And I do not want to dream here about what one could do if
indexing were possible.
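
As a rough sketch of what such a Field would have to carry: the
Serializable Object becomes an opaque byte[] when the document is
written and is turned back into an Object on retrieval. The class and
method names below are hypothetical and nothing here is Lucene API; it
only uses standard Java serialization, and assumes the stored class is
still on the classpath when the document is read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Lucene: converts between a Serializable
// and the opaque payload a stored-only "object field" would have to hold.
final class ObjectPayload {
    private ObjectPayload() {}

    // Serializes the value into a byte[] using standard Java serialization.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) before the copy is taken.
        return buffer.toByteArray();
    }

    // Restores the object; the caller casts the result to the expected type.
    static Object fromBytes(byte[] payload) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("stored class is not on the classpath", e);
        }
    }
}
```

The payload is never inspected by the index, which is why tokenizing
such a Field makes no sense.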

Of course I understand that this will not appear soon.

But if it does appear, I will throw away nearly half of my
document-storing code, since I could rely fully on the Lucene engine.

Thanks for your patience,

Regards,
Frank

-- 
------------------------------------------ooO---"---Ooo-------------------
info@doris-frank.de,                "I hate this game, lets play it again"
http://doris-frank.de,            http://duf.spieleck.de/mailman/listinfo
Dr. Frank  Sven  Nestel,      http://spieleck.de, http://frank.spieleck.de
Spiele von Doris und Frank, Wolfsstaudenring 32, D-91056 Erlangen, GERMANY

Re: feature suggestion

Posted by Dmitry Serebrennikov <dm...@earthlink.net>.

Frank Nestel wrote:

>Yes, I would prefer to have native Object support. It's
>not that I do not know how to work around that; it is
>just that I thought this would be a nice and lean extension of
>the Lucene API without losing the primary focus of Lucene.
>
>To make this really lean in terms of generated index size
>I guess it would be better to have those serialized Objects
>stored as byte[] rather than String?
>
Not only that, but I think they have to be byte[]s because of the
separation between Java character streams (which are altered by
whatever character encoding is in effect) and data streams, which are
pure bytes.
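
A small sketch of that separation, using only the JDK: pushing raw bytes
through a String decodes and re-encodes them, and anything that is not
valid text in the chosen encoding gets replaced along the way. The class
name is hypothetical, and UTF-8 is just an assumed example encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demonstration class, not part of Lucene.
final class CharsetRoundTrip {
    private CharsetRoundTrip() {}

    // Decodes raw bytes as text and encodes them again, which is what a
    // String-valued field would effectively force on binary data.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // True only if the bytes come back unchanged from the String detour.
    static boolean survives(byte[] raw) {
        return Arrays.equals(raw, throughString(raw));
    }
}
```

Plain ASCII bytes survive the detour, but the first bytes of a Java
serialization stream (0xAC 0xED) are not valid UTF-8 and come back
changed, so the object could no longer be deserialized.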


Re: feature suggestion

Posted by Frank Nestel <in...@doris-frank.de>.

Yes, I would prefer to have native Object support. It's
not that I do not know how to work around that; it is
just that I thought this would be a nice and lean extension of
the Lucene API without losing the primary focus of Lucene.

To make this really lean in terms of generated index size
I guess it would be better to have those serialized Objects
stored as byte[] rather than String?

Dmitry Serebrennikov wrote:
> 
> Frank Nestel wrote:
> 
> >[...]
> >
> >
> >For a short moment I considered marshalling this extra info into XML or
> >at least text. But this means considerable overhead and one more
> >inconvenience for developers. It should be possible to have a different
> >kind of Field which holds a Serializable Object that is just stored,
> >untokenized and unindexed, purely for retrieval together with the other
> >document data. There may be other applications where even unstored
> >Fields of that kind would make sense, but tokenizing should be
> >impossible. And I do not want to dream here about what one could do if
> >indexing were possible.
> >
> What's preventing you from doing this now? I think you could declare a
> stored / untokenized / unindexed field. Maybe the problem is that it has
> to contain a string, whereas serialized objects should really be stored
> as byte[]. Is that the deal?


Re: feature suggestion

Posted by Dmitry Serebrennikov <dm...@earthlink.net>.

Frank Nestel wrote:

>[...]
>
>
>For a short moment I considered marshalling this extra info into XML or
>at least text. But this means considerable overhead and one more
>inconvenience for developers. It should be possible to have a different
>kind of Field which holds a Serializable Object that is just stored,
>untokenized and unindexed, purely for retrieval together with the other
>document data. There may be other applications where even unstored
>Fields of that kind would make sense, but tokenizing should be
>impossible. And I do not want to dream here about what one could do if
>indexing were possible.
>
What's preventing you from doing this now? I think you could declare a 
stored / untokenized / unindexed field. Maybe the problem is that it has 
to contain a string, whereas serialized objects should really be stored 
as byte[]. Is that the deal?
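
If the field really must contain a string, one possible stopgap is to
wrap the bytes in a text-safe encoding before putting them into a
stored / untokenized / unindexed field (for example one created with
Lucene 1.x's Field.UnIndexed(name, value), if that is the API in use).
The sketch below is only an assumption about how that could look: the
class name is hypothetical, Base64 is my choice rather than anything
proposed in this thread, and java.util.Base64 needs Java 8 or later.

```java
import java.util.Base64;

// Hypothetical helper, not part of Lucene: makes arbitrary bytes safe to
// keep in a String-valued, stored-only field.
final class TextSafePayload {
    private TextSafePayload() {}

    // Encodes the bytes with the basic Base64 alphabet (ASCII only).
    static String encode(byte[] payload) {
        return Base64.getEncoder().encodeToString(payload);
    }

    // Recovers the original bytes from the stored field value.
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }
}
```

The price is size: Base64 emits four characters for every three bytes,
so the stored value grows by about a third, which is one more argument
for a native byte[] field.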