Posted to java-user@lucene.apache.org by Renaud Richardet <re...@wyona.com> on 2005/09/28 16:54:06 UTC
indexing documents from 1857
Hello,
From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
there are conflicts with dates that fall before it. For one of our
projects, we will need to index dates earlier than Jan 1, 1970, as far
back as 1857.
Is there any workaround for this?
Thanks,
Renaud
--
Renaud Richardet
COO America
Wyona Inc. - Open Source Content Management - Apache Lenya
office +1 857 776-3195 mobile +1 617 230 9112
renaud.richardet@wyona.com http://www.wyona.com
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: indexing documents from 1857
Posted by Paul Elschot <pa...@xs4all.nl>.
On Wednesday 28 September 2005 16:54, Renaud Richardet wrote:
> Hello,
>
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
>
> Is there any workaround this?
One way is to index the date as a string YYYYMMDD.
Regards,
Paul Elschot
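A minimal sketch of the string approach suggested above, assuming a modern JDK with java.time (not available when this thread was written). The class and method names are hypothetical, and the resulting key would still have to be added to the index as an untokenized field value by whatever Lucene version is in use. Because the key is fixed-width for four-digit years, plain string order matches chronological order, which is what makes dates before 1970 unproblematic.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateKeys {
    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd (four-digit years only).
    private static final DateTimeFormatter YYYYMMDD = DateTimeFormatter.BASIC_ISO_DATE;

    /** Turns a calendar date into a fixed-width, sortable string key. */
    public static String toKey(LocalDate date) {
        return date.format(YYYYMMDD);
    }

    public static void main(String[] args) {
        String old = toKey(LocalDate.of(1857, 3, 15));
        String recent = toKey(LocalDate.of(2005, 9, 28));
        System.out.println(old);                        // 18570315
        System.out.println(recent);                     // 20050928
        System.out.println(old.compareTo(recent) < 0);  // true
    }
}
```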
Mirroring a remote index using only metadata
Posted by Murat Yakici <Mu...@cis.strath.ac.uk>.
Hi,
Let's assume that there is one remote index and one local index. I would
like to create a mirror of the remote locally. I'm using a kind of
protocol in between (which is not important) to only transfer each
document ID, the unique terms in the document and the frequencies. Right
now I'm not interested in the term position information.
What is the best way to create/update the local index given only the
above metadata?
The simplest idea (though a bit silly) is to use the Document and Field
classes that the Lucene API exposes:
  For each document
    For each unique term in the document
      Add the term to the field as many times as indicated by the
      frequency
Is there another way to achieve this without making a big mess of the
Lucene design (such as using a low-level class)?
Cheers,
Murat
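The "simplest idea" above can be sketched without touching Lucene at all: rebuild a synthetic field text in which each term appears as often as its transferred frequency. The class name and the sample terms are hypothetical. The sketch assumes the terms contain no whitespace and that the text would later be indexed with an analyzer that only splits on whitespace (Lucene ships a WhitespaceAnalyzer), so the terms are not altered again; term positions in the rebuilt text are artificial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TermExpander {
    /** Repeats each term as many times as its frequency, separated by spaces. */
    public static String expand(Map<String, Integer> termFrequencies) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, Integer> entry : termFrequencies.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(entry.getKey());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical metadata for one remote document: term -> frequency.
        Map<String, Integer> freqs = new LinkedHashMap<>();
        freqs.put("lucene", 2);
        freqs.put("index", 1);
        System.out.println(expand(freqs)); // lucene lucene index
    }
}
```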
Re: indexing documents from 1857
Posted by Renaud Richardet <re...@wyona.com>.
Hello Luke, hello Paul
Thanks for your quick response!
Best,
Renaud
Luke Francl wrote:
>Index your dates as strings (yyyymmdd).
>
>This works better anyway because range searches work over a wider range
>of dates than when you index the full precision.
>
>On Wed, 2005-09-28 at 09:54, Renaud Richardet wrote:
>
>
>>Hello,
>>
>>From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
>>there are conflicts with dates that pass this line. For one of our
>>projects, we will need to be able to move past Jan 1, 1970 date as far
>>as 1857.
>>
>>Is there any workaround this?
>>
>>Thanks,
>>Renaud
Re: indexing documents from 1857
Posted by Luke Francl <lu...@stellent.com>.
Index your dates as strings (yyyymmdd).
This works better anyway because range searches work over a wider range
of dates than when you index the full precision.
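A likely reason for this, stated here as an assumption rather than something the thread spells out: a range search has to enumerate every distinct indexed term inside the range, so the fewer distinct terms there are, the wider the range that stays manageable. The sketch below (hypothetical class name, modern java.time API) shows how timestamps indexed at full precision would each be a separate term, while day-level keys collapse them.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DistinctDayTerms {
    /** Collapses timestamps to yyyyMMdd keys and returns the distinct keys, sorted. */
    public static Set<String> dayTerms(List<LocalDateTime> timestamps) {
        Set<String> terms = new TreeSet<>();
        for (LocalDateTime ts : timestamps) {
            terms.add(ts.toLocalDate().format(DateTimeFormatter.BASIC_ISO_DATE));
        }
        return terms;
    }

    public static void main(String[] args) {
        List<LocalDateTime> timestamps = Arrays.asList(
                LocalDateTime.of(1857, 3, 15, 9, 0),
                LocalDateTime.of(1857, 3, 15, 17, 30),
                LocalDateTime.of(1857, 3, 16, 8, 45));
        // Three distinct timestamps, but only two distinct day-level terms.
        System.out.println(dayTerms(timestamps)); // [18570315, 18570316]
    }
}
```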
On Wed, 2005-09-28 at 09:54, Renaud Richardet wrote:
> Hello,
>
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
>
> Is there any workaround this?
>
> Thanks,
> Renaud
Re: indexing documents from 1857
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 28, 2005, at 10:54 AM, Renaud Richardet wrote:
> Hello,
>
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
>
> Is there any workaround this?
I'd be in big trouble if Lucene couldn't handle dates in the
1850's :) My life revolves around 19th century literature at
http://www.nines.org where I'm building the "Google of 19th century
literature". I hope to have a working system deployed in the next
few weeks. I've got development prototypes with faceted browsing
(much like the CNET interface, as well as like the cool Flamenco
interface) along with full-text search.
To echo what others have said... index years as a String YYYY if only
years are relevant, or YYYYMM if you need months, or YYYYMMDD if you
need full dates.
Erik
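The three granularities mentioned above can be sketched as follows, again with hypothetical names and the modern java.time API as an assumption. The inRange helper only illustrates that an inclusive range over such keys reduces to plain string comparison, provided both bounds use the same resolution as the key; it is not a Lucene call.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateResolution {
    private static final DateTimeFormatter YEAR = DateTimeFormatter.ofPattern("yyyy");
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyyMM");
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    public static String year(LocalDate d)  { return d.format(YEAR); }
    public static String month(LocalDate d) { return d.format(MONTH); }
    public static String day(LocalDate d)   { return d.format(DAY); }

    /** Inclusive range test; all three strings must share one resolution. */
    public static boolean inRange(String key, String lower, String upper) {
        return key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(1857, 3, 15);
        System.out.println(year(d));   // 1857
        System.out.println(month(d));  // 185703
        System.out.println(day(d));    // 18570315
        // Is the year key within the 1850s?
        System.out.println(inRange(year(d), "1850", "1859")); // true
    }
}
```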