Posted to java-user@lucene.apache.org by Renaud Richardet <re...@wyona.com> on 2005/09/28 16:54:06 UTC

indexing documents from 1857

Hello,

From our understanding, Lucene uses the Unix epoch (Jan 1, 1970) and
has problems with dates earlier than that. For one of our projects, we
need to index dates before Jan 1, 1970, going back as far as 1857.

Is there any workaround for this?

Thanks,
Renaud

-- 
Renaud Richardet
COO America
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
office +1 857 776-3195                     mobile +1 617 230 9112
renaud.richardet@wyona.com                   http://www.wyona.com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: indexing documents from 1857

Posted by Paul Elschot <pa...@xs4all.nl>.
On Wednesday 28 September 2005 16:54, Renaud Richardet wrote:
> Hello,
> 
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
> 
> Is there any workaround this?

One way is to index the date as a string YYYYMMDD.
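
A minimal sketch of that suggestion, in case it helps. It uses java.time
from a current JDK rather than any Lucene API, and the DateKey class and
toDayKey method are names made up for illustration. The resulting string
would then be indexed as a single untokenized term.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Lucene: turns a calendar date into a
// fixed-width string that can be indexed as one term.
class DateKey {
    // BASIC_ISO_DATE renders a LocalDate as zero-padded yyyyMMdd.
    static String toDayKey(LocalDate date) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        // A date well before the Unix epoch is just another string key.
        System.out.println(toDayKey(LocalDate.of(1857, 9, 28))); // prints 18570928
    }
}
```

This assumes four-digit years; anything outside 0000-9999 would need a
different encoding.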

Regards,
Paul Elschot


Mirroring a remote index using only metadata

Posted by Murat Yakici <Mu...@cis.strath.ac.uk>.
Hi,

Let's assume that there is one remote index and one local index. I would 
like to create a mirror of the remote locally. I'm using a kind of 
protocol in between (which is not important) to only transfer each 
document ID, the unique terms in the document and the frequencies. Right 
now I'm not interested in the term position information.

What is the best way to create/update the local index given only the 
above metadata?

The simplest idea (though a bit silly) is to use the Document and Field
classes that the Lucene API exposes:

For each document
	For each unique term in the document
		Add the term to the field as many times as indicated by the
		frequency
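
The inner two loops above could be sketched roughly as follows. This is
only an illustration under some assumptions: TermExpansion and expand are
made-up names, terms are assumed to contain no whitespace, and the
resulting text is assumed to be indexed with an analyzer that splits on
whitespace and does not alter the terms any further. No Lucene classes are
used here; the returned string is what would be handed to a Field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
class TermExpansion {
    // Builds a whitespace-separated field value in which every term occurs
    // exactly as often as its transferred frequency. Term positions in the
    // result are arbitrary, since position information was not transferred.
    static String expand(Map<String, Integer> termFrequencies) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, Integer> entry : termFrequencies.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(entry.getKey());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> frequencies = new LinkedHashMap<>();
        frequencies.put("lucene", 2);
        frequencies.put("index", 1);
        System.out.println(expand(frequencies)); // prints lucene lucene index
    }
}
```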

Is there another way to achieve this, such as using a low-level class,
without creating a big mess in the Lucene design?

Cheers,
Murat


Re: indexing documents from 1857

Posted by Renaud Richardet <re...@wyona.com>.
Hello Luke, hello Paul

Thanks for your quick response!

Best,
Renaud


Luke Francl wrote:

>Index your dates as strings (yyyymmdd). 
>
>This works better anyway because range searches work over a wider range
>of dates than when you index the full precision.
>
>On Wed, 2005-09-28 at 09:54, Renaud Richardet wrote:
>
>>Hello,
>>
>>From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
>>there are conflicts with dates that pass this line. For one of our
>>projects, we will need to be able to move past Jan 1, 1970 date as far
>>as 1857.
>>
>>Is there any workaround this?
>>
>>Thanks,
>>Renaud


Re: indexing documents from 1857

Posted by Luke Francl <lu...@stellent.com>.
Index your dates as strings (yyyymmdd). 

This works better anyway because range searches work over a wider range
of dates than when you index the full precision.
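
One way to picture why (this is my reading, not something spelled out
above): a range search has to cover every distinct term between its
bounds, so the fewer distinct terms each day produces, the more days a
range can span. The sketch below counts distinct terms at day and at
second precision and does an inclusive range check with plain string
comparison. TermCount and its methods are hypothetical names, and
java.time merely stands in for whatever date handling you use.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; nothing here is Lucene API.
class TermCount {
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");
    static final DateTimeFormatter SECOND = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Number of distinct term strings the timestamps produce at a precision.
    static int distinctTerms(List<LocalDateTime> stamps, DateTimeFormatter precision) {
        Set<String> terms = new TreeSet<>();
        for (LocalDateTime stamp : stamps) {
            terms.add(stamp.format(precision));
        }
        return terms.size();
    }

    // Inclusive range check; valid because fixed-width, zero-padded keys
    // sort the same way as the dates they encode.
    static boolean inRange(String key, String lower, String upper) {
        return key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0;
    }
}
```

Three timestamps spread over two days give two terms at day precision and
three at second precision, and the gap grows with the amount of data.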

On Wed, 2005-09-28 at 09:54, Renaud Richardet wrote:
> Hello,
> 
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
> 
> Is there any workaround this?
> 
> Thanks,
> Renaud


Re: indexing documents from 1857

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 28, 2005, at 10:54 AM, Renaud Richardet wrote:

> Hello,
>
> From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
> there are conflicts with dates that pass this line. For one of our
> projects, we will need to be able to move past Jan 1, 1970 date as far
> as 1857.
>
> Is there any workaround this?

I'd be in big trouble if Lucene couldn't handle dates in the
1850's :)  My life revolves around 19th century literature at
http://www.nines.org where I'm building the "Google of 19th century
literature".  I hope to have a working system deployed in the next
few weeks.  I've got development prototypes with faceted browsing
(much like the CNET interface, as well as like the cool Flamenco
interface) along with full-text search.

To echo what others have said... index years as a String YYYY if only  
years are relevant, or YYYYMM if you need months, or YYYYMMDD if you  
need full dates.
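
A rough sketch of picking the resolution when building the key, assuming
java.time from a current JDK. DateResolution, Resolution and toKey are
names invented for this example and are not Lucene classes.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Lucene.
class DateResolution {
    // The three granularities mentioned above, each with its key pattern.
    enum Resolution {
        YEAR("yyyy"), MONTH("yyyyMM"), DAY("yyyyMMdd");

        final DateTimeFormatter formatter;

        Resolution(String pattern) {
            this.formatter = DateTimeFormatter.ofPattern(pattern);
        }
    }

    // Formats the date at the requested resolution; coarser keys are
    // prefixes of finer ones for the same date.
    static String toKey(LocalDate date, Resolution resolution) {
        return date.format(resolution.formatter);
    }
}
```

For example, toKey(LocalDate.of(1857, 3, 4), Resolution.MONTH) gives
"185703". Whichever resolution is chosen, the same one has to be used for
the query bounds.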

     Erik

