Posted to dev@lucene.apache.org by Bernhard Messer <Be...@intrafind.de> on 2004/10/01 17:14:51 UTC
Re: strange behaviour in CompoundFileReader fileModified and touchFile
Dmitry,
> Bernhard Messer wrote:
>
>> hi,
>>
>> The CompoundFileReader class contains some code where I can't follow
>> the idea behind it. Maybe somebody else can switch on the light for me,
>> so I can see the track. There are 2 public methods which definitely
>> don't work as expected. I know, extending Directory forces one to
>> implement the methods, but in that particular case, the
>> implementation is just confusing me and maybe other people too.
>>
>> public long fileModified(String name) throws IOException {
>>     return directory.fileModified(fileName);
>> }
>>
>> public void touchFile(String name) throws IOException {
>>     directory.touchFile(fileName);
>> }
>>
>> Looking at the implementation, both methods work on the
>> compound filename itself, regardless of the value of the filename
>> passed in. It would be much more understandable if these methods
>> threw an UnsupportedOperationException. The other way is to
>> change them so that the underlying directory method calls
>> get the real filename passed in and not the compound filename
>> itself.
>
>
> Well, the reason I did it this way is because I thought this would be
> the least amount of disruption to the programs out there that might be
> using these APIs. You can't really pass the "name" into the directory
> since it doesn't know about these as individual files. Directory only
> knows about the compound file.
I'm not sure this is correct. Looking at the implementation in
FSDirectory, for example, any file can be touched, whether or not it
is related to Lucene.
> To implement the fileModified() fully, you could just store timestamps
> in the file, but then they would just be the same as the timestamp on the
> overall file, unless there was also touchFile() support. To implement
> touch file, you'd have to open the file in random access and update
> the timestamp field of an individual file. This can certainly be done,
> but I didn't have a need for it. You could throw the Unsupported
> exception, but this could make callers have to change. Anyway, the
> compromise I chose was to treat a "touch" on one file as if a "touch"
> on all files for the segment. This works in most usages. The only time
> this would be a problem is if you implemented some kind of timestamp
> set/check that would depend on files in a segment having different
> timestamps. This might be important for updating segments, but since
> this is never done, I'm not sure this is really that useful. Do you
> have a case in mind where this is proving to be a limitation?
>
I agree with you. I don't see the need for a full implementation of
touch file and lastModified for the internally used compound file
parts or any other file. But the way it is implemented now, it does
something different from what it appears to do for the user of the
API. The idea I had in mind was to implement it so that the compound
file itself can be touched and its lastModified can be read. If the
user passes in a filename different from the compound file name,
either an UnsupportedOperationException or, even better, an
IOException could be thrown.
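
To make the idea concrete, here is a minimal sketch. It is not the Lucene
source: SimpleDirectory, MapDirectory and StrictCompoundView are made-up
names standing in for Directory, a real directory implementation and
CompoundFileReader, and the in-memory directory uses a counter instead of
real file timestamps so the result is predictable.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two Directory methods under discussion.
interface SimpleDirectory {
    long fileModified(String name) throws IOException;
    void touchFile(String name) throws IOException;
}

// Hypothetical in-memory directory; a counter replaces the wall clock.
class MapDirectory implements SimpleDirectory {
    private final Map<String, Long> modified = new HashMap<>();
    private long clock = 0;

    @Override
    public long fileModified(String name) throws IOException {
        Long stamp = modified.get(name);
        if (stamp == null) {
            throw new IOException("no such file: " + name);
        }
        return stamp;
    }

    @Override
    public void touchFile(String name) {
        modified.put(name, ++clock);
    }
}

// Strict variant: only the compound file name itself is accepted.
class StrictCompoundView implements SimpleDirectory {
    private final SimpleDirectory directory;
    private final String fileName; // name of the compound file

    StrictCompoundView(SimpleDirectory directory, String fileName) {
        this.directory = directory;
        this.fileName = fileName;
    }

    // Reject every name except the compound file name.
    private void requireCompoundName(String name) throws IOException {
        if (!fileName.equals(name)) {
            throw new IOException("not the compound file: " + name);
        }
    }

    @Override
    public long fileModified(String name) throws IOException {
        requireCompoundName(name);
        return directory.fileModified(fileName);
    }

    @Override
    public void touchFile(String name) throws IOException {
        requireCompoundName(name);
        directory.touchFile(fileName);
    }
}
```

With this variant, a call such as touchFile("_1.fdt") on a view over
"_1.cfs" fails loudly instead of silently touching the compound file (the
segment name "_1" is only an example).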
What do you think?
Bernhard
>>
>> just a thought ;-)
>>
>> bernhard
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>
>>
>
>
Re: strange behaviour in CompoundFileReader fileModified and touchFile
Posted by Dmitry Serebrennikov <dm...@earthlink.net>.
Bernhard Messer wrote:
> Dmitry,
>
>> Bernhard Messer wrote:
>>
>>> hi,
>>>
>>> The CompoundFileReader class contains some code where I can't follow
>>> the idea behind it. Maybe somebody else can switch on the light for me,
>>> so I can see the track. There are 2 public methods which definitely
>>> don't work as expected. I know, extending Directory forces one to
>>> implement the methods, but in that particular case, the
>>> implementation is just confusing me and maybe other people too.
>>>
>>> public long fileModified(String name) throws IOException {
>>>     return directory.fileModified(fileName);
>>> }
>>>
>>> public void touchFile(String name) throws IOException {
>>>     directory.touchFile(fileName);
>>> }
>>>
>>> Looking at the implementation, both methods work on the
>>> compound filename itself, regardless of the value of the filename
>>> passed in. It would be much more understandable if these
>>> methods threw an UnsupportedOperationException. The other way is
>>> to change them so that the underlying directory method
>>> calls get the real filename passed in and not the compound
>>> filename itself.
>>
>>
>>
>> Well, the reason I did it this way is because I thought this would be
>> the least amount of disruption to the programs out there that might
>> be using these APIs. You can't really pass the "name" into the
>> directory since it doesn't know about these as individual files.
>> Directory only knows about the compound file.
>
>
> I'm not sure this is correct. Looking at the implementation in
> FSDirectory, for example, any file can be touched, whether or not it
> is related to Lucene.
Yes, but the usual files that you find in an old-style segment, the
ones that the CompoundFileReader and the rest of Lucene know about, are
not present on the file system when compound files are used. So
FSDirectory only knows about the compound file, while everything above
the CompoundFileReader still thinks that there are multiple files in a
given segment.
>
>> To implement the fileModified() fully, you could just store
>> timestamps in the file, but then they would just be the same as the
>> timestamp on the overall file, unless there was also touchFile()
>> support. To implement touch file, you'd have to open the file in
>> random access and update the timestamp field of an individual file.
>> This can certainly be done, but I didn't have a need for it. You
>> could throw the Unsupported exception, but this could make callers
>> have to change. Anyway, the compromise I chose was to treat a "touch"
>> on one file as if a "touch" on all files for the segment. This works
>> in most usages. The only time this would be a problem is if you
>> implemented some kind of timestamp set/check that would depend on
>> files in a segment having different timestamps. This might be
>> important for updating segments, but since this is never done, I'm
>> not sure this is really that useful. Do you have a case in mind where
>> this is proving to be a limitation?
>>
> I agree with you. I don't see the need for a full implementation of
> touch file and lastModified for the internally used compound file
> parts or any other file. But the way it is implemented now, it does
> something different from what it appears to do for the user of the
> API. The idea I had in mind was to implement it so that the compound
> file itself can be touched and its lastModified can be read. If the
> user passes in a filename different from the compound file name,
> either an UnsupportedOperationException or, even better, an
> IOException could be thrown.
See above. The user knows about the old-style files only; it does not
know about the .cfs file. FSDirectory, on the other hand, knows only
about the .cfs file and not about the .f1, .f2, .fdt, and so on.
What the implementation is trying to do (unless I'm forgetting
something) is to accept the .f1, .f2, etc. names as input and change the
timestamp of the .cfs file regardless of which particular segment file
was requested. This makes it look as if your call had the expected
effect (in that calling fileModified with the same name will give back
the timestamp that was just set), but also as if *someone else* had
called touchFile on all the other files as well. I think this is not
unreasonable and provides the most compatible behavior for the upper
layers, short of a full implementation. Does this make sense?
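
As a rough illustration of that behavior (a sketch only, not the actual
Lucene source): TinyDirectory, CountingDirectory and LenientCompoundView
are hypothetical names, the segment name "_1" is just an example, and a
counter stands in for real file timestamps.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two Directory methods under discussion.
interface TinyDirectory {
    long fileModified(String name) throws IOException;
    void touchFile(String name) throws IOException;
}

// Hypothetical in-memory directory; it only ever sees the .cfs file.
class CountingDirectory implements TinyDirectory {
    private final Map<String, Long> modified = new HashMap<>();
    private long clock = 0;

    @Override
    public long fileModified(String name) throws IOException {
        Long stamp = modified.get(name);
        if (stamp == null) {
            throw new IOException("no such file: " + name);
        }
        return stamp;
    }

    @Override
    public void touchFile(String name) {
        modified.put(name, ++clock);
    }
}

// Lenient variant: any segment file name maps onto the compound file.
class LenientCompoundView implements TinyDirectory {
    private final TinyDirectory directory;
    private final String fileName; // name of the compound file

    LenientCompoundView(TinyDirectory directory, String fileName) {
        this.directory = directory;
        this.fileName = fileName;
    }

    @Override
    public long fileModified(String name) throws IOException {
        // 'name' is deliberately ignored: all parts share one timestamp.
        return directory.fileModified(fileName);
    }

    @Override
    public void touchFile(String name) throws IOException {
        // Touching one part touches the whole compound file.
        directory.touchFile(fileName);
    }
}
```

After touchFile("_1.f1") on such a view, fileModified("_1.f1") and
fileModified("_1.fdt") return the same new value, which is the "someone
else touched the other files too" effect described above.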
>
> What do you think?
>
> Bernhard
>
>>>
>>> just a thought ;-)
>>>
>>> bernhard
>>>
>>>
>>>