Posted to issues@lucene.apache.org by "Jason Gerlowski (Jira)" <ji...@apache.org> on 2021/03/30 12:32:00 UTC

[jira] [Comment Edited] (LUCENE-9893) Document or fix CodecUtil's codec requirements/limitations

    [ https://issues.apache.org/jira/browse/LUCENE-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311482#comment-17311482 ] 

Jason Gerlowski edited comment on LUCENE-9893 at 3/30/21, 12:31 PM:
--------------------------------------------------------------------

Saying "if you don't like it, don't use it" is all well and good, but users need information to help them make that decision.  How is a Java user expected to know that this utility isn't what they want from the naming and Javadocs currently in place?

I can appreciate the reluctance to add this complexity to CodecUtil.  Fair enough. But that's only one possible route here - is there a downside I'm missing to going the Javadoc route?


was (Author: gerlowskija):
Saying "if you don't like it, don't use it" is all well and good, but users need information to help them make that decision.  How is a Java user expected to know that this utility isn't what they want from the naming and Javadocs currently in place?

I can reluctance to adding this complexity to CodecUtil.  Fair enough. But Is there a downside I'm missing to going the Javadoc route?

> Document or fix CodecUtil's codec requirements/limitations
> ----------------------------------------------------------
>
>                 Key: LUCENE-9893
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9893
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.8
>            Reporter: Jason Gerlowski
>            Priority: Minor
>
> Lucene's {{CodecUtil}} has methods which do most of the heavy lifting for reading, writing, and validating the headers and footers used by _most_ Lucene codecs.
> But not all codecs make use of the standard header/footer format supported by CodecUtil.  SimpleTextCodec is one example: its avoidance of non-text data causes it to skip the standard footer format in favor of a custom text-based footer format.  {{CodecUtil.checkFooter}} (for example) called with a SimpleText-based {{IndexInput}} will produce a CorruptIndexException when it doesn't find the 'magic-number' it expects to lead off the footer:
> {code}
> org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=808464432 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/home/jenkins/workspace/Lucene-Solr-8.x-Linux/solr/....")))
> {code}
> This undocumented limitation makes it hard for consumers to use CodecUtil generically for checksum validation in their code.  If it's the consumer's responsibility to check the codec before calling, then CodecUtil should mention this responsibility in Javadocs.  Alternatively, if CodecUtil is meant to handle all codecs, then it needs some additional logic to handle some of the "oddball" codecs that don't use the standard footers.
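
To make the mismatch in the exception above concrete, here is a minimal, stdlib-only sketch of the kind of pre-check a consumer might do before handing a file to CodecUtil. Everything in it is an assumption for illustration: {{FooterMagicPeek}} and its methods are hypothetical and not part of Lucene, the constants are assumed to mirror Lucene 8.x's CodecUtil (the "expected footer" value in the exception agrees with the complement of 0x3fd76c17), the 16-byte footer layout (int magic, int algorithm id, long checksum, big-endian) is assumed, and the text-style trailer is only a stand-in for a text-based footer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class FooterMagicPeek {
    // Assumed to mirror Lucene 8.x CodecUtil's header magic and its bitwise complement.
    static final int CODEC_MAGIC = 0x3fd76c17;
    static final int FOOTER_MAGIC = ~CODEC_MAGIC; // -1071082520, the "expected footer" above
    // Assumed footer layout: int magic + int algorithm id + long checksum = 16 bytes.
    static final int FOOTER_LENGTH = 16;

    /** Reads the big-endian int located where a standard footer's magic would sit. */
    static int footerCandidate(byte[] fileBytes) {
        if (fileBytes.length < FOOTER_LENGTH) {
            throw new IllegalArgumentException("file is shorter than a standard footer");
        }
        // ByteBuffer defaults to big-endian; this is an absolute read at the given index.
        return ByteBuffer.wrap(fileBytes).getInt(fileBytes.length - FOOTER_LENGTH);
    }

    /** True if the bytes end with something that looks like the standard binary footer. */
    static boolean hasStandardFooterMagic(byte[] fileBytes) {
        return fileBytes.length >= FOOTER_LENGTH
            && footerCandidate(fileBytes) == FOOTER_MAGIC;
    }

    public static void main(String[] args) {
        // Binary-style file: 4 bytes of payload, then magic, algorithm id 0, dummy checksum.
        byte[] binary = ByteBuffer.allocate(4 + FOOTER_LENGTH)
            .putInt(42).putInt(FOOTER_MAGIC).putInt(0).putLong(123L).array();
        // Text-style file ending in a zero-padded decimal checksum (illustrative trailer).
        byte[] text = "checksum 00000000001234567890\n".getBytes(StandardCharsets.US_ASCII);

        System.out.println(hasStandardFooterMagic(binary)); // true
        System.out.println(hasStandardFooterMagic(text));   // false
        // Four ASCII '0' characters read as one int: 0x30303030 = 808464432,
        // the same value as the "actual footer" in the exception above.
        System.out.println(footerCandidate(text));           // 808464432
    }
}
```

A check like this only tells the caller whether the standard binary footer appears to be present; it does not validate the checksum itself, and a consumer would still need codec-specific handling for files that fail it.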


