Posted to dev@lucene.apache.org by Mark Miller <ma...@gmail.com> on 2009/07/02 15:40:04 UTC

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Hudson runs all the tests and emails java-dev if any of them fail.

On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org> wrote:

>
>    [
> https://issues.apache.org/jira/browse/LUCENE-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726479#action_12726479]
>
> Robert Muir commented on LUCENE-1707:
> -------------------------------------
>
> bq. Why doesn't Hudson encounter this problem?
>
> Forgive my ignorance, does hudson also run tests or just verify build?
> These files are only used in tests!
>
> I agree we should correct it, and perhaps to prevent other problems these
> files should be converted to UTF-8.
>
> For the record I am still confused about these java-code analyzers that
> implement snowball algorithms, why do they exist when the same functionality
> is in contrib/snowball?
>
>
> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
> > -----------------------------------------------------------------
> >
> >                 Key: LUCENE-1707
> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1707
> >             Project: Lucene - Java
> >          Issue Type: Improvement
> >          Components: Index
> >            Reporter: Shai Erera
> >             Fix For: 2.9
> >
> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
> >
> >
> > A spin off from here:
> > http://www.nabble.com/Excessive-use-of-ensureOpen()-td24127806.html.
> > We should stop calling this method when it's not necessary for any
> > internal Lucene code. Currently, this code seems to hurt properly written
> > apps, unnecessarily.
> > Will post a patch soon
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


-- 
-- 
- Mark

http://www.lucidimagination.com

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Mon, Jul 6, 2009 at 11:40 AM, Uwe Schindler<uw...@thetaphi.de> wrote:

> Wonderful, and the tests (TestRussianStems) pass?

Yup!

Mike



RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Uwe Schindler <uw...@thetaphi.de>.
Wonderful, and the tests (TestRussianStems) pass?

Thanks,
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: Monday, July 06, 2009 5:37 PM
> To: java-dev@lucene.apache.org
> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> excessively in IndexReader and IndexWriter
> 
> contrib/analyzers/src/test/org/apache/lucene/analysis/ru/stemsUTF8.txt
> looks right on OpenSolaris (unix EOLs).
> 
> Mike
> 


Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Michael McCandless <lu...@mikemccandless.com>.
contrib/analyzers/src/test/org/apache/lucene/analysis/ru/stemsUTF8.txt
looks right on OpenSolaris (unix EOLs).
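
A rough sketch of how such an EOL check could be scripted, offered only as an illustration and not as anything used in this thread. The class and method names (EolCheckSketch, eolStyle) are hypothetical, and the byte scan assumes a UTF-8 or other ASCII-compatible single-byte file; it would not be valid for UTF-16.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EolCheckSketch {

    // Hypothetical helper: classify the line endings found in raw file bytes.
    // Scanning bytes is only safe where 0x0A and 0x0D never appear inside
    // another character (UTF-8, ASCII-compatible single-byte charsets).
    static String eolStyle(byte[] data) {
        boolean sawLf = false;
        boolean sawCrLf = false;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                if (i > 0 && data[i - 1] == '\r') {
                    sawCrLf = true;   // LF preceded by CR: Windows style
                } else {
                    sawLf = true;     // bare LF: Unix style
                }
            }
        }
        if (sawLf && sawCrLf) return "mixed";
        if (sawCrLf) return "CRLF";
        if (sawLf) return "LF";
        return "none";
    }

    public static void main(String[] args) throws IOException {
        // Pass a file path as the first argument; otherwise a small
        // built-in sample is checked.
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "one\ntwo\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(eolStyle(data));
    }
}
```

With svn:eol-style native, a checkout would be expected to report LF on Unix and CRLF on Windows for the same file.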

Mike

On Mon, Jul 6, 2009 at 9:53 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> I fixed the encoding problem by converting the test files to UTF-8 and
> changing the Reader charset parameter to UTF-8. All files now have
> svn:eol-style native again, as before. Could somebody check that on Unix the
> files have only LF (on Windows they have CRLF, which is how I committed them)?
>
> I did not touch the strange/incorrect charset conversion itself, but I
> strongly agree that we should remove it (keeping UnicodeRussian as the only
> charset parameter the analyzer accepts) or remove the analyzer altogether.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>


RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Uwe Schindler <uw...@thetaphi.de>.
I fixed the encoding problem by converting the test files to UTF-8 and
changing the Reader charset parameter to UTF-8. All files now have
svn:eol-style native again, as before. Could somebody check that on Unix the
files have only LF (on Windows they have CRLF, which is how I committed them)?

I did not touch the strange/incorrect charset conversion itself, but I
strongly agree that we should remove it (keeping UnicodeRussian as the only
charset parameter the analyzer accepts) or remove the analyzer altogether.
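
A minimal sketch of the Reader-side decoding discussed in this thread; it is not the actual test code. It assumes the general approach of passing the charset to an InputStreamReader, and the names ReaderCharsetSketch and readFirstLine are made up for illustration. It also shows why decoding KOI8-R bytes as ISO-8859-1 produces chars that are not the intended Cyrillic text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetSketch {

    // Hypothetical helper: decode the first line of some bytes, letting the
    // Reader do the charset conversion.
    static String readFirstLine(byte[] data, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // A Russian word written with Unicode escapes, so this source file
        // itself does not depend on any particular encoding.
        String word = "\u0440\u0443\u0441\u0441\u043a\u0438\u0439";
        Charset koi8 = Charset.forName("KOI8-R");

        byte[] asUtf8 = word.getBytes(StandardCharsets.UTF_8);
        byte[] asKoi8 = word.getBytes(koi8);

        // Matching charset on the Reader: the original string comes back.
        System.out.println(word.equals(readFirstLine(asUtf8, StandardCharsets.UTF_8)));
        System.out.println(word.equals(readFirstLine(asKoi8, koi8)));

        // KOI8-R bytes decoded as ISO-8859-1: each byte becomes one char in
        // the U+0080..U+00FF range, so the result is not the Cyrillic word.
        System.out.println(word.equals(readFirstLine(asKoi8, StandardCharsets.ISO_8859_1)));
    }
}
```

Once the Reader has decoded the input with the right charset, the analyzer only ever sees ordinary Java chars and needs no charset handling of its own.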

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Monday, July 06, 2009 3:26 PM
> To: java-dev@lucene.apache.org
> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> excessively in IndexReader and IndexWriter
> 
> uwe I completely agree.
> 
> to add the icing on the cake the entire analyzer appears to be just a
> duplication of the contrib/snowball Russian functionality...!
> 
> On Mon, Jul 6, 2009 at 9:19 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> > The whole Russian analyzer is very strange and works against all
> > charset/Unicode conventions. It defines its own "charsets" (the only
> > valid one is UNICODE), which are all applied to standard Java 16-bit
> > chars. The test shows how this works: it opens a text file in KOI8 using
> > the "ISO-8859-1" charset, just so that the codepoints are not modified
> > when converting to 16-bit Java chars (in principle it does a deprecated
> > "new String(byte[],0)"). These completely wrong Java chars are then
> > handled by the analyzer's internal charset conversion (working on the
> > 16-bit chars).
> >
> > The only correct usage of this package is:
> > - open the file with the correct encoding (when instantiating the Reader,
> > so specify KOI8 or windows1251 to the Reader). The string then consists of
> > correctly UTF-16 encoded Java chars. On this string the "pseudo-charset"
> > UNICODE of this analyzer can be used.
> >
> > In my opinion, this invalid usage of Java chars should be deprecated, the
> > only correct pseudo-charset should be the one specified by UNICODE, and
> > all charset conversions should be done using the Reader.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> >> Sent: Monday, July 06, 2009 3:08 PM
> >> To: java-dev@lucene.apache.org
> >> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> >> excessively in IndexReader and IndexWriter
> >>
> >> Uwe, I think so too. This way it will not be prone to breakage again.
> >>
> >> On Mon, Jul 6, 2009 at 8:38 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> >> > In my opinion, these files should be converted to UTF-8 and committed
> >> > again (and the Reader in the test reconfigured for UTF-8). Then they
> >> > can use the native EOL style again. The problem is that SVN can only
> >> > handle the EOL style for one-byte-per-char and UTF-8 files.
> >> >
> >> > I'll give it a try here (I have a converter).
> >> >
> >> > -----
> >> > Uwe Schindler
> >> > H.-H.-Meier-Allee 63, D-28213 Bremen
> >> > http://www.thetaphi.de
> >> > eMail: uwe@thetaphi.de
> >> >
> >> >> -----Original Message-----
> >> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> >> >> Sent: Monday, July 06, 2009 1:11 PM
> >> >> To: java-dev@lucene.apache.org
> >> >> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> >> >> excessively in IndexReader and IndexWriter
> >> >>
> >> >> yeah, it's fixed now.
> >> >>
> >> >> On Mon, Jul 6, 2009 at 7:06 AM, Michael
> >> >> McCandless<lu...@mikemccandless.com> wrote:
> >> >> > Is this the native vs LF svn:eol-style that Uwe already fixed?
> >> >> >
> >> >> > Mike
> >> >> >
> >> >> > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
> >> >> >> Can somebody try to revert the change and test it on Windows?
> >> >> >>
> >> >> >> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
> >> >> >>>
> >> >> >>> well then I have no idea why it doesn't fail. Except that perhaps it's
> >> >> >>> EOL-related (as Shai said), and that the failure is somehow
> >> >> >>> platform-dependent due to newline differences between windows and unix
> >> >> >>> (and the way these are encoded in UTF-16/stored in SVN)?
> >> >> >>>
> >> >> >>> I don't really do any work with files in UTF-16 so this is just a theory.
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> --
> >> >> >>> Robert Muir
> >> >> >>> rcmuir@gmail.com
> >> >> >>>
> >> >> >>> ----------------------------------------------------------------
> ---
> >> --
> >> >> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> >> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >
> >> >> > ------------------------------------------------------------------
> ---
> >> >> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> >> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Robert Muir
> >> >> rcmuir@gmail.com
> >> >>
> >> >> --------------------------------------------------------------------
> -
> >> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Robert Muir
> >> rcmuir@gmail.com
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> 
> 
> 
> --
> Robert Muir
> rcmuir@gmail.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Robert Muir <rc...@gmail.com>.
Uwe, I completely agree.

To add the icing on the cake, the entire analyzer appears to be just a
duplication of the contrib/snowball Russian functionality!

On Mon, Jul 6, 2009 at 9:19 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> The whole Russian analyzer is very strange and works against all
> charset/Unicode conventions. It defines its own "charsets" (the only valid one
> is UNICODE), which are all applied to standard Java 16-bit chars. The test
> shows how this works: it opens a text file in KOI8 using the "ISO-8859-1"
> charset, just so the code points are not modified when converting to 16-bit
> Java chars (in principle it does a deprecated "new String(byte[],0)"). These
> completely wrong Java chars are then handled by the analyzer's internal
> charset conversion (working on the 16-bit chars).
>
> The only correct usage of this package is:
> - open the file with the correct encoding (when instantiating the Reader, so
> specify KOI8 or windows-1251 to the Reader). The string then consists of
> correctly UTF-16 encoded Java chars. On this string the "pseudo-charset"
> UNICODE of this analyzer can be used.
>
> In my opinion, this invalid usage of Java chars should be deprecated, the
> only correct pseudo-charset should be the one specified by UNICODE, and all
> charset conversions should be done using the Reader.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Robert Muir [mailto:rcmuir@gmail.com]
>> Sent: Monday, July 06, 2009 3:08 PM
>> To: java-dev@lucene.apache.org
>> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
>> excessively in IndexReader and IndexWriter
>>
>> Uwe, I think so too. This way it will not be prone to breakage again.
>>
>> On Mon, Jul 6, 2009 at 8:38 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
>> > In my opinion, these files should be converted to UTF-8 and committed again
>> > (and the Reader in the test reconfigured for UTF-8). Then they can use the
>> > native EOL style again. The problem is that SVN can only handle the EOL
>> > style for one-byte-per-char and UTF-8 files.
>> >
>> > I give it a try here (and I have a converter).
>> >
>> > -----
>> > Uwe Schindler
>> > H.-H.-Meier-Allee 63, D-28213 Bremen
>> > http://www.thetaphi.de
>> > eMail: uwe@thetaphi.de
>> >
>> >> -----Original Message-----
>> >> From: Robert Muir [mailto:rcmuir@gmail.com]
>> >> Sent: Monday, July 06, 2009 1:11 PM
>> >> To: java-dev@lucene.apache.org
>> >> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
>> >> excessively in IndexReader and IndexWriter
>> >>
>> >> yeah, its fixed now.
>> >>
>> >> On Mon, Jul 6, 2009 at 7:06 AM, Michael
>> >> McCandless<lu...@mikemccandless.com> wrote:
>> >> > Is this the native vs LF svn:eol-style that Uwe already fixed?
>> >> >
>> >> > Mike
>> >> >
>> >> > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
>> >> >> Can somebody try to revert the change and test it on Windows?
>> >> >>
>> >> >> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com>
>> wrote:
>> >> >>>
>> >> >>> well then I have no idea why it doesn't fail. Except that perhaps
>> its
>> >> >>> EOL-related (as Shai said), and that the failure is somehow
>> >> >>> platform-dependent due to newline differences between windows and
>> unix
>> >> >>> (and the way these are encoded in UTF-16/stored in SVN)?
>> >> >>>
>> >> >>> I don't do really any work with files in UTF-16 so this is just a
>> >> theory.
>> >> >>>
>> >> >>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com>
>> >> wrote:
>> >> >>> > Hudson runs all the tests and emails java-dev if any of them
>> fail.
>> >> >>> >
>> >> >>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA)
>> <ji...@apache.org>
>> >> >>> > wrote:
>> >> >>> >>
>> >> >>> >>    [
>> >> >>> >>
>> >> >>> >> https://issues.apache.org/jira/browse/LUCENE-
>> >> 1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-
>> >> tabpanel&focusedCommentId=12726479#action_12726479
>> >> >>> >> ]
>> >> >>> >>
>> >> >>> >> Robert Muir commented on LUCENE-1707:
>> >> >>> >> -------------------------------------
>> >> >>> >>
>> >> >>> >> bq. Why doesn't Hudson encounter this problem?
>> >> >>> >>
>> >> >>> >> Forgive my ignorance, does hudson also run tests or just verify
>> >> build?
>> >> >>> >> These files are only used in tests!
>> >> >>> >>
>> >> >>> >> I agree we should correct it, and perhaps to prevent other
>> problems
>> >> >>> >> these
>> >> >>> >> files should be converted to UTF-8.
>> >> >>> >>
>> >> >>> >> For the record I am still confused about these java-code
>> analyzers
>> >> that
>> >> >>> >> implement snowball algorithms, why do they exist when the same
>> >> >>> >> functionality
>> >> >>> >> is in contrib/snowball?
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> > Don't use ensureOpen() excessively in IndexReader and
>> IndexWriter
>> >> >>> >> > --------------------------------------------------------------
>> ---
>> >> >>> >> >
>> >> >>> >> >                 Key: LUCENE-1707
>> >> >>> >> >                 URL:
>> >> >>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
>> >> >>> >> >             Project: Lucene - Java
>> >> >>> >> >          Issue Type: Improvement
>> >> >>> >> >          Components: Index
>> >> >>> >> >            Reporter: Shai Erera
>> >> >>> >> >             Fix For: 2.9
>> >> >>> >> >
>> >> >>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> > A spin off from here:
>> >> >>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-
>> >> td24127806.html.
>> >> >>> >> > We should stop calling this method when it's not necessary for
>> >> any
>> >> >>> >> > internal Lucene code. Currently, this code seems to hurt
>> properly
>> >> >>> >> > written
>> >> >>> >> > apps, unnecessarily.
>> >> >>> >> > Will post a patch soon
>> >> >>> >>
>> >> >>> >> --
>> >> >>> >> This message is automatically generated by JIRA.
>> >> >>> >> -
>> >> >>> >> You can reply to this email to add a comment to the issue
>> online.
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> ----------------------------------------------------------------
>> ---
>> >> --
>> >> >>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >> >>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >> >>> >>
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> > --
>> >> >>> > --
>> >> >>> > - Mark
>> >> >>> >
>> >> >>> > http://www.lucidimagination.com
>> >> >>> >
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Robert Muir
>> >> >>> rcmuir@gmail.com
>> >> >>>
>> >> >>> -------------------------------------------------------------------
>> --
>> >> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >> > For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Robert Muir
>> >> rcmuir@gmail.com
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >
>> >
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Robert Muir
rcmuir@gmail.com



RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Uwe Schindler <uw...@thetaphi.de>.
The whole Russian analyzer is very strange and works against all
charset/Unicode conventions. It defines its own "charsets" (the only valid one
is UNICODE), which are all applied to standard Java 16-bit chars. The test
shows how this works: it opens a text file in KOI8 using the "ISO-8859-1"
charset, just so the code points are not modified when converting to 16-bit
Java chars (in principle it does a deprecated "new String(byte[],0)"). These
completely wrong Java chars are then handled by the analyzer's internal
charset conversion (working on the 16-bit chars).

The only correct usage of this package is:
- open the file with the correct encoding (when instantiating the Reader, so
specify KOI8 or windows-1251 to the Reader). The string then consists of
correctly UTF-16 encoded Java chars. On this string the "pseudo-charset"
UNICODE of this analyzer can be used.

In my opinion, this invalid usage of Java chars should be deprecated, the
only correct pseudo-charset should be the one specified by UNICODE, and all
charset conversions should be done using the Reader.
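
To make the difference concrete, here is a minimal sketch (not the actual
test code). The class name Koi8ReaderDemo and the helper readAll are made up
for illustration, and it assumes the JVM ships the KOI8-R charset. It decodes
the same KOI8 bytes once with the correct charset and once with ISO-8859-1:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Koi8ReaderDemo {
    // Hypothetical helper: drain a stream through a Reader that is told
    // the real encoding of the bytes.
    static String readAll(InputStream in, Charset cs) {
        try (Reader reader = new InputStreamReader(in, cs)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: KOI8-R is available in this JVM.
        Charset koi8 = Charset.forName("KOI8-R");
        String word = "\u0442\u0435\u0441\u0442"; // a short Cyrillic word
        byte[] koi8Bytes = word.getBytes(koi8);

        // Correct: the Reader performs the charset conversion.
        String right = readAll(new ByteArrayInputStream(koi8Bytes), koi8);
        // What the test does: each byte becomes a Latin-1 char, so the
        // resulting Java chars are not Cyrillic at all.
        String wrong = readAll(new ByteArrayInputStream(koi8Bytes),
                StandardCharsets.ISO_8859_1);

        System.out.println("KOI8-R Reader round-trips: " + right.equals(word));
        System.out.println("ISO-8859-1 Reader round-trips: " + wrong.equals(word));
    }
}
```

With the charset given to the Reader, the string already holds real Cyrillic
chars, so only the UNICODE pseudo-charset would be needed downstream.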

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Monday, July 06, 2009 3:08 PM
> To: java-dev@lucene.apache.org
> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> excessively in IndexReader and IndexWriter
> 
> Uwe, I think so too. This way it will not be prone to breakage again.
> 
> On Mon, Jul 6, 2009 at 8:38 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> > In my opinion, these files should be converted to UTF-8 and committed again
> > (and the Reader in the test reconfigured for UTF-8). Then they can use the
> > native EOL style again. The problem is that SVN can only handle the EOL
> > style for one-byte-per-char and UTF-8 files.
> >
> > I give it a try here (and I have a converter).
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> >> Sent: Monday, July 06, 2009 1:11 PM
> >> To: java-dev@lucene.apache.org
> >> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> >> excessively in IndexReader and IndexWriter
> >>
> >> yeah, its fixed now.
> >>
> >> On Mon, Jul 6, 2009 at 7:06 AM, Michael
> >> McCandless<lu...@mikemccandless.com> wrote:
> >> > Is this the native vs LF svn:eol-style that Uwe already fixed?
> >> >
> >> > Mike
> >> >
> >> > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
> >> >> Can somebody try to revert the change and test it on Windows?
> >> >>
> >> >> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com>
> wrote:
> >> >>>
> >> >>> well then I have no idea why it doesn't fail. Except that perhaps
> its
> >> >>> EOL-related (as Shai said), and that the failure is somehow
> >> >>> platform-dependent due to newline differences between windows and
> unix
> >> >>> (and the way these are encoded in UTF-16/stored in SVN)?
> >> >>>
> >> >>> I don't do really any work with files in UTF-16 so this is just a
> >> theory.
> >> >>>
> >> >>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com>
> >> wrote:
> >> >>> > Hudson runs all the tests and emails java-dev if any of them
> fail.
> >> >>> >
> >> >>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA)
> <ji...@apache.org>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >>    [
> >> >>> >>
> >> >>> >> https://issues.apache.org/jira/browse/LUCENE-
> >> 1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-
> >> tabpanel&focusedCommentId=12726479#action_12726479
> >> >>> >> ]
> >> >>> >>
> >> >>> >> Robert Muir commented on LUCENE-1707:
> >> >>> >> -------------------------------------
> >> >>> >>
> >> >>> >> bq. Why doesn't Hudson encounter this problem?
> >> >>> >>
> >> >>> >> Forgive my ignorance, does hudson also run tests or just verify
> >> build?
> >> >>> >> These files are only used in tests!
> >> >>> >>
> >> >>> >> I agree we should correct it, and perhaps to prevent other
> problems
> >> >>> >> these
> >> >>> >> files should be converted to UTF-8.
> >> >>> >>
> >> >>> >> For the record I am still confused about these java-code
> analyzers
> >> that
> >> >>> >> implement snowball algorithms, why do they exist when the same
> >> >>> >> functionality
> >> >>> >> is in contrib/snowball?
> >> >>> >>
> >> >>> >>
> >> >>> >> > Don't use ensureOpen() excessively in IndexReader and
> IndexWriter
> >> >>> >> > --------------------------------------------------------------
> ---
> >> >>> >> >
> >> >>> >> >                 Key: LUCENE-1707
> >> >>> >> >                 URL:
> >> >>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
> >> >>> >> >             Project: Lucene - Java
> >> >>> >> >          Issue Type: Improvement
> >> >>> >> >          Components: Index
> >> >>> >> >            Reporter: Shai Erera
> >> >>> >> >             Fix For: 2.9
> >> >>> >> >
> >> >>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > A spin off from here:
> >> >>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-
> >> td24127806.html.
> >> >>> >> > We should stop calling this method when it's not necessary for
> >> any
> >> >>> >> > internal Lucene code. Currently, this code seems to hurt
> properly
> >> >>> >> > written
> >> >>> >> > apps, unnecessarily.
> >> >>> >> > Will post a patch soon
> >> >>> >>
> >> >>> >> --
> >> >>> >> This message is automatically generated by JIRA.
> >> >>> >> -
> >> >>> >> You can reply to this email to add a comment to the issue
> online.
> >> >>> >>
> >> >>> >>
> >> >>> >> ----------------------------------------------------------------
> ---
> >> --
> >> >>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> >>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >>> >>
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> > --
> >> >>> > --
> >> >>> > - Mark
> >> >>> >
> >> >>> > http://www.lucidimagination.com
> >> >>> >
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Robert Muir
> >> >>> rcmuir@gmail.com
> >> >>>
> >> >>> -------------------------------------------------------------------
> --
> >> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >>>
> >> >>
> >> >>
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Robert Muir
> >> rcmuir@gmail.com
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> 
> 
> 
> --
> Robert Muir
> rcmuir@gmail.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org





Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Robert Muir <rc...@gmail.com>.
Uwe, I think so too. This way it will not be prone to breakage again.

On Mon, Jul 6, 2009 at 8:38 AM, Uwe Schindler<uw...@thetaphi.de> wrote:
> In my opinion, these files should be converted to UTF-8 and committed again
> (and the Reader in the test reconfigured for UTF-8). Then they can use the
> native EOL style again. The problem is that SVN can only handle the EOL style
> for one-byte-per-char and UTF-8 files.
>
> I give it a try here (and I have a converter).
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Robert Muir [mailto:rcmuir@gmail.com]
>> Sent: Monday, July 06, 2009 1:11 PM
>> To: java-dev@lucene.apache.org
>> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
>> excessively in IndexReader and IndexWriter
>>
>> yeah, its fixed now.
>>
>> On Mon, Jul 6, 2009 at 7:06 AM, Michael
>> McCandless<lu...@mikemccandless.com> wrote:
>> > Is this the native vs LF svn:eol-style that Uwe already fixed?
>> >
>> > Mike
>> >
>> > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
>> >> Can somebody try to revert the change and test it on Windows?
>> >>
>> >> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
>> >>>
>> >>> well then I have no idea why it doesn't fail. Except that perhaps its
>> >>> EOL-related (as Shai said), and that the failure is somehow
>> >>> platform-dependent due to newline differences between windows and unix
>> >>> (and the way these are encoded in UTF-16/stored in SVN)?
>> >>>
>> >>> I don't do really any work with files in UTF-16 so this is just a
>> theory.
>> >>>
>> >>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com>
>> wrote:
>> >>> > Hudson runs all the tests and emails java-dev if any of them fail.
>> >>> >
>> >>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org>
>> >>> > wrote:
>> >>> >>
>> >>> >>    [
>> >>> >>
>> >>> >> https://issues.apache.org/jira/browse/LUCENE-
>> 1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-
>> tabpanel&focusedCommentId=12726479#action_12726479
>> >>> >> ]
>> >>> >>
>> >>> >> Robert Muir commented on LUCENE-1707:
>> >>> >> -------------------------------------
>> >>> >>
>> >>> >> bq. Why doesn't Hudson encounter this problem?
>> >>> >>
>> >>> >> Forgive my ignorance, does hudson also run tests or just verify
>> build?
>> >>> >> These files are only used in tests!
>> >>> >>
>> >>> >> I agree we should correct it, and perhaps to prevent other problems
>> >>> >> these
>> >>> >> files should be converted to UTF-8.
>> >>> >>
>> >>> >> For the record I am still confused about these java-code analyzers
>> that
>> >>> >> implement snowball algorithms, why do they exist when the same
>> >>> >> functionality
>> >>> >> is in contrib/snowball?
>> >>> >>
>> >>> >>
>> >>> >> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
>> >>> >> > -----------------------------------------------------------------
>> >>> >> >
>> >>> >> >                 Key: LUCENE-1707
>> >>> >> >                 URL:
>> >>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
>> >>> >> >             Project: Lucene - Java
>> >>> >> >          Issue Type: Improvement
>> >>> >> >          Components: Index
>> >>> >> >            Reporter: Shai Erera
>> >>> >> >             Fix For: 2.9
>> >>> >> >
>> >>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
>> >>> >> >
>> >>> >> >
>> >>> >> > A spin off from here:
>> >>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-
>> td24127806.html.
>> >>> >> > We should stop calling this method when it's not necessary for
>> any
>> >>> >> > internal Lucene code. Currently, this code seems to hurt properly
>> >>> >> > written
>> >>> >> > apps, unnecessarily.
>> >>> >> > Will post a patch soon
>> >>> >>
>> >>> >> --
>> >>> >> This message is automatically generated by JIRA.
>> >>> >> -
>> >>> >> You can reply to this email to add a comment to the issue online.
>> >>> >>
>> >>> >>
>> >>> >> -------------------------------------------------------------------
>> --
>> >>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >>> >>
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > --
>> >>> > - Mark
>> >>> >
>> >>> > http://www.lucidimagination.com
>> >>> >
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Robert Muir
>> >>> rcmuir@gmail.com
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >>>
>> >>
>> >>
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >
>> >
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Robert Muir
rcmuir@gmail.com



RE: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Uwe Schindler <uw...@thetaphi.de>.
In my opinion, these files should be converted to UTF-8 and committed again
(and the Reader in the test reconfigured for UTF-8). Then they can use the
native EOL style again. The problem is that SVN can only handle the EOL style
for one-byte-per-char and UTF-8 files.

I'll give it a try here (I have a converter).
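
To illustrate why EOL translation is only safe for one-byte-per-char and
UTF-8 files, here is a rough sketch. It assumes a purely byte-oriented
LF -> CRLF rewrite, which is a simplification and not SVN's actual code; the
names EolTranslationDemo, lfToCrlf and survives are made up for this example:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EolTranslationDemo {
    // Assumed model of an EOL-translating tool: it knows nothing about
    // the encoding and inserts a CR byte before every LF byte.
    static byte[] lfToCrlf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == '\n') {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // True if the text, encoded with cs and translated bytewise,
    // still decodes to the same text with CRLF line endings.
    static boolean survives(String text, Charset cs) {
        byte[] translated = lfToCrlf(text.getBytes(cs));
        String decoded = new String(translated, cs);
        return decoded.equals(text.replace("\n", "\r\n"));
    }

    public static void main(String[] args) {
        String text = "\u0434\u0430\n\u043d\u0435\u0442"; // two short Cyrillic lines
        System.out.println("UTF-8 survives: "
                + survives(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16LE survives: "
                + survives(text, StandardCharsets.UTF_16LE));
    }
}
```

In UTF-8 a line feed is the single byte 0x0A and no multi-byte sequence
contains that byte, so the rewrite is harmless. In UTF-16 the line feed is two
bytes, so a lone inserted 0x0D shifts every following byte pair by one.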

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Monday, July 06, 2009 1:11 PM
> To: java-dev@lucene.apache.org
> Subject: Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen()
> excessively in IndexReader and IndexWriter
> 
> yeah, its fixed now.
> 
> On Mon, Jul 6, 2009 at 7:06 AM, Michael
> McCandless<lu...@mikemccandless.com> wrote:
> > Is this the native vs LF svn:eol-style that Uwe already fixed?
> >
> > Mike
> >
> > On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
> >> Can somebody try to revert the change and test it on Windows?
> >>
> >> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
> >>>
> >>> well then I have no idea why it doesn't fail. Except that perhaps its
> >>> EOL-related (as Shai said), and that the failure is somehow
> >>> platform-dependent due to newline differences between windows and unix
> >>> (and the way these are encoded in UTF-16/stored in SVN)?
> >>>
> >>> I don't do really any work with files in UTF-16 so this is just a
> theory.
> >>>
> >>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com>
> wrote:
> >>> > Hudson runs all the tests and emails java-dev if any of them fail.
> >>> >
> >>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org>
> >>> > wrote:
> >>> >>
> >>> >>    [
> >>> >>
> >>> >> https://issues.apache.org/jira/browse/LUCENE-
> 1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-
> tabpanel&focusedCommentId=12726479#action_12726479
> >>> >> ]
> >>> >>
> >>> >> Robert Muir commented on LUCENE-1707:
> >>> >> -------------------------------------
> >>> >>
> >>> >> bq. Why doesn't Hudson encounter this problem?
> >>> >>
> >>> >> Forgive my ignorance, does hudson also run tests or just verify
> build?
> >>> >> These files are only used in tests!
> >>> >>
> >>> >> I agree we should correct it, and perhaps to prevent other problems
> >>> >> these
> >>> >> files should be converted to UTF-8.
> >>> >>
> >>> >> For the record I am still confused about these java-code analyzers
> that
> >>> >> implement snowball algorithms, why do they exist when the same
> >>> >> functionality
> >>> >> is in contrib/snowball?
> >>> >>
> >>> >>
> >>> >> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
> >>> >> > -----------------------------------------------------------------
> >>> >> >
> >>> >> >                 Key: LUCENE-1707
> >>> >> >                 URL:
> >>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
> >>> >> >             Project: Lucene - Java
> >>> >> >          Issue Type: Improvement
> >>> >> >          Components: Index
> >>> >> >            Reporter: Shai Erera
> >>> >> >             Fix For: 2.9
> >>> >> >
> >>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
> >>> >> >
> >>> >> >
> >>> >> > A spin off from here:
> >>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-
> td24127806.html.
> >>> >> > We should stop calling this method when it's not necessary for
> any
> >>> >> > internal Lucene code. Currently, this code seems to hurt properly
> >>> >> > written
> >>> >> > apps, unnecessarily.
> >>> >> > Will post a patch soon
> >>> >>
> >>> >> --
> >>> >> This message is automatically generated by JIRA.
> >>> >> -
> >>> >> You can reply to this email to add a comment to the issue online.
> >>> >>
> >>> >>
> >>> >> -------------------------------------------------------------------
> --
> >>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > --
> >>> > - Mark
> >>> >
> >>> > http://www.lucidimagination.com
> >>> >
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Robert Muir
> >>> rcmuir@gmail.com
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>>
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> 
> 
> 
> --
> Robert Muir
> rcmuir@gmail.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org





Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Robert Muir <rc...@gmail.com>.
Yeah, it's fixed now.

On Mon, Jul 6, 2009 at 7:06 AM, Michael
McCandless<lu...@mikemccandless.com> wrote:
> Is this the native vs LF svn:eol-style that Uwe already fixed?
>
> Mike
>
> On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
>> Can somebody try to revert the change and test it on Windows?
>>
>> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
>>>
>>> Well, then I have no idea why it doesn't fail, except that perhaps it's
>>> EOL-related (as Shai said), and that the failure is somehow
>>> platform-dependent due to newline differences between Windows and Unix
>>> (and the way these are encoded in UTF-16/stored in SVN)?
>>>
>>> I don't really do any work with files in UTF-16, so this is just a theory.
>>>
>>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com> wrote:
>>> > Hudson runs all the tests and emails java-dev if any of them fail.
>>> >
>>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org>
>>> > wrote:
>>> >>
>>> >>    [
>>> >>
>>> >> https://issues.apache.org/jira/browse/LUCENE-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726479#action_12726479
>>> >> ]
>>> >>
>>> >> Robert Muir commented on LUCENE-1707:
>>> >> -------------------------------------
>>> >>
>>> >> bq. Why doesn't Hudson encounter this problem?
>>> >>
>>> >> Forgive my ignorance, does hudson also run tests or just verify build?
>>> >> These files are only used in tests!
>>> >>
>>> >> I agree we should correct it, and perhaps to prevent other problems
>>> >> these
>>> >> files should be converted to UTF-8.
>>> >>
>>> >> For the record I am still confused about these java-code analyzers that
>>> >> implement snowball algorithms, why do they exist when the same
>>> >> functionality
>>> >> is in contrib/snowball?
>>> >>
>>> >>
>>> >> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
>>> >> > -----------------------------------------------------------------
>>> >> >
>>> >> >                 Key: LUCENE-1707
>>> >> >                 URL:
>>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
>>> >> >             Project: Lucene - Java
>>> >> >          Issue Type: Improvement
>>> >> >          Components: Index
>>> >> >            Reporter: Shai Erera
>>> >> >             Fix For: 2.9
>>> >> >
>>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
>>> >> >
>>> >> >
>>> >> > A spin off from here:
>>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-td24127806.html.
>>> >> > We should stop calling this method when it's not necessary for any
>>> >> > internal Lucene code. Currently, this code seems to hurt properly
>>> >> > written
>>> >> > apps, unnecessarily.
>>> >> > Will post a patch soon
>>> >>
>>> >> --
>>> >> This message is automatically generated by JIRA.
>>> >> -
>>> >> You can reply to this email to add a comment to the issue online.
>>> >>
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > --
>>> > - Mark
>>> >
>>> > http://www.lucidimagination.com
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Robert Muir
rcmuir@gmail.com



Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Michael McCandless <lu...@mikemccandless.com>.
Is this the native vs LF svn:eol-style that Uwe already fixed?

Mike

On Thu, Jul 2, 2009 at 10:03 AM, Shai Erera<se...@gmail.com> wrote:
> Can somebody try to revert the change and test it on Windows?
>
> On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
>>
>> Well, then I have no idea why it doesn't fail, except that perhaps it's
>> EOL-related (as Shai said), and that the failure is somehow
>> platform-dependent due to newline differences between Windows and Unix
>> (and the way these are encoded in UTF-16/stored in SVN)?
>>
>> I don't really do any work with files in UTF-16, so this is just a theory.
>>
>> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com> wrote:
>> > Hudson runs all the tests and emails java-dev if any of them fail.
>> >
>> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org>
>> > wrote:
>> >>
>> >>    [
>> >>
>> >> https://issues.apache.org/jira/browse/LUCENE-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726479#action_12726479
>> >> ]
>> >>
>> >> Robert Muir commented on LUCENE-1707:
>> >> -------------------------------------
>> >>
>> >> bq. Why doesn't Hudson encounter this problem?
>> >>
>> >> Forgive my ignorance, does hudson also run tests or just verify build?
>> >> These files are only used in tests!
>> >>
>> >> I agree we should correct it, and perhaps to prevent other problems
>> >> these
>> >> files should be converted to UTF-8.
>> >>
>> >> For the record I am still confused about these java-code analyzers that
>> >> implement snowball algorithms, why do they exist when the same
>> >> functionality
>> >> is in contrib/snowball?
>> >>
>> >>
>> >> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
>> >> > -----------------------------------------------------------------
>> >> >
>> >> >                 Key: LUCENE-1707
>> >> >                 URL:
>> >> > https://issues.apache.org/jira/browse/LUCENE-1707
>> >> >             Project: Lucene - Java
>> >> >          Issue Type: Improvement
>> >> >          Components: Index
>> >> >            Reporter: Shai Erera
>> >> >             Fix For: 2.9
>> >> >
>> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
>> >> >
>> >> >
>> >> > A spin off from here:
>> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-td24127806.html.
>> >> > We should stop calling this method when it's not necessary for any
>> >> > internal Lucene code. Currently, this code seems to hurt properly
>> >> > written
>> >> > apps, unnecessarily.
>> >> > Will post a patch soon
>> >>
>> >> --
>> >> This message is automatically generated by JIRA.
>> >> -
>> >> You can reply to this email to add a comment to the issue online.
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > --
>> > - Mark
>> >
>> > http://www.lucidimagination.com
>> >
>> >
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>



Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Shai Erera <se...@gmail.com>.
Can somebody try to revert the change and test it on Windows?

On Thu, Jul 2, 2009 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:

> Well, then I have no idea why it doesn't fail, except that perhaps it's
> EOL-related (as Shai said), and that the failure is somehow
> platform-dependent due to newline differences between Windows and Unix
> (and the way these are encoded in UTF-16/stored in SVN)?
>
> I don't really do any work with files in UTF-16, so this is just a theory.
>
> On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com> wrote:
> > Hudson runs all the tests and emails java-dev if any of them fail.
> >
> > On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org> wrote:
> >>
> >>    [
> >>
> >> https://issues.apache.org/jira/browse/LUCENE-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726479#action_12726479
> >> ]
> >>
> >> Robert Muir commented on LUCENE-1707:
> >> -------------------------------------
> >>
> >> bq. Why doesn't Hudson encounter this problem?
> >>
> >> Forgive my ignorance, does hudson also run tests or just verify build?
> >> These files are only used in tests!
> >>
> >> I agree we should correct it, and perhaps to prevent other problems these
> >> files should be converted to UTF-8.
> >>
> >> For the record I am still confused about these java-code analyzers that
> >> implement snowball algorithms, why do they exist when the same functionality
> >> is in contrib/snowball?
> >>
> >>
> >> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
> >> > -----------------------------------------------------------------
> >> >
> >> >                 Key: LUCENE-1707
> >> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1707
> >> >             Project: Lucene - Java
> >> >          Issue Type: Improvement
> >> >          Components: Index
> >> >            Reporter: Shai Erera
> >> >             Fix For: 2.9
> >> >
> >> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
> >> >
> >> >
> >> > A spin off from here:
> >> > http://www.nabble.com/Excessive-use-of-ensureOpen()-td24127806.html.
> >> > We should stop calling this method when it's not necessary for any
> >> > internal Lucene code. Currently, this code seems to hurt properly written
> >> > apps, unnecessarily.
> >> > Will post a patch soon
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> -
> >> You can reply to this email to add a comment to the issue online.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>
> >
> >
> >
> > --
> > --
> > - Mark
> >
> > http://www.lucidimagination.com
> >
> >
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: [jira] Commented: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

Posted by Robert Muir <rc...@gmail.com>.
Well, then I have no idea why it doesn't fail, except that perhaps it's
EOL-related (as Shai said), and that the failure is somehow
platform-dependent due to newline differences between Windows and Unix
(and the way these are encoded in UTF-16/stored in SVN)?

I don't really do any work with files in UTF-16, so this is just a theory.
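
To make the theory above a bit more concrete, here is a minimal sketch. It
assumes a tool that rewrites LF as CRLF byte by byte, without knowing the
file's encoding. The names EolTranslationDemo, naiveLfToCrlf and roundTrip are
made up for illustration; this is not Lucene or SVN code, and it does not
confirm what actually happened to the test files. It only shows why such a
translation could be harmless for UTF-8 yet damage UTF-16.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EolTranslationDemo {

    // Hypothetical encoding-unaware converter: inserts a 0x0D byte
    // before every 0x0A byte, as a byte-level LF -> CRLF rewrite would.
    static byte[] naiveLfToCrlf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : in) {
            if (b == 0x0A) {
                out.write(0x0D);
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // Encode the text, run the naive converter over the bytes,
    // then decode again with the same charset.
    static String roundTrip(String text, Charset cs) {
        return new String(naiveLfToCrlf(text.getBytes(cs)), cs);
    }

    public static void main(String[] args) {
        String text = "a\nb";
        String expected = "a\r\nb";

        // UTF-8: LF is the single byte 0x0A, so the rewrite yields a valid CRLF.
        System.out.println("UTF-8 intact:    "
                + roundTrip(text, StandardCharsets.UTF_8).equals(expected));

        // UTF-16BE: LF is the byte pair 00 0A. Inserting one extra byte
        // shifts every following 2-byte unit, so the rest decodes as garbage.
        System.out.println("UTF-16BE intact: "
                + roundTrip(text, StandardCharsets.UTF_16BE).equals(expected));
    }
}
```

If something like this were in play, a checkout with native line endings on
Windows would read different characters from the same UTF-16 file than a
checkout on Unix, which would fit a failure that shows up on one platform
only.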

On Thu, Jul 2, 2009 at 9:40 AM, Mark Miller<ma...@gmail.com> wrote:
> Hudson runs all the tests and emails java-dev if any of them fail.
>
> On Thu, Jul 2, 2009 at 9:37 AM, Robert Muir (JIRA) <ji...@apache.org> wrote:
>>
>>    [
>> https://issues.apache.org/jira/browse/LUCENE-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726479#action_12726479
>> ]
>>
>> Robert Muir commented on LUCENE-1707:
>> -------------------------------------
>>
>> bq. Why doesn't Hudson encounter this problem?
>>
>> Forgive my ignorance, does hudson also run tests or just verify build?
>> These files are only used in tests!
>>
>> I agree we should correct it, and perhaps to prevent other problems these
>> files should be converted to UTF-8.
>>
>> For the record I am still confused about these java-code analyzers that
>> implement snowball algorithms, why do they exist when the same functionality
>> is in contrib/snowball?
>>
>>
>> > Don't use ensureOpen() excessively in IndexReader and IndexWriter
>> > -----------------------------------------------------------------
>> >
>> >                 Key: LUCENE-1707
>> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1707
>> >             Project: Lucene - Java
>> >          Issue Type: Improvement
>> >          Components: Index
>> >            Reporter: Shai Erera
>> >             Fix For: 2.9
>> >
>> >         Attachments: LUCENE-1707.patch, LUCENE-1707.patch
>> >
>> >
>> > A spin off from here:
>> > http://www.nabble.com/Excessive-use-of-ensureOpen()-td24127806.html.
>> > We should stop calling this method when it's not necessary for any
>> > internal Lucene code. Currently, this code seems to hurt properly written
>> > apps, unnecessarily.
>> > Will post a patch soon
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
>
> --
> --
> - Mark
>
> http://www.lucidimagination.com
>
>



-- 
Robert Muir
rcmuir@gmail.com
