Posted to java-user@lucene.apache.org by Sascha Janz <Sa...@gmx.net> on 2014/08/05 16:36:18 UTC

Performance StringCoding.decode

Hi,

I want to speed up our search performance, so I ran some tests and monitored them with Java Mission Control.

The analysis showed that one hotspot is:


sun.nio.cs.UTF_8$Decoder.decode(byte[], int, int, char[])
 - java.lang.StringCoding.decode(Charset, byte[], int, int)
  - java.lang.String.<init>(byte[], int, int, Charset)
   - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(DataInput, StoredFieldVisitor, FieldInfo, int)
    - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(int, StoredFieldVisitor)
     - org.apache.lucene.index.SegmentReader.document(int, StoredFieldVisitor)
      - org.apache.lucene.index.IndexReader.document(int, Set)

We use JDK 1.7.55 and Lucene 4.9.0.

Is there a chance to speed this up, or to change something in Lucene's IndexWriterConfig, e.g. use another codec?

We use the default values of IndexWriterConfig.


regards
sascha

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Performance StringCoding.decode

Posted by Erick Erickson <er...@gmail.com>.
Well, that code runs when you're reading the stored fields of documents off disk.
Stored fields are compressed/decompressed automatically.

So one question is what is your test doing? In other words, is it
artificially hitting this? The theory is that this should only be done when
you gather the final top N docs to return to the user (i.e. for each doc in
the &rows= parameter if you were coming in from Solr).

There's no simple configuration setting to turn off compression that I know
of. Then again, at the Lucene level I'm pretty clueless.

Erick
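
Erick's point about loading stored fields only for the final top N docs can be illustrated with a small JDK-only sketch. This is not Lucene code: the names TopNFetch and fetchPage, and the loader callback, are hypothetical. The loader stands in for whatever loads the stored fields of one doc id. The sketch needs Java 8 or later because of IntFunction. The only point it makes is that the expensive load runs once per displayed row, not once per hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical helper: load stored fields only for the rows that are actually shown. */
final class TopNFetch {
    private TopNFetch() {}

    /**
     * @param rankedDocIds doc ids of all hits, already in result order
     * @param start        index of the first row to display
     * @param rows         maximum number of rows to display
     * @param loader       stand-in for the expensive stored-field load of one document
     */
    static <T> List<T> fetchPage(int[] rankedDocIds, int start, int rows, IntFunction<T> loader) {
        int from = Math.max(0, Math.min(start, rankedDocIds.length));
        // long arithmetic so that a huge "rows" value cannot overflow
        int to = (int) Math.min((long) rankedDocIds.length, (long) from + Math.max(0, rows));
        List<T> page = new ArrayList<>(to - from);
        for (int i = from; i < to; i++) {
            page.add(loader.apply(rankedDocIds[i])); // one load per displayed row
        }
        return page;
    }
}
```

With 1000 hits and rows set to 5, the loader is called 5 times, however many hits the query matched.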


On Tue, Aug 5, 2014 at 8:41 AM, dizh@neusoft.com <di...@neusoft.com> wrote:

> How do you monitor this? Do you use JProfiler?
Aw: RE: RE: Re: Performance StringCoding.decode

Posted by Sascha Janz <Sa...@gmx.net>.
We use JDK 1.7.55 and Lucene 4.9.0.

Sascha
 
 

Sent: Wednesday, 06 August 2014, 18:11
From: "Uwe Schindler" <uw...@thetaphi.de>
To: java-user@lucene.apache.org
Subject: RE: RE: Re: Performance StringCoding.decode
What Java version are you using? In Java 7 decoding of bytes to strings should be fast.

Uwe



RE: RE: Re: Performance StringCoding.decode

Posted by Uwe Schindler <uw...@thetaphi.de>.
What Java version are you using? In Java 7 decoding of bytes to strings should be fast.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Sascha Janz [mailto:Sascha.Janz@gmx.net]
> Sent: Wednesday, August 06, 2014 5:57 PM
> To: java-user@lucene.apache.org
> Subject: Aw: RE: Re: Performance StringCoding.decode
> 
> 
> 
> Hi,
>
> No, not for all results, but users can configure the result list size up to 100
> documents.
>
> I was already afraid that this is a point where there is nothing I can do to
> optimize.
>
> The call for reading the docs comes from IndexSearcher.document(int n).
>
> I also tried loading only specific fields with IndexSearcher.document(int n,
> Set<String>), but this made no difference.
>
> We have about 10 TextFields per document, with one "large" body element
> (the email content).
>
> The results are sorted by timestamp, which is defined like this:
>
> doc.add(new Field("timestamp", Long.toString(timestamp), Field.Store.YES,
> Field.Index.NOT_ANALYZED, Field.TermVector.NO));
> 
> greetings
> sascha
> 
> 


Aw: RE: Re: Performance StringCoding.decode

Posted by Sascha Janz <Sa...@gmx.net>.

Hi,

No, not for all results, but users can configure the result list size up to 100 documents.

I was already afraid that this is a point where there is nothing I can do to optimize.

The call for reading the docs comes from IndexSearcher.document(int n).

I also tried loading only specific fields with IndexSearcher.document(int n, Set<String>), but this made no difference.

We have about 10 TextFields per document, with one "large" body element (the email content).

The results are sorted by timestamp, which is defined like this:

doc.add(new Field("timestamp", Long.toString(timestamp), Field.Store.YES, Field.Index.NOT_ANALYZED, Field.TermVector.NO));

greetings 
sascha
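
On the Set<String> variant mentioned above: here is a minimal JDK-only sketch of how restricting the field set can avoid the String decode for skipped fields while the whole record still has to be read. It is a toy model under stated assumptions, not Lucene's stored-fields format; FieldSelectSketch, encode and load are hypothetical names. As far as I understand the Lucene reader, it decompresses the data for a document first and only then asks the visitor which fields it wants. So if the large body field is among the requested fields, or if decompression dominates, a field set may make little measurable difference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy record format: a field count, then per field a name, a byte length and UTF-8 bytes. */
final class FieldSelectSketch {
    private FieldSelectSketch() {}

    /** Serializes the fields of one document into a single byte array. */
    static byte[] encode(Map<String, String> fields) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                byte[] value = e.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeUTF(e.getKey());
                out.writeInt(value.length);
                out.write(value);
            }
        }
        return bytes.toByteArray();
    }

    /** Decodes only the wanted fields; the value bytes of all other fields are skipped. */
    static Map<String, String> load(byte[] record, Set<String> wanted) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            int fieldCount = in.readInt();
            for (int i = 0; i < fieldCount; i++) {
                String name = in.readUTF();
                int length = in.readInt();
                if (wanted.contains(name)) {
                    byte[] value = new byte[length];
                    in.readFully(value);
                    // Same kind of call as the String.<init> frame in the reported trace.
                    result.put(name, new String(value, 0, length, StandardCharsets.UTF_8));
                } else {
                    skipFully(in, length); // the value is never turned into a String
                }
            }
        }
        return result;
    }

    /** Skips exactly length bytes or fails if the record ends early. */
    private static void skipFully(DataInputStream in, int length) throws IOException {
        int remaining = length;
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped <= 0) {
                throw new IOException("unexpected end of record");
            }
            remaining -= skipped;
        }
    }
}
```

Loading with a set that leaves out "body" returns the small fields only; the body bytes are stepped over without creating a String.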
 

Sent: Wednesday, 06 August 2014, 10:50
From: "Uwe Schindler" <uw...@thetaphi.de>
To: java-user@lucene.apache.org
Subject: RE: Re: Performance StringCoding.decode
Hi,

It looks like you are fetching the stored fields of *all* search results. In general, Lucene is made to return the most relevant documents to the user. Fetching stored fields is then done only for the top-ranking results, e.g. the top 10. If you do this for all results (which can be thousands), this is of course a performance problem: the stored fields are compressed on disk, and after decompression the bytes have to be converted to UTF-16 Java Strings. There is not much Lucene can do about that.

If you use stored fields for ranking purposes (inside function queries), you should change them to numeric docvalues fields.

Uwe



RE: Re: Performance StringCoding.decode

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

It looks like you are fetching the stored fields of *all* search results. In general, Lucene is made to return the most relevant documents to the user. Fetching stored fields is then done only for the top-ranking results, e.g. the top 10. If you do this for all results (which can be thousands), this is of course a performance problem: the stored fields are compressed on disk, and after decompression the bytes have to be converted to UTF-16 Java Strings. There is not much Lucene can do about that.

If you use stored fields for ranking purposes (inside function queries), you should change them to numeric docvalues fields.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
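
The two steps described in this message (decompress the stored bytes, then convert them to a Java String) can be sketched with the JDK alone. This is only an illustration under stated assumptions: DEFLATE from java.util.zip is a stand-in, Lucene's own compression scheme is not reproduced here, and the class and method names (StoredFieldPipelineSketch, compress, decompressAndDecode) are hypothetical. The last line of decompressAndDecode is the same String constructor that appears in the reported trace.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Stand-in pipeline: DEFLATE replaces Lucene's codec so the sketch needs only the JDK. */
final class StoredFieldPipelineSketch {
    private StoredFieldPipelineSketch() {}

    /** Index-time side: encode the text as UTF-8 and compress the bytes. */
    static byte[] compress(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(utf8);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Search-time side: decompress first, then decode UTF-8 bytes into a String. */
    static String decompressAndDecode(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                out.write(buffer, 0, n);
                if (n == 0 && !inflater.finished()) {
                    throw new DataFormatException("truncated or unsupported input");
                }
            }
            byte[] utf8 = out.toByteArray(); // step 1: decompressed bytes
            // step 2: UTF-8 bytes -> UTF-16 String; cost grows with the size of the field
            return new String(utf8, 0, utf8.length, StandardCharsets.UTF_8);
        } finally {
            inflater.end();
        }
    }
}
```

Both steps do work proportional to the size of the stored value, which is why one large body field per document can dominate the profile even for a result list of 100 documents.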

> -----Original Message-----
> From: Sascha Janz [mailto:Sascha.Janz@gmx.net]
> Sent: Wednesday, August 06, 2014 10:27 AM
> To: java-user@lucene.apache.org
> Subject: Aw: Re: Performance StringCoding.decode
> 
> I used JMC (Java Mission Control), which comes with JDK 7u40 and later.
>
> See here:
>
> http://www.oracle.com/technetwork/java/javase/2col/jmc-relnotes-2004763.html
> 
> 
> 


Aw: Re: Performance StringCoding.decode

Posted by Sascha Janz <Sa...@gmx.net>.
I used JMC (Java Mission Control), which comes with JDK 7u40 and later.

See here:

http://www.oracle.com/technetwork/java/javase/2col/jmc-relnotes-2004763.html
 
 

Sent: Tuesday, 05 August 2014, 17:41
From: "dizh@neusoft.com" <di...@neusoft.com>
To: "java-user@lucene.apache.org" <ja...@lucene.apache.org>
Subject: Re: Performance StringCoding.decode
How do you monitor this? Do you use JProfiler?







Re: Performance StringCoding.decode

Posted by "dizh@neusoft.com" <di...@neusoft.com>.
How do you monitor this? Do you use JProfiler?




 
From: Sascha Janz
Date: 2014-08-05 22:36
To: java-user@lucene.apache.org
Subject: Performance StringCoding.decode
Hi,

I want to speed up our search performance, so I ran some tests and monitored them with Java Mission Control.

The analysis showed that one hotspot is:

sun.nio.cs.UTF_8$Decoder.decode(byte[], int, int, char[])
 - java.lang.StringCoding.decode(Charset, byte[], int, int)
  - java.lang.String.<init>(byte[], int, int, Charset)
   - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(DataInput, StoredFieldVisitor, FieldInfo, int)
    - org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(int, StoredFieldVisitor)
     - org.apache.lucene.index.SegmentReader.document(int, StoredFieldVisitor)
      - org.apache.lucene.index.IndexReader.document(int, Set)

We use JDK 1.7.55 and Lucene 4.9.0.

Is there a chance to speed this up, or to change something in Lucene's IndexWriterConfig, e.g. use another codec?

We use the default values of IndexWriterConfig.
 
 
regards
sascha
 
 
---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) 
is intended only for the use of the intended recipient and may be confidential and/or privileged of 
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is 
not the intended recipient, unauthorized use, forwarding, printing,  storing, disclosure or copying 
is strictly prohibited, and may be unlawful.If you have received this communication in error,please 
immediately notify the sender by return e-mail, and delete the original message and all copies from 
your system. Thank you. 
---------------------------------------------------------------------------------------------------