You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by "Christopher Currens (JIRA)" <ji...@apache.org> on 2012/06/15 20:04:42 UTC

[jira] [Created] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Christopher Currens created LUCENENET-495:
---------------------------------------------

             Summary: Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
                 Key: LUCENENET-495
                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
             Project: Lucene.Net
          Issue Type: Bug
          Components: Lucene.Net Core
    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
            Reporter: Christopher Currens
            Assignee: Christopher Currens
            Priority: Critical
             Fix For: Lucene.Net 3.0.3


This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.

In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.

Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.

A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.

I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Itamar Syn-Hershko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393633#comment-13393633 ] 

Itamar Syn-Hershko commented on LUCENENET-495:
----------------------------------------------

Makes sense
                
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Christopher Currens (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393631#comment-13393631 ] 

Christopher Currens commented on LUCENENET-495:
-----------------------------------------------

I did fix the thread-safety bug.  The code was checking if a key existed in the (synchronized) HashTable, and then tried to add it.  Because there was no locking, there was the scenario when two threads would check if the key existed at the same time, then both add it within a few instructions of each other, causing one to throw an ArgumentException, because the key already existed.

In all of the code code, we are using the correct types we should be (I think).  This is code in the test suite that hasn't ever been updated.  In fact, it really should be a HashSet and not a HashTable.  We were using it because at the time, it was pre-.net 3.0, and the only way to match the java code.  We could change it, but IMO, it's not really worth it right now, because it's ONLY used in the test code.  In the next version we're porting, the testing code is significantly different, so I don't want to spend _too_ much time cleaning it up if it works.
                
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Itamar Syn-Hershko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393363#comment-13393363 ] 

Itamar Syn-Hershko commented on LUCENENET-495:
----------------------------------------------

+1

Take a look at DateTimeOffset as well - this becomes the standard for .NET 4 +
                
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Itamar Syn-Hershko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393629#comment-13393629 ] 

Itamar Syn-Hershko commented on LUCENENET-495:
----------------------------------------------

1. IMO, if there is a thread safety bug, it needs to be fixed

2. Why do we have AddIfNotContains(Hashtable, object), and we are not using ConcurrentDictionary?
                
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Christopher Currens (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393624#comment-13393624 ] 

Christopher Currens commented on LUCENENET-495:
-----------------------------------------------

Yeah, I noticed that new structure a few weeks ago.  Definitely more powerful, in that it keeps track of utc offset.  DateTimeOffset.Now actually just calls {{new DateTimeOffset(DateTime.Now)}}, so it doesn't help in this case.

Interestingly enough, this improves speed in Lucene enough, that it has exposed other thread-safety issues in Lucene.  Fortunately, I think it's only affecting code specific to the test suite.  Well, it's actually code in CollectionHelpers, in Lucene.Net.Support, on AddIfNotContains(Hashtable, object).  However, the only usages I could find for that particular method, is in Lucene.Net.Test.
                
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (LUCENENET-495) Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations

Posted by "Christopher Currens (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENENET-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Currens resolved LUCENENET-495.
-------------------------------------------

    Resolution: Fixed
    
> Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENENET-495
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-495
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3
>            Reporter: Christopher Currens
>            Assignee: Christopher Currens
>            Priority: Critical
>             Fix For: Lucene.Net 3.0.3
>
>
> This issue mostly just affects RAMDirectory.  However, RAMFile and RAMOutputStream are used in other (all?) directory implementations, including FSDirectory types.
> In RAMOutputStream, the file last modified property for the RAMFile is updated when the stream is flushed.  It's calculated using {{DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond}}.  I've read before that Microsoft has regretted making DateTime.Now a property instead of a method, and after seeing what it's doing, I'm starting to understand why.  DateTime.Now is returning local time.  In order for it to calculate that, it has to get the utf offset for the machine, which requires the creation of a _class_, System.Globalization.DaylightTime.  This is bad for performance.
> Using code to write 10,000 small documents to an index (4kb sizes), it created 1,570,157 of these DaylightTime classes, a total of 62MB of extra memory...clearly RAMOutputStream.Flush() is called a lot.
> A fix I'd like to propose is to change the RAMFile from storing the LastModified date to UTC instead of local.  DateTime.UtcNow doesn't create any additional objects and is very fast.  For this small benchmark, the performance increase is 31%.
> I've set it to convert to local-time, when {{RAMDirectory.LastModified(string name)}} is called to make sure it has the same behavior (tests fail otherwise).  Are there any other side-effects to making this change?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira