Posted to issues@spark.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2015/09/17 12:08:46 UTC

[jira] [Comment Edited] (SPARK-10614) SystemClock uses non-monotonic time in its wait logic

    [ https://issues.apache.org/jira/browse/SPARK-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14768900#comment-14768900 ] 

Steve Loughran edited comment on SPARK-10614 at 9/17/15 10:08 AM:
------------------------------------------------------------------

Having done a little more detailed research on the current state of this clock, I'm now having doubts about this.

On x86, it's generally assumed that {{System.nanoTime()}} uses the {{TSC}} counter to get the timestamp, which is fast and only goes forwards (albeit at a rate which depends on CPU power states). But it turns out that on many-core CPUs the {{TSC}} could give different answers on different cores, so the OS may use alternative mechanisms to return a counter, and those may be neither monotonic nor fast.

# [Inside the Hotspot VM: Clocks, Timers and Scheduling Events - Part I - Windows|https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks]
# [JDK-6440250 : On Windows System.nanoTime() may be 25x slower than System.currentTimeMillis()|http://bugs.java.com/view_bug.do?bug_id=6440250]
# [JDK-6458294 : nanoTime affected by system clock change on Linux (RH9) or in general lacks monotonicity|http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6458294]
# [Redhat on timestamps in Linux|https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/2/html/Realtime_Reference_Guide/chap-Realtime_Reference_Guide-Timestamping.html]
# [Timekeeping in VMware Virtual Machines|http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf]

These docs imply that nanoTime may be fast but unreliable on multi-socket systems (the latest many-core parts share one counter), and may downgrade to something slower than calls to getTimeMillis(), or even to something that isn't guaranteed to be monotonic.
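
A quick way to see how a given host behaves is a probe along these lines. This is only a rough sketch and not taken from the linked docs: the class and method names are made up for illustration, the timing loop is a crude microbenchmark rather than a proper harness, and because it runs on a single thread it will only notice cross-core skew if the OS happens to migrate the thread mid-run.

```java
/** Rough, single-threaded probe of System.nanoTime() on the current host (illustrative names). */
final class NanoTimeProbe {

    // Written on every timed call so the JIT cannot discard the clock reads.
    private static volatile long sink;

    /** Counts consecutive nanoTime() readings that go backwards on the calling thread. */
    static long countBackwardSteps(int samples) {
        long backwards = 0;
        long previous = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            long now = System.nanoTime();
            if (now - previous < 0) { // compare by difference, as the nanoTime javadoc advises
                backwards++;
            }
            previous = now;
        }
        return backwards;
    }

    /** Crude average cost, in nanoseconds, of one call to the chosen clock (calls must be > 0). */
    static double nanosPerCall(boolean useNanoTime, int calls) {
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink = useNanoTime ? System.nanoTime() : System.currentTimeMillis();
        }
        return (System.nanoTime() - start) / (double) calls;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        nanosPerCall(true, n);  // warm-up passes so the JIT compiles the loop first
        nanosPerCall(false, n);
        System.out.println("backward steps:          " + countBackwardSteps(n));
        System.out.println("ns per nanoTime():       " + nanosPerCall(true, n));
        System.out.println("ns per currentTimeMillis(): " + nanosPerCall(false, n));
    }
}
```

A zero count of backward steps here doesn't prove monotonicity; it only fails to disprove it on that run.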

It's not clear that, on deployments of physical many-core systems, moving to nanoTime actually offers much. I don't know about EC2 or other cloud infrastructures though.
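
For reference, moving the wait logic to nanoTime would amount to something like the sketch below. It is written in Java rather than Scala and is not the actual Spark code: {{MonotonicWait}} and {{waitFor}} are hypothetical names, and the loop assumes that {{System.nanoTime()}} really is monotonic on the host, which is exactly the assumption in doubt here.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical wait helper that measures elapsed time with System.nanoTime(). */
final class MonotonicWait {

    /**
     * Blocks until at least waitMillis have elapsed according to nanoTime(),
     * or until the thread is interrupted. Returns the elapsed time in milliseconds.
     */
    static long waitFor(long waitMillis) {
        final long start = System.nanoTime();
        final long waitNanos = TimeUnit.MILLISECONDS.toNanos(waitMillis);
        long elapsed = 0;
        while (elapsed < waitNanos) {
            try {
                // Sleep for what is left; sleep may return early, hence the loop.
                TimeUnit.NANOSECONDS.sleep(waitNanos - elapsed);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                elapsed = System.nanoTime() - start;
                break;
            }
            elapsed = System.nanoTime() - start;
        }
        return TimeUnit.NANOSECONDS.toMillis(elapsed);
    }
}
```

Because the loop works from an elapsed interval rather than a wall-clock target, a step change in the system clock doesn't affect it, but it is only as trustworthy as the counter behind nanoTime.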

Maybe it's just best to WONTFIX this, as that won't raise unrealistic expectations about nanoTime working.


> SystemClock uses non-monotonic time in its wait logic
> -----------------------------------------------------
>
>                 Key: SPARK-10614
>                 URL: https://issues.apache.org/jira/browse/SPARK-10614
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> The consolidated (SPARK-4682) clock uses {{System.currentTimeMillis()}} for measuring time, which means its {{waitTillTime()}} routine is brittle against systems (VMs in particular) whose time can go backwards as well as forward.
> For the {{ExecutorAllocationManager}} this appears to be a regression.


