Feature Name: Time
Start Date: 2022-06-01
RFC PR: twizzler-rfcs/rfcs#0003
Twizzler Issue: twizzler-operating-system/twizzler#0024

Summary

This document introduces a time sub-system into Twizzler. The time sub-system lays down the foundation for Twizzler's notion of time and interfaces for user space programs. We introduce a trait that abstracts the hardware, new system calls in the kernel, and types to represent clocks.

Motivation

We care about supporting user space programs, particularly Rust's standard library. A big part of that is providing APIs for time. User programs require services for time to do useful things such as benchmarking.

Rust exposes an interface for time to user space through various types.

Duration: represents span of time (sec and ns, like timespec)
Instant: a monotonic non-decreasing type that is implemented using os-specific system calls
- e.g. clock_gettime(CLOCK_MONOTONIC) for Linux
SystemTime: represents an anchor in time from the unix epoch, non-monotonic
- e.g. clock_gettime(CLOCK_REALTIME) for Linux

To support the current APIs in Rust, we need to provide an implementation of a monotonic clock and system clock. Twizzler currently provides stubs so that Rust can call into the kernel.

The outcome of this work is to:

Provide strong support for user space Rust applications using its time APIs
Develop interfaces for Twizzler native programs
Develop standard interfaces for the kernel to manage the hardware used for clocks.

Guide-level explanation

The time interface exposed to users is traditional in the sense that all mechanisms in the kernel hide behind system calls. However, user programs are free to use libraries built on top of system calls, as is done so in the Rust standard library.

One could imagine a simple Rust program that reads the elapsed time from a monotonic clock, ported over to Twizzler as so:

use std::time::Instant;
fn main() {
    // reads a moment in time
    let now = Instant::now();
    
    // elapsed time since now
    // this returns a Duration type
    let elapsed_time = now.elapsed();
    println!("Running main() took {} nanoseconds.", elapsed_time.as_nanos());
}

Clock and ClockInfo

To support the above functionality, Twizzler exposes a Clock abstraction to user space, and APIs revolve around this and ClockInfo. A Clock is a logical abstraction exposed to users that serves a particular purpose and ticks at a particular pace.


#![allow(unused)]
fn main() {
struct Clock {
    info: ClockInfo,
    group: ClockGroup,
    id: ClockID
}
}

ClockInfo is meant to describe a clock to users. It contains the last value read from the clock and properties of the clock, such as resolution.


#![allow(unused)]
fn main() {
struct ClockInfo {
    value: TimeSpan,
    resolution: FemtoSeconds,
    precision: FemtoSeconds,
}
}

The value is a TimeSpan type meant to represent a span of time. To support high precision clock hardware we represent some duration of time using two u64 values, one for seconds, and another for the remainder in femtoseconds. Each value is its own type meant to represent the units. The TimeSpan type can easily be converted to support legacy timestamps such as Duration or timespec/timeval. TimeSpan exposes an interface similar to Duration.

Thus, ClockInfo can be thought of as a POSIX timespec, but much better. It provides descriptions of what is being read, not just the value. This additional metadata is essential for scientific applications that want to use time as a tool for measuring some event. Applications need to know about the properties of clocks to get accurate measurements and reduce errors in experiments.

We define resolution as the period of a clock tick. In other words, the rate at which a clock advances time. The resolution is expressed in Femtoseconds which is a simple wrapper type for a u64 value. We choose to do this to make it clear what the semantic meaning of the value is. Furthermore, we define precision as the stability of the measurements, measured in Femtoseconds. Another value of interest is accuracy which tells us how close measurements are to the true value, based on some reference. This is useful in determining clock error, and will be explored in a future RFC.


#![allow(unused)]
fn main() {
#[repr(transparent)]
struct Seconds(u64);

#[repr(transparent)]
struct FemtoSeconds(u64);

struct TimeSpan(Seconds, FemtoSeconds);
}

ClockGroup are a set of enums that associate some semantic meaning to a clock. This gives users control of what type of clock they are reading from (when talking to the kernel), and they can expect certain invariants to be maintained, such as a clock being monotonic.


#![allow(unused)]
fn main() {
enum ClockGroup {
    Unknown,
    Monotonic,
    RealTime,
}
}

The kernel internally manages a list of usable clocks backed by hardware. The id identifies which clock source is used when interacting with the Clock. The ClockID is a simple wrapper type around a u64.


#![allow(unused)]
fn main() {
#[repr(transparent)]
struct ClockID(u64);
}

User programs can get time in a variety of ways. Either transparently using Rust's std::time crate, directly through system calls, or indirectly through methods exposed by the Clock type.

System Call Interface

Revisiting our example from earlier, let's see how it would work when performing a system call:

use crate twizzler_abi::syscall::sys_read_clock_info;
use crate clock::{ClockInfo, ClockSource, ReadClockFlags};

fn main() {
    // reads a moment in time
    // returns a ClockInfo type
    let now = sys_read_clock_info(ClockSource::BestMonotonic, ReadClockFlags::empty());
    
    // elapsed time since now
    let later = sys_read_clock_info(ClockSource::BestMonotonic, ReadClockFlags::empty());
    // ClockInfo.value() returns a TimeSpan type
    let elapsed_time = later.value() - now.value();
    println!("Running main() took {} nanoseconds.", elapsed_time.as_nanos());
}

A few things here. For starters, sys_read_clock_info, a new system call. We pass in the clock we want (ClockSource), and some flags. By default, this returns a ClockInfo object with all fields filled in. The returned ClockInfo, in this case, is generated from a ClockGroup specific clock. We will get to the use of ClockID later.


#![allow(unused)]
fn main() {
enum ClockSource {
  BestMonotonic,
  BestRealTime,
  ID(ClockID)
}
}

There might be more than one piece of hardware that can be used to serve as the backing for a specific ClockGroup. Hence, sys_read_clock_info returns a value read from the best clock source available. The semantic meanings of ClockSource map directly to ClockGroup.

Functionally, this is the same program, except it uses different abstractions. The ClockInfo type has a set of methods to return internal values. We calculate the elapsed time by subtracting two TimeSpan types that sampled different points in time from the same clock. Internally, std::time does something like this using the Instant type.

Twizzler exposes a new set of system calls related to timekeeping. Other than sys_read_clock_info, which is helpful in reading a clock and learning about its properties, users need support to discover available clocks.


#![allow(unused)]
fn main() {
fn sys_read_clock_info(source: ClockSource, flags: ReadClockFlags) -> Result<ClockInfo, ReadClockError>;
fn sys_read_clock_list(clock: ClockGroup, flags: ReadClockFlags) -> Result<VecDeque<Clock>, ReadClockError>;
}

Should a user need detailed information about clocks exposed by the kernel to user space, they could use sys_read_clock_list. By default, it returns a list of clocks for every type of clock exposed (ClockGroup). All information in the ClockInfo except the current value is also returned. For clocks with more than one clock source, the first one is returned. Users can get a list of all clocks, and thus all clock sources, for a particular type by specifying the ClockGroup and setting the appropriate flag.

If a user wants to read an arbitrary clock's value, they could specify the ClockID given to them by sys_read_clock_list.


#![allow(unused)]
fn main() {
// reads all clocks as candidates for monotonic
let clocks = sys_read_clock_list(ClockGroup::Montonic, ReadClockFlags::ClockGroup).expect("error message");

// reference to the last clock in list
// clock type has id of backing clock source
let clk = clocks.last().unwrap();

// reading from some arbitrary clock source
let now = sys_read_clock_info(ClockSource::ID(clk.id()), ReadClockFlags::empty());

// ClockInfo.value() returns a TimeSpan type
println!("Current value of clock: {} nanoseconds.", now.value().as_nanos());
}

The last thing a user might want to do is steer the clock to prevent drift. This is useful for systems that need precise values from an accurate clock. Real-world applications such as PTP/NTP need an interface like this. A full detailed explanation and implementation are out of scope for this RFC and will be explored in another RFC.

Clock Interface

The last way of accessing a clock is by using Twizzler's native library for clocks. Each Clock has a set of operations that map directly to the system call interface exposed to time.


#![allow(unused)]
fn main() {
fn read() -> TimeSpan {}
fn info() -> ClockInfo {}
}

Our running example would look something like this:

fn main() {
  // gets a reference to the monotonic clock
  let clock = Clock::get(ClockGroup::Monotonic);

  // read a moment in time
  // returns TimeSpan 
  let now = clock.read();

  // elapsed time since now
  let later = clock.read();
  // ClockInfo.value() returns a TimeSpan type
  let elapsed_time = later - now;
  println!("Running main() took {} nanoseconds.", elapsed_time.as_nanos());
}

The benefit of doing things this way is that users interact with time at a much higher level than the system call interface, and they are given useful clock metadata. If all a user cares about is the passage of time, then Instant should suffice. However, if they are curious about the precision or accuracy of a clock, then this interface is one way of doing so. It is much cleaner.

Reference-level explanation

Twizzler needs an abstraction over hardware used for timekeeping to support the interfaces exposed to users. The purpose of the abstraction in Twizzler is so that the kernel can support different types of hardware. Time can come from many sources: some counter or programmable timer on the processor or board. Another thing the kernel needs is standard interfaces to manage timekeeping hardware.

ClockHardware and Ticks

The kernel achieves both of these through the ClockHardware trait defined as follows:


#![allow(unused)]
fn main() {
trait ClockHardware {
    fn read(&self) -> Ticks;
    fn info(&self) -> ClockInfo;
    // start, isEnabled, callback, etc.
}
}

Rather than having a concrete type, we provide an interface implemented by different architectures for different time sources. ClockHardware exposes methods to read time or get a description of the hardware backing it. This leaves more room to introduce other useful methods, such as disabling/enabling a hardware timer.

The necessary interfaces will be clear as we integrate this design with the existing kernel code. For example, the kernel currently has a stat clock which is programmed through a number of somewhat ad-hoc APIs. The idea would be to use this to manage the hardware backing the stat-clock in a hardware-agnostic way.

The purpose of read is to read a value provided by hardware, which requires some assembly, and returns Ticks meant to represent raw time:


#![allow(unused)]
fn main() {
struct Ticks {
    value: u64,
    rate: FemtoSeconds
}
}

Ticks represent some duration on the clock. The width of value, which is 64-bits, is not fundamental and could change. The value can be scaled to some unit of time by multiplying the rate of the time source. This multiplication operation produces a TimeSpan.

If we were on an x86-64 machine for example, and we wanted to use the TSC as ClockHardware, read would return the value of the TSC. Internally read would call the rdtsc instruction and return the value reported by hardware as Ticks.


#![allow(unused)]
fn main() {
impl ClockHardware for TSC {
    fn read(&self) -> Ticks {
        let t = unsafe { x86::time::rdtsc() };
        Ticks { value:t , rate: self.resolution() }
    }
}
}

Ticks can be converted to a TimeSpan which is helpful to user space. info generates a ClockInfo, which describes the properties of the hardware timer. This is done in a time source specific way.

Integration

The kernel maintains a system-wide list of time sources (TICK_SOURCES) building on these abstractions. TICK_SOURCES is implemented as a vector or an array of ClockHardware. One can think of this as an array of methods determined at runtime, based on the system configuration.

TICK_SOURCES is generated when Twizzler starts up in kernel_main and calls clock::init();. The kernel enumerates all hardware time sources available, and chooses which ClockHardware to serve as the backend that supports a particular Clock exposed to user space (e.g. Monotonic).

The enumeration of hardware is machine/architecture-specific. Moreover, the materialization of clocks exposed to user space will require an algorithm that understands the requirements of the clock and the functionality provided by the hardware. This is planned to be explored in a future RFC. For now, we could use well-known clock sources for specific platforms.

Integrating this into the kernel would be done using a set of files:

<root>/src/kernel/src/
  arch/
    x86/
      tsc.rs // implements clock based on tsc
      processor.rs // find/save ref to clocks on processor
    aarch64/
    mod.rs // decides what to compile
  machine/
    pc/dummy.rs // some clock source on platform
    morello/
  clock.rs // probing hw clocks (hw/config specific)
  time.rs // abstracting hw clock sources
  main.rs // initialize clock subsystem

For each architecture subdirectory, we have a set of files implementing ClockHardware for specific hardware. Likewise, for timers on the motherboard, we have files that are board specific. The generic code lives in the main Twizzler source, which contains the ClockHardware trait and functions to initialize the time sub-system. At compile time, we decide what architecture to compile to and thus what time code we need to run. At run time, we discover hardware and choose the implementation as appropriate.

Circling back to our example from earlier, where a user program reads a monotonic clock, we could imagine that the time stamp counter (TSC) backs that clock on an x86 platform. A peek behind the curtain of Rust's std::time call to Instant::now() would look something like this:

use std::time::Instant;
fn main() {
  // reads a moment in time
  let now = Instant::now();
    // calls into os-implementation
    let ci = sys_read_clock_info(ClockGroup::Monotonic, 0, ReadClockFlags::empty());
    //======== jump to kernel space =========
      // os looks up the ClockHardware backing this clock
      USER_CLOCK[clock as usize].info();
      // time source specific generation of ClockInfo
      ClockHardware.info(self)
      // read the value given by hardware
      let t = self.read()
      // reading TSC (implementation)
      let tsc = unsafe { x86::time::rdtsc() };
      Ticks { value: tsc , rate: self.resolution() }
      // generate ClockInfo from value read
      ClockInfo::new(
       TimeSpan::from(t), // conversion to TimeSpec
       ClockGroup::Monotonic, 
      //  ...
      )
    //======== back to user space ========
    // read value from ClockInfo returned from sys_read_clock_info
    let now = ci.value()
    // Instant implemented as a TimeSpan
    return now;

  // elapsed time since now
  // this returns a Duration type
  let elapsed_time = now.elapsed();
  println!("Running main() took {} nanoseconds.", elapsed_time.as_nanos());
}

To illustrate this more clearly, we could imagine that the call stack up until the TSC is read would look something like this:

0: [kernel] x86::time::rdtsc()
1: [kernel] x86_64::tsc::ClockHardware::read()
2: [kernel] x86_64::tsc::ClockHardware::info()
3: [user]   twizzler_abi::syscall::sys_read_clock_info()
4: [user]   std::time::Instant::now()
5: [user]   main()

The actual calls in a real implementation would look different. We omit checks for values provided by users, and more efficient, possibly serialized reads of time.

Drawbacks

The biggest drawback might be the cost of abstraction. We plan to use Rust's dynamic dispatch, which involves a vtable call to the underlying interface. This layer of direction may or may not be expensive for time APIs. There might be a way around this and have the compiler statically compile all function calls, but it is unclear if possible.

Another source of overhead is that all interfaces with time require a system call. This can be optimized later for some things, such as reading a clock by exposing read-only memory to user space, similar to a vDSO

Using 64 bits for the implementation of Ticks may be relatively large for embedded. Likewise, for the standard library implementation of Rust's Duration 64 bits is a lot. For most instances of Twizzler, this should not be a problem.

Rationale and alternatives

This design offers flexibility and provides standard interfaces in the kernel. We can implement ClockHardware for any hardware. Having a standard interface also makes porting easier. Not doing things this way makes this part of the system harder to maintain.

We have considered other legacy designs and some of them have their downfalls.

Prior art

It is pretty standard in Rust to use traits to abstract hardware. Such is the case for many projects in the Embedded Rust community.

As far as time is concerned, the Embedded Time crate and Tock Time HIL are related. Both abstract over hardware used as time sources, have tick and frequency abstractions, and Tock provides a notion of a clock. However, they are not general purpose.

They do not provide interfaces for programming hardware used for timekeeping. We may use aspects of their design in our work, but we cannot directly use these crates as presented. Another downfall is that they do not describe the time source, which is necessary if applications want to consider the accuracy of their time measurements. Our goals are to provide something general purpose and high level.

Linux provides system calls for accessing monotonic and real-time clocks with clock_gettime. The way we read time is similar, except that we provide more useful metadata to users with ClockInfo instead of timespec. Other researchers in the community have noted the importance of time as a tool and the shortcomings of timespec. Having additional information available, such as accuracy or clock error, is critical to science. Not knowing the properties of clocks can lead to errors in experiments and hurt reproducibility [1].

Resources On Time

Other than the paper mentioned above on the importance of time as a tool, George Neville-Neil gives a good overview of the importance of time and why we need an interface to adjust time [2]. Not only that, but synchronized clocks are important in many distributed systems. The FAQ page on NTP is also a good resource.

[1] Najafi, Ali, Amy Tai, and Michael Wei. "Systems research is running out of time." Proceedings of the Workshop on Hot Topics in Operating Systems. 2021. https://dl.acm.org/doi/pdf/10.1145/3458336.3465293.

[2] George Neville-Neil. 2015. Time is an Illusion., ACM Queue 13, 9 (November-December 2015), 57–72. https://doi.org/10.1145/2857274.2878574

[3] Ulrich Windl, et al. 2006. The NTP FAQ and HOWTO: Understanding and using the Network Time Protocol. https://www.ntp.org/ntpfaq

Unresolved questions

Resolved Through RFC

Currently, there is no guide on how to implement system calls. It is unclear where specific abstractions for user space should go, such as ClockInfo. However, after looking at the changes to the twizzler-abi crate, it makes sense that the user space clock abstractions belong there.

Resolved Through Implementation

We expect that the necessary interfaces for managing time will be revealed as we implement ClockHardware for different hardware timers. When integrating this with existing code, such as the stat clock, we may add more methods to ClockHardware.
We may want an additional ClockHardwareInfo that describes not the logical clock but the hardware clock source. It could answer questions such as "is this monotonic?" It does not seem necessary at this point and could probably be integrated somewhere else.
We also don't know the overall performance of the Clock APIs over calls through std::time.

Related Issues

These are issues out of scope for this RFC, and could be addressed later, possibly in a future RFC.

This RFC does not introduce a system to allow users to set timers/alarms. This feature could be useful for sleeping or setting a callback but is out of scope.
We need an algorithm for adjusting sources of time. Standard protocols for this exist, such as NTP/PTP. Support for this requires an interface for adjusting the time that makes sense. Related to this is a definition for accuracy of a clock which is meaningful to the programmer.
Selecting the appropriate source of time for a particular clock use case. Some hardware timers are unfit for one reason or another. Maybe the resolution is low, the cost of reading the timer is high, or the timer measurements are unstable.
This design does not consider heterogeneous hardware or even differences in hardware among homogenous processors. This may or may not be an issue.

Future possibilities

These APIs and abstractions are necessary to support user space applications. This feature is marked on the Twizzler roadmap as a milestone.

Some additional features or optimizations are listed below that might be explored in the future if deemed necessary.

ClockGroup expanded to expose more logical clock types
A system call to register hardware as a new clock source
By design, we do not implicitly serialize operations that read timers. We could provide some flag that makes calls using arch-specific instruction barriers.
Faster reads of clocks can be achieved through read-only shared memory with user space.
We might want to be able to dynamically add a hardware source if, say, a CPU suddenly came online.
We may explore designs that encapsulate the time sub-system within a microkernel-style user space service

Twizzler RFCs