Mutation - part 3: Aliased Mutability
Interior Mutability
Aliased data
We say that data is aliased / shared when it may be accessed from multiple variables or places:
a
static
variable can be accessed from anywhere1 at anytime, and is thus always considered shared.a variable being borrowed with a
& _
reference (hence my calling these references&shared _
).a variable owned by a smart pointer offering Shared Ownership such as
Rc
orArc
.
Aliasing can be necessary, for instance, when sharing data across threads.
Imagine, for instance, that you have a big String
that you want to read
from across two places (in this case, two different threads), and you don’t
want to needlessly copy it. The solution? You borrow it twice by creating
two &shared String
references out of it, and then proceed to move
each
one to each thread.
- Note: Instead of
&'_ String
,Arc<str>
would also work, and it would be'static
, thus working even with::std::thread::spawn()
.
1use ::std::{thread, time::Duration};
2use ::crossbeam::thread::scope;
3
4fn main ()
5{
6 let big_string: String = "SPAM".repeat(1_000);
7 let ref_1: &/*shared*/ String = &big_string;
8 let ref_2: &/*shared*/ String = &big_string;
9
10 scope(move /* ref_1 and ref_2 */ |scope| {
11 // Thread1
12 scope.spawn(move /* ref_1 */ |_| {
13 read_string(ref_1);
14 });
15 // Thread2
16 scope.spawn(move /* ref_2 */ |_| {
17 read_string(ref_2);
18 });
19 }) // automagically joins all the scoped threads
20 .expect("Some thread panicked");
21}
22
23fn read_string (s: &/* shared */ String)
24{
25 let thread_id = thread::current().id();
26 println!(
27 "[{:?}] Reading s = {:?}",
28 thread_id,
29 *s, // read the string data
30 );
31}
32
As you can see, Rust does not complain, even though, at the highlighted lines, big_string
is borrowed twice in an overlapping (aliasing) manner: for their virtue of being shared, shared references can overlap and be copied (&T : Copy
).
This is the closest thing to most non-Rust programmers’ intuition about pointers: as long as the pointee exists and does not move, we can have as many pointers to the pointee as we wish.
What has mutation to do with all this?
Well, in most programming languages, we are allowed to mutate the pointee even when there are multiple pointers pointing to the same pointee! (the only thing forbidden being to simultaneously read and write a value: a data race).
Take, for instance, the following C code:
#include <stdio.h>
// Perfectly valid C code.
int main (void)
{
int x = 42;
int * p1 = &x;
int * p2 = &x;
*p2 = *p2 + 27;
printf("x = %d\n", *p1);
return 0;
}
As you can see, in a language such as C, it is perfectly valid to:
have multiple mutable references aliasing the same thing,
use one of the references (here,
p2
) to mutate the pointee,and use the other reference (
p1
).
This example will suffice to prove that:
&mut _
does not correspond to what most programmers mean by “mutable reference”
Indeed, if it were the case, then we should be able to rewrite the C code (which –let me remind you– is correct and valid) in Rust in the following fashion:
#[allow(non_camel_case_types)]
type int = i32;
fn main ()
{
let mut x: int = 42;
let p1: &mut int = &mut x;
let p2: &mut int = &mut x;
*p2 = *p2 + 27;
println!("x = {}", *p1);
}
And yet such code fails to compile, because, as said in the previous part,
&mut
is the type of unique references, and since p1
and p2
overlap,
claiming that they are the unique holder of the address of x
is a lie
that Rust spots easily.
- Obviously, changing
p1
to be a shared reference does not suffice to soothe Rust, sincep2
is still claiming to be unique.
Can we rewrite it in Rust?
So, since C code can always be rewritten with (unsafe
) Rust,
we may then think that expressing “p1
is not being used while x
is
mutated through p2
” needs raw pointers:
#[allow(non_camel_case_types)]
type int = i32;
fn main ()
{
let mut x: int = 42;
let p1: *mut int = &mut x;
let p2: &mut int = &mut x;
*p2 = *p2 + 27;
println!("x = {}", unsafe { *p1 });
}
The above program compiles without any warning whatsoever, so everything is fine with this Brave New Idiom. Hooray!
This definitely deserves a good cup of coffee.
Let’s take our sweet time to drink it.
- (the skeptical reader may run the code with Miri to be convinced)
Yep. I said that the unicity property was a false claim and a lie,
and instead of fixing it, we just decided to go and lie in a more insidious
way, abusing Rust’s unsafe
trust.
To see why this is UB, there is something that even most Rust programmers are not aware of:
Between the time a reference / pointer is created and the last time it is used, there is no difference, aliasing-wise, between a borrowing reference and a raw pointer2.
The only time where “downgrading” a Rust reference to a raw pointer can be useful is when dealing with lifetimes / more complex memory properties:
fn main () { let p1: Box<i32> = Box::new(42); let at_ft: &i32 = &*p1; let p2 = p1; // move the pointer but not the pointee println!("{}", *at_ft); }
The above code fails to compile even though
at_ft
is still a valid pointer. But Rust analysis conservatively assumes that whenever a variable is moved, all the values borrowed through it may also have moved, thus invalidating all the borrows. This prevents us from accidentally using&p1: &Box<i32>
afterp1
is moved, but it also prevents us from usingat_ft
, since Rust cannot tell the difference (both&p1
andat_ft = &*p1
borrow fromp1
).This is a case where
unsafe
and raw pointers are legitimate:#![feature(box_into_raw_non_null)] use ::std::ptr; fn main () { let p1: Box<i32> = Box::new(42); // a boxed element cannot be aliased, // so let's use NonNull for less strict semantics. let p1: ptr::NonNull<i32> = Box::into_raw_non_null(p1) ; let at_ft: *const i32 = p1.as_ptr(); let p2 = p1; // move the pointer but not the pointee unsafe { println!("{}", *at_ft); let _: Box<i32> = Box::from_raw(p2.as_ptr()); } }
So, back to the problem of expressing aliased mutation in Rust,
int x = 42; int * p1 = &x; int * p2 = &x; *p2 = *p2 + 27; printf("x = %d\n", *p1);
what is the answer?
For that, we need to think about our situation:
we need to mutate something through a reference that
may beis aliased, and we know this is valid because for the duration of the mutation (throughp2
), we know that the other pointer / reference,p1
, is not being dereferenced, so no data race is possible.
This is exactly what UnsafeCell
is for!
use ::core::cell::UnsafeCell;
#[allow(non_camel_case_types)]
type int = i32;
fn main ()
{
let x: UnsafeCell<int> = UnsafeCell::new(42);
let p1: &UnsafeCell<int> = &x;
let p2: &UnsafeCell<int> = &x;
unsafe {
*p2.get() = *p2.get() + 27;
println!("x = {}", *p1.get());
}
}
To better understand this code, let’s rename UnsafeCell
to Mut
,
and explicit when the dereference (*_.get()
) is used to read and write:
- (with
UnsafeCell
being named Mut, it will be easier to get the mental model of&UnsafeCell<T>
being lile&Mut<T>
)
#[allow(non_camel_case_types)]
type int = i32;
fn main ()
{
use lib::Mut;
let x: Mut<int> = Mut::new(42);
let p1: &Mut<int> = &x;
let p2: &Mut<int> = &x;
unsafe {
p2.set(p2.get() + 27);
println!("x = {}", p1.get());
}
}
/// Mut implementation
mod lib {
use ::core::{cell::UnsafeCell, ptr};
pub
struct Mut<T> /* = */ (
UnsafeCell<T>,
);
impl<T> Mut<T> {
pub
fn new (value: T) -> Self
{
Self(UnsafeCell::new(value))
}
pub
unsafe
fn get (self: &'_ Self) -> T
{
ptr::read(self.0.get())
}
/// Look, mutation through a shared reference
pub
unsafe
fn set (self: &'_ Self, value: T)
{
ptr::write(self.0.get(), value);
}
}
}
Cell: when UnsafeCell
is not unsafe
It turns out that the .get()
and set()
methods above can soundly be marked
non-unsafe
, provided some additional guarantee:
For it to be sound,
.get()
and.set()
cannot be called in parallel (else we would have a data race).But
.get()
and.set()
being called in parallel requires there being shared references to the same cell from multiple threads.And since
UnsafeCell<T> : !Sync
, this is not possible.
.get()
may cause a double free ifT
has drop glue.- This can be fixed by adding a
T : Copy
bound, since then there cannot be such drop glue.
- This can be fixed by adding a
TL,DR: given a wrapper around UnsafeCell
(thus guaranteed not to be Sync
),
the .get()
and .set()
methods are sound
(and thus do not require to be marked unsafe
),
provided there is a T : Copy
bound on .get()
.
That’s exactly what
Cell
is: a thread-local wrapper forCopy
types such as integers (or shared references), that provides non-unsafe
Aliased Mutability and thus let us write classical C-programming patterns.- The
T : Copy
is technically not needed (memory-safety-wise) on the.set()
method, but it would then be easy to misuseCell
and cause memory leaks with it (a memory leak, although annoying, is memory safe).
- The
We can now write down a non-unsafe
solution to our problem.
Solution
#![forbid(unsafe_code)] // no more unsafe, we promise
use ::core::cell::Cell;
#[allow(non_camel_case_types)]
type int = i32;
fn main ()
{
let x: Cell<int> = Cell::new(42);
let p1: &Cell<int> = &x;
let p2: &Cell<int> = &x;
p2.set(p2.get() + 27);
println!("x = {}", p1.get());
}
As you can see, UnsafeCell
is a mutability wrapper,
providing the raw (and thus unsafe
) semantics of C
:
mutation through a shared reference. In other words,
Interior Mutability: mutation through a shared reference
- Also called –at least by me– Aliased Mutability
Remember, the signature of UnsafeCell::get
is:
&UnsafeCell<T> -> *mut T
which means that an UnsafeCell<_>
wraps data and provides a (*mut
) handle to its interior, with which we can mutate the wrapped data: the interior of the wrapper is mutable, even when the reference to the UnsafeCell<_>
is not guaranteed to be unique.
The fact is that – whether with Cell<int>
(which is a (safe) wrapper for UnsafeCell
)
or UnsafeCell
directly – we have:
mutable aliased reference to a
T
==&UnsafeCell<T>
And more generally, any wrapper that may allow mutation through a non-unique
reference will wrap the part that can be mutated in an UnsafeCell
.
In other words: there cannot be interior mutability / aliased mutability without
UnsafeCell
.
Mutable fields
So, for those wondering how to express in Rust that a particular field in a struct is mutable, the answer is:
from within a (guaranteed) unique reference, everything is mutable.
from within a (potentially) shared reference, the field needs to be wrapped within a type offering Aliased Mutability.
Example: Rc
Rc
is a smart pointer offering shared ownership.
That is, even if you own a Rc<T>
, you may not be the only one having access
to the pointee T
. Hence:
dropping a
Rc<T>
may not drop the pointee (it will only do so when it knows, with a runtime check, that it is indeed the last and thus unique owner of the pointee);Rc<T> : Deref<Target = T>
, butRc<T>
does not offer a zero-costDerefMut
API (&'_ mut Rc<T> -> &'_ mut T
): how could it possibly offer such a function, when having a unique handle on oneRc<T>
does not guarantee that you have a unique handle on the pointee?
The runtime check and general functionality of Rc
is described by its name:
reference counting.
That is, a3 counter is appended to the pointee (in heap memory), initially
set to 1
by Rc::new
.
Then, each time Rc::clone()
is called, that counter is incremented;
and every time a Rc
is drop
ped, that counter is decremented.
Unicity can thus be known at runtime by reading that counter.
But incrementing a counter means mutating it, and
Clone::clone
signature only requires a shared reference on its input!
Yes, it is time for Aliased Mutability to shine:
use ::std::{
cell::Cell,
mem::drop,
ops::Deref,
};
/// This is the layout of the value in the heap:
struct RefCounted<T> {
/// The value itself
value: T,
/// with a counter next to it
counter: Cell<usize>, // "mutable" thanks to the `Cell`
}
/// A Rc<T> is a pointer to a heap-allocated value and a counter.
pub
struct BasicRc<T> {
ptr: *const RefCounted<T>,
}
impl<T> BasicRc<T> {
pub
fn new (value: T) -> Self
{
// append a new counter to the value
let value_with_counter = RefCounted {
value,
counter: Cell::new(1),
};
// let's move it into to heap:
let boxed_value_with_counter: Box<RefCounted<T>> =
Box::new(value_with_counter)
;
// since we will be the ones managing the memory,
// let's extract the underlying raw pointer:
let raw_ptr = Box::into_raw(boxed_value_with_counter);
// and return
Self {
ptr: raw_ptr,
}
}
// private
fn at_counter (self: &'_ Self) -> &'_ Cell<usize>
{
unsafe {
&(*self.ptr).counter
}
}
}
impl<T> Deref for BasicRc<T> {
type Target = T;
/// While `Self` is borrowed it is sound to borrow the value,
/// but only in a shared fashion.
fn deref (self: &'_ Self) -> &'_ T
{
unsafe {
&(*self.ptr).value
}
}
}
impl<T> Clone for BasicRc<T> {
fn clone (self: &'_ Self) -> Self
{
let at_counter: &Cell<usize> = self.at_counter();
// increment counter
at_counter.set(
at_counter
.get()
.checked_add(1)
.expect("Counter integer overflow")
);
// having incremented the counter, we can copy the pointer
Self {
ptr: self.ptr,
}
}
}
impl<T> Drop for BasicRc<T> {
fn drop (self: &'_ mut Self)
{
let at_counter: &Cell<usize> = self.at_counter();
// decrement counter (no need to check for underflow)
at_counter.set(at_counter.get() - 1);
// the counter is 0 if and only if we are the last owner
if at_counter.get() == 0 {
// since we are the last owner,
// we are back to being just a Box:
let boxed_value_with_counter = unsafe {
Box::from_raw(self.ptr as *mut _)
};
// time to deallocate, by dropping the `Box`
drop::<Box<RefCounted<T>>>(
boxed_value_with_counter
);
}
}
}
Sync
This trait was quickly mentioned before when talking about why / how Cell
is
sound despite it allowing mutation through an aliased / shared reference.
Data Race
This happens when a place in memory is being read4 at the same time that it is being written to, which can lead to extraneous values being read and thus nasty bugs.
Basic Example (race condition)
use ::crossbeam::{
atomic::AtomicCell,
thread,
};
#[derive(Debug, Default)]
struct Twins {
thomson: AtomicCell<bool>,
thompson: AtomicCell<bool>,
}
impl Twins {
/// The struct invariant: `thomson == thompson`
fn assert_equal (self: &'_ Self)
{
if self.thomson.load() != self.thompson.load() {
eprintln!("Error, twins are not equal");
::std::process::abort(); // kill all threads
}
}
fn set_all (self: &'_ Self, value: bool)
{
self.thomson.store(value);
self.thompson.store(value);
}
}
fn main ()
{
let twins = Twins::default();
thread::scope(|scope| {
scope.spawn(|_| {
// Thread 1
loop {
twins.assert_equal();
}
});
scope.spawn(|_| {
// Thread 2
let mut next = true;
loop {
twins.set_all(next);
next = !next;
}
});
}).expect("Some thread panicked");
}
I won’t go into the details of the above code (which uses an external crate for
convenience5), but suffices to say that the above code, while being
memory-safe (it compiles with no unsafe
used), does suffer from a race
condition: even though the Twins
are only modified through a method that sets
both fields to the same value, their “all the fields have the same value”
invariant is temporarily broken in between.
Since the struct is Sync
(i.e., it can be shared across multiple
threads without possibly causing memory unsafety), I was able to share it
across two threads, resulting in one of them witnessing the temporary broken
state of the other:
- We have had a race condition, or “high-level data race”.
Data races and memory safety
In a similar fashion race condition also exist at the hardware level – called
data races: imagine a shared 32
-bit integer where only two values can be
written to it:
0_i32
(the null bit pattern:00...00
),and
-1_i32
(the “all ones” bit-pattern:11...11
),
Let’s see what can happen when these writes are not atomic, and happen instead,
16
bits at a time (for instance6). If, at the same time, another thread of
execution reads this shared integer, then it may end up reading a bit-pattern
with 16
zeroes followed by 16
ones, or viceversa; resulting in it reading
the value 65535_i32
or -65536_i32
.
The fact that the compiler does not guarantee any form of atomicity for standard reads and writes (for the sake of performance) makes this kind of situation –a data race– be Undefined Behavior:
This is what the Sync
trait (and the underlying Send
) is for:
a type is
Sync
if it can be (safely) shared across multiple threads.
It suffices for a type not to have Aliased Mutability to be
Sync
. Indeed, if it is sharedacross multiple–yadda yadda–, then it is aliased, and without Aliased Mutability, Rust forbids mutating it (precisely for this reason).Examples:
String
,Vec<u8>
,bool
, etc.use ::crossbeam::thread; fn main () { let mut x: i32 = 0; thread::scope(|scope| { scope.spawn(|_| { // Thread 1 let mut next = !0; loop { x = next; // &mut i32 next = !next; } }); scope.spawn(|_| { // Thread 2 if x != 0 && x != !0 { // & i32 eprintln!("UNDEFINED BEHAVIOR"); ::std::process::abort(); } }); }).expect("Some thread panicked"); }
If a type offers Aliased Mutability, then in order to be
Sync
it needs to ensure within its public API that the mutations are thread-safe:either because they are atomic (the case of
AtomicBool
,AtomicI32
, etc.),or because they use a synchronization mechanism (
RwLock
,Mutex
, …)
This distinction is what makes it safe to have non-thread-safe Interior
Mutability, since by virtue of not being !Sync
, Rust also forbids sharing such
values:
use ::std::cell::Cell;
use ::crossbeam::thread;
fn main ()
{
let x: Cell<i32> = Cell::new(0);
thread::scope(|scope| {
scope.spawn(|_| {
// Thread 1
let mut next = !0;
loop {
x.set(next); // &Cell<i32> mutation
next = !next;
}
});
scope.spawn(|_| {
// Thread 2
let x = x.get(); // &Cell<i32> read
if x != 0 && x != !0 {
eprintln!("UNDEFINED BEHAVIOR");
::std::process::abort();
}
});
}).expect("Some thread panicked");
}
Okay, now that we have seen how interior mutability allows to
mutate certain values despite their being aliased, and how this interacts with
multi-threaded memory safety, it is time for an exhaustive list of std
’s …
Safe abstractions offering Aliased Mutability
- (built on top of
UnsafeCell
)
!Sync | Sync | |
---|---|---|
T | RefCell<T> | RwLock<T> (and Mutex<T> ) |
T : Copy | Cell<T> | X |
integer | Cell<{integer}> | Atomic{Integer} |
pointer | Cell<*mut T> | AtomicPtr |
!Sync
wrappers (restricted to a single thread):
Cell<T>
lets the programmer mutate
Copy
types (mainly).This wrapper is zero-cost.
Examples:
Numbers:
Cell<bool>, Cell<u8>, Cell<i32>, Cell<u32>, Cell<usize>
, …Pointers:
Cell<&T>
,Cell<*mut T>
, …
RefCell<T>
lets the programmer obtain a
&mut T
reference out of a&RefCell<T>
. It’s basicallyUnsafeCell
but non-unsafe
thanks to there being runtime checks.runtime checks are based on
Cell
;misusage leads to runtime
panic!
s!!Examples:
RefCell<Vec<_>>, RefCell<String>
, …
Sync
wrappers (usable across multiple threads)
AtomicBool, AtomicU8, AtomicI32, AtomicU32, AtomicUsize, ...
ensure that both the reads and the writes are atomic operations, thus being memory safe in a multi-threaded context.
Their API requires explicit atomic orderings, which is a very tough topic on its own; when in doubt, always use
Ordering::SeqCst
- For a simpler and more flexible API, I must recommend using
::crossbeam::atomic::AtomicCell
- For a simpler and more flexible API, I must recommend using
RwLock<T>
(andMutex<T>
)lets the programmer obtain a
&mut T
reference out of a shared reference, by virtue of making any other parallel access wait for the borrow to end.misusage leads to deadlocks or runtime
panic!
s!!runtime checks are based on
atomic
s.Mutex
is a slightly more performant version ofRwLock
, when using only exclusive / write accesses.
Application: safe global mutable state
Easy, the language itself offers
static mut
!
Do. Not. Use. static mut
. Ever.
Indeed, mutating a static mut
has the (dis)honor of belonging to the room7
of Rust ill-designed patterns: doing it soundly is near impossible.
It will be deprecated in favor of an equivalent but safer construct8.
After all we have been through, it should now be easy to intuitively grasp why
static mut
is an UB bomb in the Rust world:
static
s are inherently aliased globally; It is as if there weremutating a
static mut
is done through a&mut StaticVar
unique reference.
The solution –you guessed it– is UnsafeCell
:
- it is okay to mutate through a
&mut StaticVar
(obtained from a&UnsafeCell<StaticVar>
) even if there are other&UnsafeCell<StaticVar>
around.
Mutable static
vars are, this way, less insane to use, since we can reuse
the safe UnsafeCell
-based constructs seen above –actually only the Sync
ones, since a static
is accessible from any thread:
either
Atomic{Integer}
s,or
RwLock<T>
.
In general, I’d still advise against using globale mutable state, since we humans9 are quite bad at non-local reasoning. However, in some simple cases, it can be better than the cumbersome pattern of expliciting the additional parameter everywhere.
One such example is a logger.
Concrete example: our own basic logger
#![forbid(unsafe_code)]
#[macro_use] pub
mod logger {
use ::std::{*,
fs::{
File,
},
sync::{
atomic::{
AtomicU8,
Ordering,
},
Mutex,
},
};
pub
enum LoggerTarget {
Disabled,
Stderr,
File(File),
}
#[derive(Clone, Copy)]
#[repr(u8)]
pub
enum LogLevel {
Error = 1,
Warning,
Info,
Debug,
Trace,
}
pub
static LOG_LEVEL: AtomicU8 = AtomicU8::new(0);
::lazy_static::lazy_static! {
pub
static ref LOGGER: Mutex<LoggerTarget> =
Mutex::new(LoggerTarget::Disabled)
;
}
pub
fn set_level (log_level: LogLevel)
{
LOG_LEVEL.store(log_level as u8, Ordering::SeqCst);
}
pub
fn disable ()
{
match LOGGER.lock() {
| Ok(mut logger) => {
LOG_LEVEL.store(0, Ordering::SeqCst);
*logger = LoggerTarget::Disabled;
},
| Err(_) => {
panic!("Logger was poisoned");
}
}
}
pub
fn target_stderr ()
{
*LOGGER.lock().expect("Logger was poisoned") =
LoggerTarget::Stderr
;
}
pub
fn target_file (
filename: impl AsRef<path::Path>,
) -> io::Result<()>
{
let file =
fs::OpenOptions::new()
.write(true)
.append(true)
.create(true)
.open(filename)?
;
let logger_target = LoggerTarget::File(file);
*LOGGER.lock().expect("Logger was poisoned") =
logger_target
;
Ok(())
}
macro_rules! log {(
$level:expr, $($message:tt)*
) => (if {
$crate::logger::LOG_LEVEL.load(
::std::sync::atomic::Ordering::SeqCst
) >= ($level as u8)
} {
use ::std::io::Write;
use $crate::logger::{
LoggerTarget,
LOGGER,
};
let mut logger =
LOGGER
.lock()
.expect("Logger was poisoned")
;
match &mut *logger as &mut LoggerTarget {
| &mut LoggerTarget::Disabled => {},
| &mut LoggerTarget::Stderr => {
eprintln!($($message)*);
},
| &mut LoggerTarget::File(ref mut file) => {
writeln!(file, $($message)*)
.unwrap_or_else(|err| {
*logger = LoggerTarget::Stderr;
eprintln!(
"Writing to log file failed: {}",
err,
);
});
},
}
})}
}
Remarks
This is clearly not the most efficient implementation there can be, but it gets the job done, featuring both:
a
u8
global, used for theLOG_LEVEL
(based on an atomic),a more complex struct (
LoggerTarget
), requiring something like aMutex
.- Since
Mutex::new()
is not aconst fn
yet, it cannot be used to initialise astatic
; hence the most idiomatic way around this restriction is to use the::lazy_static::lazy_static!
macro to have a static with delayed initialisation, which lets thus initialise astatic
with any kind of code.
- Since
thread_local!(static)
Remember when I said that we needed to use Sync
-hronised Aliased Mutability
for a global variable? It turns out that std
offers another “global”
construct:
A static declaration wrapped in that macro will create a “thread-local static
“:
The “global” variable is then not global to the process but to each thread.
In other words, each thread has its own personal memory to back the “global” variable, meaning that there is not one global variable but
n
“global” variables, wheren
is the number of threads.This design, in turn, means that a
thread_local!
cannot suffer from data races, and thus does not requireSync
-hronisation!That’s why types such as
Cell
andRefCell
can actually be used with it.The internals of the macro use (delayed) initialisation shenanigans similar to
lazy_static!
’s, meaning that you can use nonconst
initialisation10Accessing a
thread_local!
is a little bit more cumbersome than a classicstatic
, since it requires Continuation Passing Style:thread_local! { static FOO: i32 = 42; } FOO.with(|foo: &i32| { // foo is a temporary borrow to the thread local contents assert_eq!(*foo, 42); })
Example: thread_local!
logger
#![forbid(unsafe_code)]
#[macro_use] pub
mod logger {
use ::std::{*,
cell::{
Cell,
RefCell,
},
fs::{
File,
},
};
pub
enum LoggerTarget {
Disabled,
Stderr,
File(File),
}
#[derive(Clone, Copy)]
#[repr(u8)]
pub
enum LogLevel {
Error = 1,
Warning,
Info,
Debug,
Trace,
}
thread_local! {
pub
static LOG_LEVEL: Cell<u8> = Cell::new(0);
pub
static LOGGER: RefCell<LoggerTarget> =
RefCell::new(LoggerTarget::Disabled)
;
}
pub
fn set_level (new_log_level: LogLevel)
{
LOG_LEVEL.with(move |log_level| {
log_level.set(new_log_level as u8);
});
}
pub
fn disable ()
{
LOG_LEVEL.with(|log_level| {
log_level.set(0);
});
LOGGER.with(|logger| {
*logger.borrow_mut() = LoggerTarget::Disabled;
});
}
pub
fn target_stderr ()
{
LOGGER.with(|logger| {
*logger.borrow_mut() = LoggerTarget::Stderr;
});
}
pub
fn target_file (
filename: impl AsRef<path::Path>,
) -> io::Result<()>
{
LOGGER.with(move |logger| {
let file =
fs::OpenOptions::new()
.write(true)
.append(true)
.create(true)
.open(filename)?
;
let logger_target = LoggerTarget::File(file);
*logger.borrow_mut() = logger_target;
Ok(())
})
}
macro_rules! log {(
$level:expr, $($message:tt)*
) => (if {
$crate::logger::LOG_LEVEL.with(
::core::cell::Cell::get
) >= ($level as u8)
} {
use ::std::io::Write;
use $crate::logger::{
LoggerTarget,
LOGGER,
};
LOGGER.with(|logger| {
let mut logger = logger.borrow_mut();
match &mut *logger as &mut LoggerTarget {
| &mut LoggerTarget::Disabled => {},
| &mut LoggerTarget::Stderr => {
eprintln!($($message)*);
},
| &mut LoggerTarget::File(ref mut file) => {
writeln!(file, $($message)*)
.unwrap_or_else(|err| {
*logger = LoggerTarget::Stderr;
eprintln!(
"Writing to log file failed ({})",
err,
);
});
},
}
})
})}
}
Conclusion
Well, that was a hell of a journey! I never expected to talk that much about
mutation; at the beginning I just had that itch about people calling &mut _
a
mutable reference and then seeing a lot of unsafe
code that was unsound
because of that misconception.
So, I hope you found this long rant interesting, and that it made you, if
possible, more aware of the difficulty of writing actually sound unsafe
code
in Rust, while also appreciating the elegance of having delegated the
cornerstone of most languages, aliased mutation, to a special construct in
Rust: UnsafeCell
-based struct
s. You know understand the beauty of &mut
.
TL,DR
mut
is used:either as a binding qualifier to make it “unique-aware” so that one can get a unique reference to it (e.g., to mutate),
or in the type of a reference, to express the fact that it such reference is guaranteed to be unique;
Thus
&mut T
does not mean “mutable reference to aT
”, but something stronger (and thus wildly more dangerous inunsafe
code).The closest to “mutable reference” is either the raw mutable pointer
*mut T
, or, when talking as a reference, the aliasable&UnsafeCell<T>
This is called Interior Mutability (or Aliased Mutability), and can be used, for instance, to make some struct fields mutable even in an aliased context (c.f.,
Rc
).It is possible thanks to the magically sound API of
UnsafeCell
:&UnsafeCell<T> -> *mut T
When
T
isCopy
and without multi-threading, this unchecked construct can be used soundly, which leads to theCell
type, a zero-cost wrapper forC
-like mutation.For more complex types runtime checks are needed, which gives
RefCell
This can be used to have mutable
thread_local!(static)
s.
By using atomic types which are data-race-free, this aliased mutation can take place from multiple threads in parallel (
Sync
trait);Based on these atomic primitives, a runtime-checked
Sync
wrapper can be made (RwLock
orMutex
), for sound multi-threaded mutation on even the most complex types.This can be used to have mutable
static
s / globals.
- except for the fact that a
static
can be limited to private scopes, such as when defined within a function’s body, or when defined within amod
with a non-pub
path leading to it. [return] - See Stacked Borrows [return]
- Actually two counters, to work with
Weak
references. I ignore them in my post, but for those interested, they are invited to readRc
’s documentation. [return] - or written to [return]
- Contrary to most other languages, Rust does not try to feature a rich standard library; on the contrary, to offer a stable and always backwards-compatible language while also allowing continuous improvements, an ecosystem of ever-lasting external crates with semantic versioning is encouraged: crates.io. Many of these crates are even maintained by the core team, ensuring the existence of well known and top quality crates, such as
::crossbeam
. [return] - ARM [return]
- For those wondering, generic
mem::zeroed
(and thusmem:uninitialized
too) also belong to that room of shame. [return] - It relies on using a classic
static
withUnsafeCell
instead ofstatic mut
, minor aSync
consideration. [return] - [return]
- Although
Cell::new
andRefCell::new
are bothconst fn
[return]