Object Safety
A trait object in Rust[0] can only be constructed out of traits that satisfy certain restrictions, which are collectively called “object safety”. Object safety can appear to be a needless restriction at first, so I’ll try to give a deeper understanding of why it exists and of the related compiler behaviour.
This is the second (and a half) in a short series of articles on trait objects. The first one, Peeking inside Trait Objects, set the scene by looking into the low-level implementation details of trait objects, and the first-and-a-half-th, an interlude about Sized, looked at the special Sized trait. I strongly recommend at least glancing over them to be familiar with trait objects, vtables and Sized, since this post builds on those concepts.
Motivation
The notion of object safety was introduced in RFC 255, with the motivation that one should be able to use the dynamic trait object type Foo (as a type) in more places where a “static” Foo (as a trait) generic is expected. In a sense, it brings the two uses of traits, static dispatch and dynamic dispatch, closer together, reducing special handling in the language.
The high-level behaviour/restriction imposed by that RFC is: a trait object (&Foo, &mut Foo, etc.) can only be made out of a trait Foo if Foo is object safe. This section will focus on borrowed & trait objects, but what is said applies to any kind of trait object.
Let’s look at an example of the things object safety enables: if we have a trait Foo and a function like

fn func<T: Foo + ?Sized>(x: &T) { ... }

it would be nice to be able to call it like func(object) where object: &Foo; that is, take T to be the dynamically sized type Foo. As you might guess from the context, it is not possible to do this without some notion of object safety: the arbitrary piece of code ... can do bad (uncontrolled) things.
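As a rough sketch of the call described above, here is a version written in current Rust syntax, where the trait object type is spelled dyn Foo rather than the bare Foo used in this post. The method name, the Unit type and the two call_* helpers are hypothetical, chosen only for illustration.

```rust
// Hypothetical trait with a single object-safe method.
trait Foo {
    fn name(&self) -> String;
}

struct Unit;

impl Foo for Unit {
    fn name(&self) -> String {
        "unit".to_string()
    }
}

// The `?Sized` bound lets `T` be the dynamically sized type `dyn Foo`.
fn func<T: Foo + ?Sized>(x: &T) -> String {
    x.name()
}

// Static dispatch: `T = Unit`.
fn call_static() -> String {
    func(&Unit)
}

// Dynamic dispatch: `T = dyn Foo`, so `x.name()` goes through the vtable.
fn call_dynamic() -> String {
    let object: &dyn Foo = &Unit;
    func(object)
}
```

Both helpers run the same body of func; only the choice of T differs.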
Take it on faith (for a few paragraphs) that calling a generic method is one example of something that can’t be done on a trait object. So, let’s define a trait and a function like:
trait Bad {
    fn generic_method<A>(&self, value: A);
}

fn func<T: Bad + ?Sized>(x: &T) {
    x.generic_method("foo"); // A = &str
    x.generic_method(1_u8); // A = u8
}
The function func can’t be called like func(obj) where obj is a trait object &Bad because the generic method calls are illegal. There are a few possible approaches here:
1. have signatures like <T: Foo + ?Sized>(x: &T) not work with T = Foo by default, for any trait Foo,
2. check the body of the function to see if it is legal to have T = Bad when we ask for that, or
3. ensure that we can never pass a &Bad into func.
Approach 1 is what existed before object safety, and is what object safety was designed to solve. Approach 2 violates Rust’s goal of needing to know only the signatures of any function/method called to type-check a program. That is, if one satisfies the signature one can call it: unlike C++, there’s no need to type-check the internal code of each actual instantiation of a generic, because the signatures guarantee that the internals will be legal.
Approach 3 is the one that Rust takes via object safety, by ensuring that it is impossible to ever encounter a scenario in which a function with signature fn func<T: Foo + ?Sized>(x: &T) that does bad things could have T == Foo. That is, make it so that the only way that a &Foo can be created is if there’s no way that func can misbehave.
Object safety and those sorts of function signatures apply particularly to UFCS (uniform function call syntax), which allows one to call methods as normal generic functions scoped under the type/trait in which they are defined. For example, the UFCS function Bad::generic_method from the trait above effectively has the signature:

fn Bad::generic_method<Self: Bad + ?Sized, A>(self: &Self, x: A)
If fn method(&self) comes from a trait Foo, x.method() can always be rewritten to Foo::method(x) (modulo auto-deref and auto-ref, which possibly add an & and/or some number of *s). However, without object safety, it may not be possible to write trait_object.method() as Foo::method(trait_object). Object safety guarantees this transformation is always valid, making UFCS and method calls essentially equivalent, by outlawing creating a trait object in situations where it would be invalid.
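To make the equivalence concrete, here is a small sketch in current Rust syntax (dyn Foo instead of bare Foo). The Seven type and the two helper functions are hypothetical names used only for this illustration.

```rust
trait Foo {
    fn method(&self) -> u32;
}

struct Seven;

impl Foo for Seven {
    fn method(&self) -> u32 {
        7
    }
}

// Both spellings of the call on a statically known type.
// The UFCS form needs the explicit `&` that auto-ref adds for `x.method()`.
fn on_concrete() -> (u32, u32) {
    let x = Seven;
    (x.method(), Foo::method(&x))
}

// The same two spellings on a trait object; here `Self = dyn Foo`.
fn on_trait_object() -> (u32, u32) {
    let trait_object: &dyn Foo = &Seven;
    (trait_object.method(), Foo::method(trait_object))
}
```

In both helpers the method-call form and the UFCS form resolve to the same function, which is the transformation object safety keeps valid.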
How it works
After RFC 546 and PR 20341, making trait objects automatically work with those sorts of generic functions is achieved by effectively having the compiler implicitly create an implementation of Foo (as a trait) for Foo (as a type). Each method of the trait is implemented to call into the corresponding method in the vtable. In the explicit notation of my previous post, the situation might look something like:
trait Foo {
    fn method1(&self);
    fn method2(&mut self, x: i32, y: String) -> usize;
}

// autogenerated impl
impl<'a> Foo for Foo+'a {
    fn method1(&self) {
        // `self` is an `&Foo` trait object.
        // load the right function pointer and call it with the opaque data pointer
        (self.vtable.method1)(self.data)
    }

    fn method2(&mut self, x: i32, y: String) -> usize {
        // `self` is an `&mut Foo` trait object
        // as above, passing along the other arguments
        (self.vtable.method2)(self.data, x, y)
    }
}
To be clear: the .vtable and .data notation doesn’t work directly on trait objects, so that code has no hope of compiling; I am just being explicit about the actual behaviour.
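For readers who want something that does compile, the same idea can be emulated by hand. The following is only a sketch of the shape of the mechanism, not the compiler's actual layout: FooVtable, FooObject, Counter, erase and demo are all hypothetical names, and the unsafe casts rely on erase being the only way a FooObject is built.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

// Hand-written stand-in for a vtable: one function pointer per trait method.
struct FooVtable {
    method1: unsafe fn(*const ()),
    method2: unsafe fn(*mut (), i32, String) -> usize,
}

// Hand-written stand-in for `&mut Foo`: opaque data pointer plus vtable pointer.
struct FooObject<'a> {
    data: *mut (),
    vtable: &'static FooVtable,
    _borrow: PhantomData<&'a mut ()>,
}

impl<'a> FooObject<'a> {
    fn method1(&self) {
        // Load the function pointer and call it with the opaque data pointer.
        unsafe { (self.vtable.method1)(self.data as *const ()) }
    }

    fn method2(&mut self, x: i32, y: String) -> usize {
        // As above, passing along the other arguments.
        unsafe { (self.vtable.method2)(self.data, x, y) }
    }
}

struct Counter {
    calls: Cell<u32>,
    total: usize,
}

// Safety: `data` must point to a live `Counter`.
unsafe fn counter_method1(data: *const ()) {
    let this = unsafe { &*(data as *const Counter) };
    this.calls.set(this.calls.get() + 1);
}

// Safety: `data` must point to a live, uniquely borrowed `Counter`.
unsafe fn counter_method2(data: *mut (), x: i32, y: String) -> usize {
    let this = unsafe { &mut *(data as *mut Counter) };
    // Negative `x` is clamped to zero to keep the arithmetic simple.
    this.total += x.max(0) as usize + y.len();
    this.total
}

// The vtable specialised for `Counter`.
static COUNTER_VTABLE: FooVtable = FooVtable {
    method1: counter_method1,
    method2: counter_method2,
};

// Erase the concrete type, pairing the data with the matching vtable.
fn erase(counter: &mut Counter) -> FooObject<'_> {
    FooObject {
        data: counter as *mut Counter as *mut (),
        vtable: &COUNTER_VTABLE,
        _borrow: PhantomData,
    }
}

fn demo() -> (u32, usize) {
    let mut counter = Counter { calls: Cell::new(0), total: 0 };
    let result = {
        let mut object = erase(&mut counter);
        object.method1();
        object.method2(2, "abc".to_string())
    };
    (counter.calls.get(), result)
}
```

The key point is that the vtable entries are only correct for the type they were specialised for, so data and vtable must always travel together.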
Object safety
The rules for object safety were set out in that initial RFC 255, with two missed cases identified and resolved in RFC 428 and RFC 546. At the time of writing, the possible ways to be object-unsafe are described by two enums:
pub enum ObjectSafetyViolation<'tcx> {
    /// Self : Sized declared on the trait
    SizedSelf,

    /// Method has something illegal
    Method(Rc<ty::Method<'tcx>>, MethodViolationCode),
}

/// Reasons a method might not be object-safe.
#[derive(Copy,Clone,Show)]
pub enum MethodViolationCode {
    /// e.g., `fn(self)`
    ByValueSelf,

    /// e.g., `fn foo()`
    StaticMethod,

    /// e.g., `fn foo(&self, x: Self)` or `fn foo(&self) -> Self`
    ReferencesSelf,

    /// e.g., `fn foo<A>()`
    Generic,
}
Let’s go through each case.
Update 2015-05-06: RFC 817 added more precise control over object safety via where clauses; see Where Self Meets Sized: Revisiting Object Safety.
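As a hedged illustration of that later mechanism, written in current Rust syntax: a where Self: Sized clause on an individual method removes just that method from the trait object, so the trait as a whole stays usable as dyn Foo. The trait contents and the Widget type below are hypothetical.

```rust
trait Foo {
    // Ordinary method: dispatched through the vtable.
    fn name(&self) -> String;

    // Generic method, excluded from trait objects by the `where` clause.
    fn describe_with<A: std::fmt::Debug>(&self, value: A) -> String
    where
        Self: Sized,
    {
        format!("{} with {:?}", self.name(), value)
    }

    // Static method, excluded in the same way.
    fn default_name() -> String
    where
        Self: Sized,
    {
        "anonymous".to_string()
    }
}

struct Widget;

impl Foo for Widget {
    fn name(&self) -> String {
        "widget".to_string()
    }
}

fn via_trait_object() -> String {
    let object: &dyn Foo = &Widget;
    // `object.describe_with(1_u8)` would be rejected here,
    // because `dyn Foo` does not satisfy `Self: Sized`.
    object.name()
}

fn via_concrete_type() -> (String, String) {
    (Widget.describe_with(1_u8), Widget::default_name())
}
```

The restricted methods remain available on sized implementors, which is the "more precise control" the update refers to.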
Sized Self
trait Foo: Sized {
    fn method(&self);
}

The trait Foo inherits from Sized, requiring the Self type to be sized, and hence writing impl Foo for Foo is illegal: the type Foo is not sized and doesn’t implement Sized. Traits default to Self being possibly-unsized (effectively a bound Self: ?Sized) to make more traits object safe by default.
By-value self
Update 2015-05-06: this is no longer object unsafe, but it is impossible to call such methods on possibly-unsized types, including trait objects. That is, one can define traits with self methods, but one is statically disallowed from calling those methods on trait objects (and on generics that could be trait objects).
trait Foo {
    fn method(self);
}
At the moment[1], it’s not possible to use trait objects by-value anywhere, due to the lack of sizedness. If one were to write an impl Foo for Foo, the signature of method would mean self has type Foo: a by-value unsized type, which is illegal!
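The post-update behaviour can be sketched as follows in current Rust syntax. The trait contents, the Tag type and the helper functions are hypothetical; the point is only that the by-value method exists on the trait but is reachable solely through sized types.

```rust
trait Foo {
    fn label(&self) -> String;

    // By-value `self`: allowed in the trait, but only callable on sized types.
    fn into_label(self) -> String;
}

struct Tag(String);

impl Foo for Tag {
    fn label(&self) -> String {
        self.0.clone()
    }

    fn into_label(self) -> String {
        self.0
    }
}

// `T` is implicitly `Sized` here, so the by-value call is accepted.
fn consume<T: Foo>(value: T) -> String {
    value.into_label()
}

fn via_trait_object() -> String {
    let boxed: Box<dyn Foo> = Box::new(Tag("boxed".to_string()));
    // `boxed.into_label()` would be rejected at compile time,
    // since it would need to move an unsized `dyn Foo` by value.
    boxed.label()
}
```

Adding ?Sized to the bound on consume would make the into_label call there fail to compile as well, matching the remark about generics that could be trait objects.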
Static method
trait Foo {
    fn func() -> i32;
}

There’s no way to provide a sensible implementation of func as a static method on the type Foo:

impl<'a> Foo for Foo+'a {
    fn func() -> i32 {
        // what goes here??
    }
}

The compiler can’t just conjure up some i32 (the chosen value may make no sense in context) and it can’t call some other type’s Foo::func method (which type would it choose?). The whole scenario makes no sense.
References Self
There are two fundamental ways in which this can happen: as an argument or as a return value. In either case, a reference to the Self type means that it must match the type of the self value, the true type of which is unknown at compile time. For example:
trait Foo {
    fn method(&self, other: &Self);
}

The types of the two arguments have to match, but this can’t be guaranteed with a trait object: the erased types of two separate &Foo values may not match:
impl<'a> Foo for Foo+'a {
    fn method(&self, other: &(Foo+'a)) {
        (self.vtable.method)(self.data, /* what goes here? */)
    }
}
(Using the explicit-but-invalid notation as above.)
One can’t use other.data because the method entry of self.vtable is assuming that both pointers point to the same, specific type (whatever type the vtable is specialised for), but there’s absolutely no guarantee other.data points to matching data. There’s also not necessarily a (reliable) way to detect a mismatch, and no way the compiler can know a correct way to handle a mismatch even if it can be detected.
Generic method
trait Foo {
    fn method<A>(&self, a: A);
}

As discussed briefly in the first post, generic functions in Rust are monomorphised: that is, a copy of the function is created for each type used as a generic parameter. An attempted implementation might look like:

impl<'a> Foo for Foo+'a {
    fn method<A>(&self, a: A) {
        (self.vtable./* ... huh ???*/)(self.data, a)
    }
}
The vtable is a static struct of function pointers, so somehow we have to select a function pointer from it that will work with the arbitrary type A. To have any hope of doing this, one would have[2] to pregenerate code for every type that could possibly be used for A and then fill in the huh above to select the right one. This would effectively be implicitly adding a whole series of methods to the trait:
trait Foo {
    fn method_u8(&self, a: u8); // A = u8
    fn method_i8(&self, a: i8); // A = i8
    fn method_String(&self, a: String); // A = String
    fn method_unit(&self, a: ()); // A = ()
    // ...
}
and each one would need an entry in the vtable struct. If it is even possible, this would be some serious bloat, especially as I imagine most possibilities wouldn’t be used.
For the more fundamental question of “is it possible”, the answer is rarely: it only works if the number of possible types that can be used with the generic parameters is finite and completely known, so that a complete list can be written. I think the only circumstance in which this occurs is if all the parameters have to be bounded by some private trait (the example above fails, since A is unbounded and so can be used with every type ever, including ones that aren’t even defined in scope).
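When the list of types really is finite and known, one can of course write it out by hand. The sketch below, in current Rust syntax, does exactly that for two types; it is an illustration of the "complete list" idea rather than anything the compiler does, and the Echo type, the return values and the helper names are hypothetical.

```rust
// One hand-written, non-generic method per supported type:
// each gets its own vtable slot, so the trait can be used as `dyn Foo`.
trait Foo {
    fn method_u8(&self, a: u8) -> String;
    fn method_string(&self, a: String) -> String;
}

struct Echo;

impl Echo {
    // Ordinary generic helper on the concrete type; monomorphised per `A`.
    fn method<A: std::fmt::Display>(&self, a: A) -> String {
        format!("echo: {}", a)
    }
}

impl Foo for Echo {
    fn method_u8(&self, a: u8) -> String {
        self.method(a) // instantiates `method::<u8>`
    }

    fn method_string(&self, a: String) -> String {
        self.method(a) // instantiates `method::<String>`
    }
}

fn demo() -> (String, String) {
    let object: &dyn Foo = &Echo;
    (object.method_u8(7), object.method_string("hi".to_string()))
}
```

Every additional supported type costs another method and another vtable entry, which is the bloat described above.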
[0] As usual, this post is designed to reflect the state of Rust at version rustc 1.0.0-nightly (44a287e6e 2015-01-08 17:03:40 -0800).
[1] There is desire to remove/relax this restriction for function parameters, and especially self, to allow them to be unsized types. Niko’s “Purging proc” describes the problem and the necessity for the Invoke trait as a work-around for the FnOnce trait.
[2] Strictly speaking I suppose one could do some type of runtime codegen/JITing, but that’s not really something Rust wants to build into the language, as it would require Rust programs to essentially carry around a compiler.