Take a break: Rust match has fallthrough
Rust’s match
statement can do a lot of things, even C-style fallthough to the next branch, despite having no real support for it. It turns out to a “shallow” feature, where the C to Rust translation is easily done, without needing to understand the code itself. The hardest part is coming to terms with writing it, and then convincing someone else to let you land it!
Here’s what the tail handling of the MurmurHash3 hash function could look like, in Rust, if you have enough courage0:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
'outer: {
'case1: {
'case2: {
'case3: {
match len & 3 {
3 => break 'case3,
2 => break 'case2,
1 => break 'case1,
_ => break 'outer,
}
} // 'case3:
k1 ^= (tail[2] as u32) << 16;
} // 'case2:
k1 ^= (tail[1] as u32) << 8;
} // case1:
k1 ^= tail[0] as u32;
k1 *= c1; k1 = k1.rotate_left(15); k1 *= c2; h1 ^= k1;
}
This very occasionally matters and it’s not too hard to inflict it upon yourself, if required. Hopefully it isn’t required, because there’s no way to make it pretty.
Breaking the ice
In the MurmurHash3 hash functions, there’s two interesting snippets of code. One is for computing 128-bit results and the other for 32-bit results. They have the same vibes, but the 32-bit one is smaller and serves as a better example. It uses various uint32_t
variables (c1
, c2
, k1
, h1
), an int
variable len
, and a uint8_t *
pointer tail
:
1
2
3
4
5
6
7
switch(len & 3)
{
case 3: k1 ^= tail[2] << 16;
case 2: k1 ^= tail[1] << 8;
case 1: k1 ^= tail[0];
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
};
It’s relying on the implicit fallthrough behaviour of C’s switch
statement: if the CPU proceeds through a case
arm without hitting a break
, it happily continues executing the next arm. That’s a compact, “DRY” and efficient way to write this equivalent code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
if ((len & 3) == 3) {
// case 3:
k1 ^= tail[2] << 16;
// case 2:
k1 ^= tail[1] << 8;
// case 1:
k1 ^= tail[0];
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
}
else if ((len & 3) == 2) {
// case 2:
k1 ^= tail[1] << 8;
// case 1:
k1 ^= tail[0];
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
}
else if ((len & 3) == 1) {
// case 1:
k1 ^= tail[0];
k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
}
Expressing the original switch
in Rust is… a bit awkward. A nice(?) version might be spelling out individual if
statements, carefully using >=
and avoiding else if
to get the right fallthrough behaviour:
1
2
3
4
5
6
7
8
9
10
11
12
if (len & 3) >= 3 {
k1 ^= (tail[2] as u32) << 16;
}
if (len & 3) >= 2 {
k1 ^= (tail[1] as u32) << 8;
}
if (len & 3) >= 1 {
k1 ^= tail[0] as u32;
k1 *= c1; k1 = k1.rotate_left(15); k1 *= c2; h1 ^= k1;
}
But, it doesn’t so nicely scale to larger switch
es like the 128-bit version of this example code: that one has 15 arms, instead of 3! And, especially for those larger versions, a switch
/match
statement seems to help the optimiser apply tricks like jump tables or computed-goto
, whereas individual if
s seem to end up with long sequential chains of compare-then-jump instead (compiler explorer).
Worst of all, that’s boring!
Instead, it’s possible to translate the original C code directly, fallthrough and all. All you need is more break
, like we saw before:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
'outer: {
'case1: {
'case2: {
'case3: {
match len & 3 {
3 => break 'case3,
2 => break 'case2,
1 => break 'case1,
_ => break 'outer,
}
} // 'case3:
k1 ^= (tail[2] as u32) << 16;
} // 'case2:
k1 ^= (tail[1] as u32) << 8;
} // case1:
k1 ^= tail[0] as u32;
k1 *= c1; k1 = k1.rotate_left(15); k1 *= c2; h1 ^= k1;
}
Perfect! But.. you might need to bribe someone to get it past code review.
Break it down for me
What’s going on here?
This is relying on Rust’s labelled blocks & breaks acting as a “forward goto
”. When we write 'a: { ... break 'a; ... }
that break 'a
is jumping to the end of the block labelled 'a
. We can use that to jump into the middle of a sequence of instructions.
Here’s a strangely-indented version of that same Rust code, which only has whitespace changes from the more conventional formatting just above:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
'outer: { 'case1: {'case2: { 'case3: {
match len & 3 {
3 => break 'case3,
2 => break 'case2,
1 => break 'case1,
_ => break 'outer,
}
} // 'case3:
k1 ^= (tail[2] as u32) << 16;
} // 'case2:
k1 ^= (tail[1] as u32) << 8;
} // 'case1:
k1 ^= tail[0] as u32;
k1 *= c1; k1 = k1.rotate_left(15); k1 *= c2; h1 ^= k1;
} // 'outer:
We can pretend the core functionality—all the updates to k1
—is “straight-line” code, and then the match
entrypoint is just choosing where to start executing within that code. (Maybe that presentation above is clearer to you, or the more conventional one is: the behaviour is the same.)
For instance, if len & 3
is 2, then the match
runs break 'case2
, and jumps to the end of the 'case2
block. That’s exactly the } // 'case2:
line, skipping over k1 ^= (tail[2] as u32) << 16;
, as desired. Thus, k1 ^= (tail[1] as u32) << 8;
will run.
Once that update has run, execution continues through the } // 'case1
(that is, leaving the block labelled 'case1
), on to executing k1 ^= tail[0] as u32;
: fallthrough! Also exactly what’s desired!
This is just a unwrapping of the original C code, with an extra “jump” in it (that the optimiser simplifies away).
I want to break things too
The transformation from C switch
to Rust match
is mechanical and easy to execute:
- Configure the blocks
- Arrange the
match
- Insert the code
Here’s a C example that demonstrates more switch
features:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
switch (x) {
case 12:
f1();
// fallthrough
case 56:
f2();
// no fallthrough
break;
case 34:
f3();
// conditional fallthrough
if (f4()) break;
default:
f5();
}
First, start with a 'outer: { ... }
block. For each case
or default
statement, create a '...: { ... }
block nested within 'outer
and within each other. Put the last case
/default
outermost, and the first one (immediately after switch
) innermost, with the rest in order between the two.
1
2
3
4
5
6
7
8
9
10
'outer: {
'default: {
'case34: {
'case56: {
'case12: {
}
}
}
}
}
Next, within all of the blocks, create a match
with an arm for each case
/default
, and contents break '...
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
'outer: {
'default: {
'case34: {
'case56: {
'case12: {
match x {
12 => break 'case12,
56 => break 'case56,
34 => break 'case34,
_ => break 'default,
}
}
}
}
}
}
Finally, put the code for each arm immediately after the matching labelled block (this is slightly unintuitive: code for case “X” should not be nested within the 'caseX
block). Replace any break
statements in C with break 'outer
in Rust.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
'outer: {
'default: {
'case34: {
'case56: {
'case12: {
match x {
12 => break 'case12,
56 => break 'case56,
34 => break 'case34,
_ => break 'default,
}
}
f1();
// fallthrough
}
f2();
// no fallthrough
break 'outer;
}
f3();
// conditional fallthrough
if f4() { break 'outer; }
}
f5()
}
That’s the whole process.
Why break Rust like this?
There’s a few places where C’s switch
fallthrough behaviour is particularly useful, often numeric-related “tight” code, like: hash functions like MurmurHash3 above, and prime sieves. Using fallthrough leads to simpler code that’s well optimised by the compiler.
The same logic applies to Rust: in the rare case it’s helpful, it can be very helpful… except it’s so awkward to express in Rust, that there’s no defining this as “simpler”. This only makes sense if benchmarks prove it .
(This is a quick article so I haven’t done the legwork to construct an example microbenchmark that demonstrates the performance impact of appropriate use of fallthrough: walking through that would be an article itself. Now you know about this idiosyncratic technique you can experiment… and maybe even write that article yourself!)
This labelled-break techinque is also a special case of a more general possibility: labelled breaks can be used to implement any directed acyclic graph of C goto
s. The switch
fallthrough is a simple linked-list graph (each case/state goes to one other case), but the technique can be used for more complicated graphs too:
- create a nested labelled block for each state, in reverse topological order, then
- use
break '...
to jump to the appropriate less-deeply-nested block when transitioning states.
This “feature” should be used… carefully. It’s clearly hard to understand, and would be hard to maintain by hand, especially as the number of cases grow larger. However, it may be a good trick to use within a macro or other code-generation.
Short breakdown
One can (ab)use labelled breaks in Rust to support fallthrough-like behaviour in match
. Not at all pretty, not at all advisable, but it works, and sometimes that’s as much as we can hope for.
The pythonesque/fallthrough and Jules-Bertholet/fallthrough macros appear to automate the transformation and seem to make it prettier… although I haven’t used them.
-
There may be fundamentally-better ways to approach computing this value, completely different to the C-inspired versions in this post. ↩