Avoiding `clone` when we no longer need the struct's field
There are some situations where we can forego the use of clone
and instead use take
to “extract” a field value from a struct. This is particularly useful when we need ownership of multiple (private) fields of a struct, but can’t destructure it.
The problem
Suppose we have this struct in an external crate:
pub struct MyStruct {
a: Vec<u64>,
pub b: String,
}
impl MyStruct {
pub fn a(self) -> Vec<u64> {
self.a
}
pub fn a_ref(&self) -> &Vec<u64> {
&self.a
}
pub fn a_mut(&mut self) -> &mut Vec<u64> {
&mut self.a
}
}
And we have a function which takes ownership of this struct, does something to both its fields, and returns the inner vector:
pub fn my_function_clone(s: MyStruct) -> Vec<u64> {
// Can't move out of s:
// let a = s.a(); // cannot do this
// Also cannot destructure, as the field `a` we want is private
// let MyStruct { a, .. } = s; // cannot do this
// Instead need to clone the entire vector
let a: Vec<u64> = s.a_ref().clone();
// anonymous closure doing something to `b`, which doesn't use `a`.
let _ = || s.b();
a
}
The solution
Notice that we no longer need MyStruct
after we access return a
.
A more efficient approach is to “take” the vector a
out of the struct and leave the rest. How to do this? Well std::mem::take
replaces the underlying data with Default::default()
, which in this case is an empty vector. This makes the operation extremely cheap.
pub fn my_function_take(s: MyStruct) -> Vec<u64> {
let a = { mem::take(s.a_mut()) };
// anonymous closure doing something to `b`, which doesn't use `a`.
let _ = || s.b;
a
}
Results
Results of clone
vs. take
for 2^24 field elements (the field chosen here is internally represented as 6 u64
values):
clone_vector time: [94.980 ms 96.027 ms 97.145 ms]
take_vector time: [272.09 ns 296.69 ns 321.30 ns]
The take
approach is 300,000x faster! Alright, it’s true that the size of the cloned vector is rather large, but it is not unreasonable to have vectors of this size in cryptographic applications.
Still, e.g. for a vector with ~1M elements, the cost of cloning is about 2-3ms, compared to a negligible cost of take
. Paying a few extra milliseconds here and there might seem innocuous, but is actually quite a big deal if you want to optimize your code and provide a competitive implementation.
Full example: https://github.com/HungryCatsStudio/take-vs-clone-bench
Author: Marcin Górny