Implementing a Basic String Type in Rust
This challenge focuses on understanding how fundamental data structures work under the hood in Rust by creating a simplified version of its built-in String type. You'll gain practical experience with memory management, UTF-8 encoding, and common string operations.
Problem Description
Your task is to implement a custom MyString struct in Rust that mimics some of the core functionalities of the standard library's String. This MyString should be capable of storing UTF-8 encoded text and support basic operations like creation, appending, and retrieving its length.
Key Requirements:
- Storage: MyString should internally store its data as a dynamically allocated buffer of bytes (Vec<u8>).
- UTF-8 Validity: The internal Vec<u8> must always represent valid UTF-8 encoded data.
- Capacity and Length: MyString should track its allocated capacity and the current number of bytes it contains (its length).
- Operations:
  - new(): Create an empty MyString.
  - from_str(s: &str): Create a MyString from a string slice.
  - push_str(&mut self, s: &str): Append a string slice to the end of MyString. This operation should handle reallocation if the internal buffer is not large enough.
  - len(&self): Return the number of bytes in the MyString.
  - as_str(&self): Return a string slice (&str) view of the MyString's contents. This should panic if the internal buffer is not valid UTF-8.
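The operations above can be sketched as follows. This is a minimal illustration of the API shape, not a reference solution: the field name buf and the panic message are assumptions, and the sketch leaves buffer growth to Vec's own strategy instead of implementing an explicit doubling policy.

```rust
/// A simplified owned UTF-8 string backed by a byte buffer.
pub struct MyString {
    // Invariant: `buf` always holds valid UTF-8.
    // (`buf` is a hypothetical field name chosen for this sketch.)
    buf: Vec<u8>,
}

impl MyString {
    /// Creates an empty string. `Vec::new` does not allocate,
    /// so both length and capacity start at zero.
    pub fn new() -> Self {
        MyString { buf: Vec::new() }
    }

    /// Copies the bytes of `s`. A `&str` is always valid UTF-8,
    /// so the invariant holds from the start.
    pub fn from_str(s: &str) -> Self {
        MyString { buf: s.as_bytes().to_vec() }
    }

    /// Appends all bytes of `s`. Only whole `&str` values are appended,
    /// so the buffer stays valid UTF-8. Growth is delegated to `Vec` here.
    pub fn push_str(&mut self, s: &str) {
        self.buf.extend_from_slice(s.as_bytes());
    }

    /// Returns the length in bytes, not in characters.
    pub fn len(&self) -> usize {
        self.buf.len()
    }

    /// Returns a `&str` view, panicking if the buffer is not valid UTF-8.
    pub fn as_str(&self) -> &str {
        std::str::from_utf8(&self.buf).expect("MyString buffer must be valid UTF-8")
    }
}
```

Because every write path accepts only &str, the validation in as_str() should never fail in practice; it acts as a check on the invariant rather than as a normal error path.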
Expected Behavior:
- MyString::new() should result in an empty string with zero capacity and zero length.
- MyString::from_str("hello") should create a MyString containing the bytes for "hello" and have appropriate capacity.
- Appending a string that fits within the current capacity should not cause reallocation.
- Appending a string that exceeds the current capacity should trigger a reallocation, typically doubling the capacity (or setting a minimum if capacity is zero).
- len() should accurately report the byte count.
- as_str() should return a valid &str when the contents are valid UTF-8.
Edge Cases to Consider:
- Creating an empty string.
- Appending an empty string.
- Appending a string that exactly fills the remaining capacity.
- Appending a string that requires multiple reallocations.
- Ensuring UTF-8 validity is maintained after appends.
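To make the last edge case concrete, here is a small sketch of why appending whole string slices keeps a byte buffer valid, while a buffer cut in the middle of a multi-byte character does not validate. The helper names is_valid_utf8 and append_str are hypothetical and exist only for this illustration.

```rust
/// Returns true when `bytes` form a complete, valid UTF-8 sequence.
pub fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Appends every byte of `s` to `buf`. Since `s` is a `&str`,
/// no character is ever split, so a valid buffer stays valid.
pub fn append_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(s.as_bytes());
}
```

For instance, a buffer holding the bytes of "你好" remains valid after appending "世界" or an empty slice, whereas the first two bytes of the three-byte character "你" on their own are rejected by std::str::from_utf8.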
Examples
Example 1:
let mut my_string = MyString::new();
my_string.push_str("Hello");
my_string.push_str(" ");
my_string.push_str("World!");
println!("Length: {}", my_string.len()); // Expected: 12
println!("Content: {}", my_string.as_str()); // Expected: Hello World!
Explanation: We start with an empty string, then append "Hello", a space, and "World!". The total length is the sum of the byte lengths of these strings. as_str() should correctly reconstruct the full string.
Example 2:
let my_string = MyString::from_str("Rust");
println!("Length: {}", my_string.len()); // Expected: 4
println!("Content: {}", my_string.as_str()); // Expected: Rust
Explanation: Creating a string from a string slice initializes MyString with the provided content.
Example 3: (UTF-8)
let mut my_string = MyString::from_str("你好"); // "Hello" in Chinese (2 characters, 6 bytes)
my_string.push_str("世界"); // "World" in Chinese (2 characters, 6 bytes)
println!("Length: {}", my_string.len()); // Expected: 12
println!("Content: {}", my_string.as_str()); // Expected: 你好世界
Explanation: This demonstrates handling of multi-byte UTF-8 characters. The length returned is the byte length, not the character count.
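The difference between byte length and character count can be checked with the standard library alone. The helper byte_and_char_len below is a hypothetical name used only for this illustration; it relies on str::len (bytes) and str::chars (Unicode scalar values).

```rust
/// Returns (byte length, character count) for a string slice.
pub fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len` counts bytes; `chars().count()` counts Unicode scalar values.
    (s.len(), s.chars().count())
}
```

Under this definition, "你好世界" yields (12, 4) and "Rust" yields (4, 4), which is why MyString::len() reports 12 in Example 3.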
Constraints
- The internal storage for MyString must be a Vec<u8>.
- All string operations must preserve UTF-8 validity.
- Reallocation strategy: When capacity is insufficient, double the current capacity. If the current capacity is 0, set it to a small initial value (e.g., 8 bytes).
- as_str() should use std::str::from_utf8 internally.
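One way to read the reallocation strategy is as a pure function from the current capacity and the required byte count to a new capacity. The sketch below assumes that doubling repeats until the data fits, which the constraint does not state explicitly; the names grown_capacity and MIN_CAPACITY are hypothetical.

```rust
/// Assumed initial capacity when growing from an empty buffer.
const MIN_CAPACITY: usize = 8;

/// Computes the capacity to grow to so that `required` bytes fit.
/// Returns `current` unchanged when no reallocation is needed.
pub fn grown_capacity(current: usize, required: usize) -> usize {
    if required <= current {
        return current;
    }
    // Start from the minimum when empty, otherwise double once.
    let mut cap = if current == 0 { MIN_CAPACITY } else { current * 2 };
    // Assumption: keep doubling until the required size fits.
    while cap < required {
        cap *= 2;
    }
    cap
}
```

A push_str implementation could pass the result to Vec::reserve_exact (as the additional space beyond the current length) before copying the new bytes. Note that Vec may still allocate more than requested, so the exact capacity is not guaranteed.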
Notes
- Consider the implications of Rust's ownership and borrowing rules for your MyString implementation.
- The push_str method needs to append bytes carefully so that the resulting Vec<u8> remains valid UTF-8.
- You will need to implement new, from_str, push_str, len, and as_str as public methods on your MyString struct.
- You might find it helpful to think about how Rust's String manages its capacity and performs reallocations.
- This challenge intentionally omits some advanced String features (like capacity(), push(), pop(), insert(), slicing, etc.) to keep the scope manageable. Focus on the core requirements listed above.