Implementing a Custom String Type in Rust
This challenge asks you to implement a simplified String type in Rust, focusing on memory management and basic string operations. Understanding how Rust's String works under the hood is crucial for grasping memory safety and ownership concepts, and this exercise provides a hands-on opportunity to explore those ideas.
Problem Description
You are tasked with creating a custom MyString type in Rust that mimics some of the core functionalities of the standard String type. MyString should manage its own dynamically allocated memory to store a sequence of bytes (UTF-8 encoded characters). Your implementation should include the following:
- Data Storage: `MyString` should internally store a `Vec<u8>` to hold the string data.
- Constructor: A constructor `new(data: &[u8]) -> MyString` that takes a slice of bytes (`&[u8]`) and allocates memory to copy the data into the `Vec<u8>`.
- `len()` method: A method `len(&self) -> usize` that returns the length of the string (number of bytes).
- `as_str(&self) -> &str` method: A method that returns a string slice (`&str`) representing the contents of `MyString`. This should be a safe view into the underlying `Vec<u8>`.
- `push(&mut self, c: u8) -> &mut Self` method: A method that appends a single byte `c` to the end of the string. This method should modify the `MyString` in place and return a mutable reference to it.
- Error Handling: The `new` constructor should panic if the input slice contains invalid UTF-8 sequences. You can use the `std::str::from_utf8` function for this validation.
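The "safe view" requirement for `as_str` can be met without any `unsafe` code. The following is a minimal sketch, assuming two hypothetical free functions, `view_as_str` and `shares_buffer`, that are not part of the challenge API. It relies on `std::str::from_utf8`, which validates the bytes and, on success, returns a `&str` borrowing the same memory instead of copying it.

```rust
use std::str::Utf8Error;

/// Hypothetical helper (not part of the challenge API): borrows `bytes` as a
/// `&str` after checking that they are well-formed UTF-8.
fn view_as_str(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}

/// Hypothetical helper: returns true when the `&str` view starts at the same
/// address as the buffer, i.e. the view is a borrow and nothing was copied.
fn shares_buffer(buf: &[u8]) -> bool {
    match view_as_str(buf) {
        Ok(s) => s.as_ptr() == buf.as_ptr(),
        Err(_) => false,
    }
}
```

For a buffer created with `b"hello".to_vec()`, `view_as_str` yields `Ok("hello")` and `shares_buffer` returns `true`, while a buffer holding only the byte `0x80` yields an `Err`.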
Examples
Example 1:
Input: `MyString::new(b"hello")`
Output (as printed with `{:?}`): `MyString { data: [104, 101, 108, 108, 111] }`
Explanation: A new `MyString` is created, allocating memory and copying the bytes "hello" into its internal `Vec<u8>`.
Example 2:
Input: `let mut s = MyString::new(b"world"); s.push(b'!');`
Output (as printed with `{:?}`): `MyString { data: [119, 111, 114, 108, 100] }` after construction, then `MyString { data: [119, 111, 114, 108, 100, 33] }` after the push
Explanation: First, a `MyString` is created with "world". Then, the `push` method appends the byte '!' (ASCII 33) to the end of the string.
Example 3:
Input: `MyString::new(b"invalid\x80utf8")`
Output: Panic with message "Invalid UTF-8 sequence"
Explanation: The input contains an invalid UTF-8 sequence (`\x80`). The `from_utf8` function will return `Err`, causing the constructor to panic.
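To see what `from_utf8` reports for an input like the one in Example 3, the sketch below inspects the returned error. The function name `first_invalid_offset` is hypothetical and used only for illustration; `Utf8Error::valid_up_to` is the standard-library method that gives the length of the longest valid prefix.

```rust
/// Hypothetical helper: returns the byte offset of the first invalid UTF-8
/// sequence in `bytes`, or `None` when the whole slice is well-formed.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    match std::str::from_utf8(bytes) {
        Ok(_) => None,
        // `valid_up_to` is the length of the longest valid prefix, which is
        // also the index of the first offending byte.
        Err(e) => Some(e.valid_up_to()),
    }
}
```

For `b"invalid\x80utf8"` this returns `Some(7)`, because the seven bytes of "invalid" are valid and the lone continuation byte `\x80` sits at index 7.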
Constraints
- The `Vec<u8>` should be allocated on the heap.
- The `as_str` method must return a valid `&str` slice.
- The `push` method must modify the `MyString` in place.
- The `new` constructor must validate UTF-8 encoding.
- The `len` method should return the number of bytes stored, not the number of characters.
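The difference between byte length and character count only shows up with non-ASCII text. A small sketch using the standard `str` API, with the hypothetical helper name `byte_and_char_len`, and with "character" taken to mean a Unicode scalar value (a Rust `char`):

```rust
/// Hypothetical helper: returns (length in bytes, number of `char`s) for `s`.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len` counts UTF-8 bytes; `chars().count()` counts Unicode scalar values.
    (s.len(), s.chars().count())
}
```

For `"h\u{e9}llo"` (the word "héllo" with a precomposed é) this returns `(6, 5)`, since é takes two bytes in UTF-8. For plain ASCII such as `"hello"` both numbers are 5.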
Notes
- Focus on the core concepts of memory management and ownership.
- Consider how to safely create a string slice (`&str`) from your internal `Vec<u8>`.
- The `push` method is a good place to practice in-place modification.
- Use `std::str::from_utf8` for UTF-8 validation. Remember to handle the `Result` appropriately.
- This is a simplified `String` type; it doesn't include all the features of the standard `String`. The goal is to understand the underlying principles.
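/// A simplified, byte-backed string type.
#[derive(Debug)]
struct MyString {
    // Heap-allocated buffer holding the string's bytes.
    data: Vec<u8>,
}

impl MyString {
    /// Copies `data` into a new buffer. Panics if `data` is not valid UTF-8.
    fn new(data: &[u8]) -> MyString {
        if std::str::from_utf8(data).is_err() {
            panic!("Invalid UTF-8 sequence");
        }
        MyString { data: data.to_vec() }
    }

    /// Returns the length in bytes, not in characters.
    fn len(&self) -> usize {
        self.data.len()
    }

    /// Returns a borrowed `&str` view of the buffer without copying.
    /// `push` accepts arbitrary bytes, so validity is re-checked here instead
    /// of being assumed via `from_utf8_unchecked`, which is undefined behavior
    /// on invalid data. Panics if the buffer is no longer valid UTF-8.
    fn as_str(&self) -> &str {
        std::str::from_utf8(&self.data).expect("Invalid UTF-8 sequence")
    }

    /// Appends one byte in place and returns `self` so calls can be chained.
    fn push(&mut self, c: u8) -> &mut Self {
        self.data.push(c);
        self
    }
}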
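Because `push` accepts any `u8`, a single call can leave the buffer holding bytes that are not valid UTF-8, for example a lone `0x80`. That matters for any method that hands the buffer out as a `&str`. One possible way to keep the buffer valid is to append whole characters rather than raw bytes. The sketch below assumes hypothetical helpers `push_char` and `is_valid_utf8` operating on a plain `Vec<u8>`; neither is part of the challenge's required API.

```rust
/// Hypothetical helper (not part of the challenge API): appends the UTF-8
/// encoding of `c` to `buf`, so a valid buffer stays valid.
fn push_char(buf: &mut Vec<u8>, c: char) {
    // A `char` needs at most 4 bytes when encoded as UTF-8.
    let mut tmp = [0u8; 4];
    buf.extend_from_slice(c.encode_utf8(&mut tmp).as_bytes());
}

/// Hypothetical helper: returns true when `buf` holds well-formed UTF-8.
fn is_valid_utf8(buf: &[u8]) -> bool {
    std::str::from_utf8(buf).is_ok()
}
```

Starting from the bytes of "world", calling `push_char` with `'!'` and then `'\u{e9}'` leaves an 8-byte buffer that still passes `is_valid_utf8`. Pushing the raw byte `0x80` afterwards makes the check fail, which is the situation a byte-level `push` has to account for.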