Implementing a Custom String Type in Rust

This challenge asks you to implement a simplified String type in Rust, focusing on memory management and basic string operations. Understanding how Rust's String works under the hood is crucial for grasping memory safety and ownership concepts, and this exercise provides a hands-on opportunity to explore those ideas.

Problem Description

You are tasked with creating a custom MyString type in Rust that mimics some of the core functionalities of the standard String type. MyString should manage its own dynamically allocated memory to store a sequence of bytes (UTF-8 encoded characters). Your implementation should include the following:

  • Data Storage: MyString should internally store a Vec<u8> to hold the string data.
  • Constructor: A constructor new(data: &[u8]) -> MyString that takes a slice of bytes (&[u8]) and allocates memory to copy the data into the Vec<u8>.
  • len() method: A method len(&self) -> usize that returns the length of the string (number of bytes).
  • as_str(&self) -> &str method: A method that returns a string slice (&str) representing the contents of MyString. This should be a safe view into the underlying Vec<u8>.
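  • push(&mut self, c: u8) -> &mut Self method: A method that appends a single byte c to the end of the string. This method should modify the MyString in place and return a mutable reference to it so that calls can be chained. Because a single byte is not always a complete UTF-8 character, as_str must still never expose invalid UTF-8 after a push.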
  • Error Handling: The new constructor should panic if the input slice contains invalid UTF-8 sequences. You can use the std::str::from_utf8 function for this validation.

Examples

Example 1:

Input: `MyString::new(b"hello")`
Output: `MyString { data: [104, 101, 108, 108, 111] }`
Explanation: A new `MyString` is created, allocating memory and copying the bytes "hello" into its internal `Vec<u8>`.

Example 2:

Input: `let mut s = MyString::new(b"world"); s.push(b'!');`
Output: `MyString { data: [119, 111, 114, 108, 100] }` followed by `MyString { data: [119, 111, 114, 108, 100, 33] }`
Explanation: First, a `MyString` is created with "world". Then, the `push` method appends the byte '!' (ASCII 33) to the end of the string.

Example 3:

Input: `MyString::new(b"invalid\x80utf8")`
Output: Panic with message "Invalid UTF-8 sequence"
Explanation: The input contains an invalid UTF-8 sequence (`\x80`). The `from_utf8` function will return `Err`, causing the constructor to panic.
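
The validation step in Example 3 can be observed in isolation. The following is a minimal sketch, assuming a hypothetical helper named `first_invalid_offset` (not part of the challenge) that reports where `std::str::from_utf8` stopped accepting bytes:

```rust
/// Hypothetical helper: returns the byte offset of the first invalid
/// sequence, or `None` if `bytes` is valid UTF-8.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    match std::str::from_utf8(bytes) {
        Ok(_) => None,
        // `valid_up_to` is the length of the longest valid prefix.
        Err(e) => Some(e.valid_up_to()),
    }
}

fn main() {
    assert_eq!(first_invalid_offset(b"hello"), None);
    // "invalid" is 7 bytes long, so the stray 0x80 sits at offset 7.
    assert_eq!(first_invalid_offset(b"invalid\x80utf8"), Some(7));
}
```

A constructor that panics only needs the `Ok`/`Err` distinction, but the offset can make the panic message more useful.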

Constraints

  • The string data must live on the heap. A Vec<u8> already stores its elements in a heap-allocated buffer, so no manual allocation is needed.
  • The as_str method must return a valid &str slice.
  • The push method must modify the MyString in place.
  • The new constructor must validate UTF-8 encoding.
  • The len method should return the number of bytes stored, not the number of characters.
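
The difference between bytes and characters in the last constraint shows up as soon as the text contains non-ASCII characters. A small sketch using the standard `&str` type (the helper name `byte_and_char_len` is illustrative only):

```rust
/// Returns (length in bytes, number of characters) for a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII text: one byte per character.
    assert_eq!(byte_and_char_len("hello"), (5, 5));
    // 'é' is encoded as two bytes (0xC3 0xA9) in UTF-8.
    assert_eq!(byte_and_char_len("héllo"), (6, 5));
}
```

MyString's len should behave like the first element of the tuple.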

Notes

  • Focus on the core concepts of memory management and ownership.
  • Consider how to safely create a string slice (&str) from your internal Vec<u8>.
  • The push method is a good place to practice in-place modification.
  • Use std::str::from_utf8 for UTF-8 validation. Remember to handle the Result appropriately.
  • This is a simplified String type; it doesn't include all the features of the standard String. The goal is to understand the underlying principles.
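#[derive(Debug)]
struct MyString {
    // Valid UTF-8 right after `new`; `push` may append arbitrary bytes later.
    data: Vec<u8>,
}

impl MyString {
    /// Copies `data` into a new heap buffer; panics if it is not valid UTF-8.
    fn new(data: &[u8]) -> MyString {
        if std::str::from_utf8(data).is_err() {
            panic!("Invalid UTF-8 sequence");
        }
        MyString { data: data.to_vec() }
    }

    /// Returns the length in bytes, not in characters.
    fn len(&self) -> usize {
        self.data.len()
    }

    /// Returns the contents as a `&str`; panics if the bytes are not valid UTF-8.
    fn as_str(&self) -> &str {
        // `push` accepts any byte, so the buffer may have stopped being valid
        // UTF-8 (for example after pushing a lone 0x80). Re-validate here
        // instead of calling `from_utf8_unchecked`, which would be undefined
        // behavior on invalid input.
        std::str::from_utf8(&self.data).expect("Invalid UTF-8 sequence")
    }

    /// Appends one byte in place and returns `self` so calls can be chained.
    fn push(&mut self, c: u8) -> &mut Self {
        self.data.push(c);
        self
    }
}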
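
A byte-level push makes it possible to leave the buffer in a state that is not valid UTF-8. One way to avoid that, shown here as a sketch rather than as part of the challenge, is to append whole characters instead. The type name `CheckedString` and the method `push_char` are hypothetical and chosen only for illustration:

```rust
/// Hypothetical variant of the exercise type; `CheckedString` and
/// `push_char` are illustrative names, not part of the challenge.
struct CheckedString {
    data: Vec<u8>,
}

impl CheckedString {
    fn new(data: &[u8]) -> CheckedString {
        // Panics on invalid input, mirroring the challenge's constructor.
        std::str::from_utf8(data).expect("Invalid UTF-8 sequence");
        CheckedString { data: data.to_vec() }
    }

    /// Appends a whole character, so the buffer is valid UTF-8 after every call.
    fn push_char(&mut self, c: char) -> &mut Self {
        let mut buf = [0u8; 4]; // a char needs at most 4 bytes in UTF-8
        self.data.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        self
    }

    fn as_str(&self) -> &str {
        // Safe re-validation; every mutation above preserves validity,
        // so this is not expected to fail.
        std::str::from_utf8(&self.data).expect("buffer is always valid UTF-8")
    }
}

fn main() {
    let mut s = CheckedString::new(b"caf");
    s.push_char('é').push_char('!');
    assert_eq!(s.as_str(), "café!");
    // 3 ASCII bytes + 2 bytes for 'é' + 1 byte for '!'.
    assert_eq!(s.as_str().len(), 6);
}
```

With this design the type can never hold a partial character, at the cost of no longer accepting raw bytes one at a time.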