Implement a Basic Thread Sanitizer in Rust

Concurrency is a powerful tool, but it introduces complex bugs like data races that can be notoriously difficult to detect and debug. A thread sanitizer helps identify these issues by detecting memory accesses to shared data that occur without proper synchronization. This challenge will guide you in building a simplified thread sanitizer for Rust.

Problem Description

Your task is to implement a rudimentary thread sanitizer for Rust. This sanitizer should instrument memory accesses to shared data and detect potential data races. Specifically, you need to:

  1. Track Memory Accesses: Keep track of which memory locations have been accessed by which threads.
  2. Detect Data Races: Identify situations where two or more threads access the same memory location concurrently and without synchronization, and at least one of the accesses is a write.
  3. Report Violations: When a data race is detected, your sanitizer should report it with relevant information, such as the memory address, the type of access (read/write), and the thread IDs involved.
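The three steps above can be sketched with a small record type. This is a minimal sketch, not a prescribed design: the names AccessKind, Access, conflicts_with, and race_report are hypothetical, and the report layout simply mirrors the conceptual output in Example 1.

```rust
use std::thread::{self, ThreadId};

/// Kind of memory access seen by the sanitizer (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessKind {
    Read,
    Write,
}

/// One recorded access: which thread touched which address, and how.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Access {
    pub addr: usize,
    pub thread: ThreadId,
    pub kind: AccessKind,
}

impl Access {
    /// Builds a record for the calling thread.
    pub fn current(addr: usize, kind: AccessKind) -> Self {
        Access { addr, thread: thread::current().id(), kind }
    }

    /// Two accesses conflict when they hit the same address from different
    /// threads and at least one of them is a write.
    pub fn conflicts_with(&self, other: &Access) -> bool {
        self.addr == other.addr
            && self.thread != other.thread
            && (self.kind == AccessKind::Write || other.kind == AccessKind::Write)
    }
}

/// Formats a report containing the address, access types, and thread IDs.
pub fn race_report(first: &Access, second: &Access) -> String {
    format!(
        "Thread sanitizer detected a data race!\n\
         Address: {:#x}\n\
         Access 1: {:?} ({:?})\n\
         Access 2: {:?} ({:?})",
        first.addr, first.thread, first.kind, second.thread, second.kind
    )
}
```

Note that conflicts_with only checks the address, thread, and access type. Whether two accesses count as concurrent is left to whatever bookkeeping the sanitizer builds on top of these records.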

For this challenge, we will simplify the synchronization aspect: you do not need to model locks or implement a full-fledged synchronization mechanism. Instead, focus on detecting the potential for a data race, that is, accesses to the same unprotected shared memory from different threads where at least one access is a write.

Key Requirements:

  • The sanitizer should work with unsafe Rust code where direct memory manipulation or sharing occurs.
  • It needs to maintain a record of accessed memory regions, including the thread performing the access and the type of access.
  • Upon detecting a potential data race, it must panic or print an error message detailing the race (address, access types, and thread IDs).
  • You will need to provide a mechanism to "instrument" code that uses shared memory.
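One possible instrumentation mechanism is a pair of wrapper functions that log each access before performing it. The sketch below only covers the record-keeping requirement and does no race detection; ACCESS_LOG, tracked_read, and tracked_write are hypothetical names, and the (address, thread, is_write) tuple layout is an assumption.

```rust
use std::sync::Mutex;
use std::thread::{self, ThreadId};

/// Global access log: (address, thread, is_write). Hypothetical layout.
static ACCESS_LOG: Mutex<Vec<(usize, ThreadId, bool)>> = Mutex::new(Vec::new());

/// Appends one access by the calling thread to the log.
fn record(addr: usize, is_write: bool) {
    ACCESS_LOG
        .lock()
        .unwrap()
        .push((addr, thread::current().id(), is_write));
}

/// Instrumented read: logs the access, then performs it.
///
/// # Safety
/// `ptr` must be valid for reads and properly aligned.
pub unsafe fn tracked_read<T: Copy>(ptr: *const T) -> T {
    record(ptr as usize, false);
    unsafe { *ptr }
}

/// Instrumented write: logs the access, then performs it.
///
/// # Safety
/// `ptr` must be valid for writes and properly aligned.
pub unsafe fn tracked_write<T: Copy>(ptr: *mut T, value: T) {
    record(ptr as usize, true);
    unsafe { *ptr = value }
}
```

The log is protected by a Mutex so that the sanitizer's own bookkeeping is free of races. The instrumented accesses themselves remain unsynchronized, which is what the sanitizer is meant to observe.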

Expected Behavior:

When code that exhibits a data race is executed under your sanitizer's observation, the sanitizer should detect and report it. Code without data races should execute without interference.

Edge Cases to Consider:

  • Multiple threads writing to the same location.
  • One thread writing while another thread reads the same location.
  • Memory regions being allocated and deallocated. (For this challenge, you can assume static or heap-allocated data for simplicity, and you do not need to handle dynamic allocation or deallocation in a sophisticated way.)
  • The overhead introduced by the sanitizer.

Examples

Example 1: Data Race - Multiple Writes

use std::thread;

// Assume 'shared_data' is being managed by your sanitizer
// and the following code is instrumented to pass through it.

let mut shared_data = 0;
// Raw pointers are not Send, so the address crosses the thread boundary
// as a usize and is turned back into a pointer inside each thread.
let shared_data_addr = &mut shared_data as *mut i32 as usize;
let mut handles = vec![];

for _ in 0..5 {
    handles.push(thread::spawn(move || {
        let data_ptr = shared_data_addr as *mut i32;
        // This write operation should be instrumented
        // by your sanitizer to detect the race.
        unsafe {
            // Simulate concurrent write
            let value = *data_ptr; // Read
            *data_ptr = value + 1; // Write
        }
    }));
}

for handle in handles {
    handle.join().unwrap();
}

// If your sanitizer works correctly, it should detect a data race here
// and panic or report an error.

Output (Conceptual):

Thread sanitizer detected a data race!
Address: <address_of_shared_data>
Access 1: Thread A (Write)
Access 2: Thread B (Write)
...

Explanation: Five threads read and write the same shared_data variable through a raw pointer without any synchronization. This is a classic data race, and in Rust it is undefined behavior.

Example 2: No Data Race - Atomic Operation (Conceptual)

use std::thread;
use std::sync::Arc;
use std::sync::atomic::{AtomicI32, Ordering};

// Assume 'shared_atomic_data' is being managed by your sanitizer.
// Accesses made only through atomic operations do not form data races,
// but your sanitizer might still observe them.

let shared_atomic_data = Arc::new(AtomicI32::new(0));
let mut handles = vec![];

for _ in 0..5 {
    let data = Arc::clone(&shared_atomic_data);
    handles.push(thread::spawn(move || {
        // This atomic increment is synchronized internally.
        // Your sanitizer might log this access but shouldn't report a data race.
        data.fetch_add(1, Ordering::SeqCst);
    }));
}

for handle in handles {
    handle.join().unwrap();
}

// The program should complete without panic if no other races exist.

Output:

(No output from the sanitizer, program completes successfully).

Explanation: Each thread uses an atomic operation to increment the shared counter. Atomic operations are designed to be thread-safe, so no data race is expected or reported.

Constraints

  • The sanitizer should be implemented purely in Rust.
  • Focus on detecting races on primitive types (like i32, u8, etc.) and simple data structures. You don't need to support arbitrary complex types for this challenge.
  • Performance overhead is a consideration, but correctness in detecting races is paramount. Aim for reasonable performance, not necessarily zero overhead.
  • You will need to define your own way of "instrumenting" the code. This might involve wrapping memory accesses in macros or custom functions.
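As one illustration of macro-based instrumentation, the sketch below wraps raw-pointer reads and writes in macros that also capture the source location of each access. The names san_read!, san_write!, on_access, and SITES are hypothetical, and on_access is a placeholder hook where a real sanitizer would do its race checking.

```rust
use std::sync::Mutex;

/// Instrumented access sites: (address, is_write, file, line). Hypothetical layout.
static SITES: Mutex<Vec<(usize, bool, &'static str, u32)>> = Mutex::new(Vec::new());

/// Placeholder hook; a real sanitizer would check for races here.
pub fn on_access(addr: usize, is_write: bool, file: &'static str, line: u32) {
    SITES.lock().unwrap().push((addr, is_write, file, line));
}

/// Reads through a raw pointer after notifying the hook.
/// Must be invoked inside an `unsafe` block.
macro_rules! san_read {
    ($ptr:expr) => {{
        let p = $ptr;
        on_access(p as usize, false, file!(), line!());
        *p
    }};
}

/// Writes through a raw pointer after notifying the hook.
/// Must be invoked inside an `unsafe` block.
macro_rules! san_write {
    ($ptr:expr, $value:expr) => {{
        let p = $ptr;
        on_access(p as usize, true, file!(), line!());
        *p = $value;
    }};
}
```

With these in scope, the body of Example 1 could be written as `let value = san_read!(data_ptr); san_write!(data_ptr, value + 1);` inside its unsafe block. Macros are used here rather than functions so that file!() and line!() expand at the call site.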

Notes

  • Consider using a HashMap or a similar data structure to keep track of memory accesses. The key could be the memory address, and the value could store information about active accesses (thread ID, access type, timestamp/order).
  • Thread IDs can be obtained using std::thread::current().id().
  • For simplicity, you can focus on detecting races on memory locations accessed via raw pointers (*mut T or *const T).
  • Think about how to represent the state of a memory location: unaccessed, read by thread X, written by thread Y, read by X and Y, etc.
  • This is a simplified model. Real-world thread sanitizers are far more complex and often involve compiler instrumentation. Your task is to simulate this behavior in Rust code.
  • You'll need a way to represent different types of memory access (read, write).
  • When a race is detected, a panic! is a suitable way to signal an error for this challenge.
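The notes on a HashMap keyed by address and on per-location state could be combined roughly as follows. This is a sketch under simplifying assumptions: Detector, LocationState, and Race are hypothetical names, every access to a location is treated as unordered with respect to accesses from other threads, and ordering established by thread::spawn or join is ignored, so it can flag accesses that a real thread sanitizer would accept.

```rust
use std::collections::{HashMap, HashSet};
use std::thread::ThreadId;

/// Per-address state: the threads that have read it and the thread that wrote it.
#[derive(Debug, Default)]
pub struct LocationState {
    readers: HashSet<ThreadId>,
    writer: Option<ThreadId>,
}

/// A conflicting pair of accesses to one address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Race {
    pub addr: usize,
    pub earlier: ThreadId,
    pub earlier_was_write: bool,
    pub later: ThreadId,
    pub later_is_write: bool,
}

/// Hypothetical detector keyed by memory address.
#[derive(Debug, Default)]
pub struct Detector {
    shadow: HashMap<usize, LocationState>,
}

impl Detector {
    /// A read conflicts with an earlier write by a different thread.
    pub fn on_read(&mut self, addr: usize, tid: ThreadId) -> Result<(), Race> {
        let state = self.shadow.entry(addr).or_default();
        if let Some(w) = state.writer {
            if w != tid {
                return Err(Race { addr, earlier: w, earlier_was_write: true,
                                  later: tid, later_is_write: false });
            }
        }
        state.readers.insert(tid);
        Ok(())
    }

    /// A write conflicts with any earlier access by a different thread.
    pub fn on_write(&mut self, addr: usize, tid: ThreadId) -> Result<(), Race> {
        let state = self.shadow.entry(addr).or_default();
        if let Some(w) = state.writer {
            if w != tid {
                return Err(Race { addr, earlier: w, earlier_was_write: true,
                                  later: tid, later_is_write: true });
            }
        }
        if let Some(&r) = state.readers.iter().find(|&&r| r != tid) {
            return Err(Race { addr, earlier: r, earlier_was_write: false,
                              later: tid, later_is_write: true });
        }
        state.writer = Some(tid);
        Ok(())
    }

    /// Forgets an address, for example when its memory is freed.
    pub fn release(&mut self, addr: usize) {
        self.shadow.remove(&addr);
    }
}
```

The detector takes the thread ID as a parameter instead of calling std::thread::current().id() itself, which keeps it easy to test from a single thread. An instrumentation layer would typically hold one Detector behind a Mutex and panic! when on_read or on_write returns Err.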