Concurrent Data Processing with Rust's Arc and Mutex

Rust's ownership and borrowing rules guarantee memory safety, but they make sharing mutable data across threads harder. This challenge asks you to build a simple concurrent data processor that uses Arc (Atomically Reference Counted) and Mutex to share and modify data safely across multiple threads. The goal is to demonstrate how to manage shared mutable state in a concurrent environment.

Problem Description

You are tasked with creating a concurrent data processor that receives a vector of integers and calculates the sum of squares of those integers. The processing should be divided among multiple threads to improve performance. The core of the challenge lies in safely sharing a mutable accumulator variable across these threads.

What needs to be achieved:

  1. Create a function process_data_concurrently that takes a vector of integers (Vec<i32>) and the number of threads (usize) as input.
  2. Divide the input vector into chunks, assigning each chunk to a separate thread.
  3. Each thread should calculate the sum of squares for its assigned chunk and add it to a shared accumulator.
  4. The shared accumulator must be protected by a Mutex to prevent data races.
  5. Wrap the Mutex in an Arc so that multiple threads can share ownership of it.
  6. Return the final sum of squares.
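
The steps above can be sketched as follows. This is one possible shape under stated assumptions, not a reference solution: the i64 return type is an assumption (the statement does not name one), and copying each chunk with to_vec is a simplification so that each thread owns its data.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Sums the squares of `data` using at most `num_threads` worker threads.
/// The i64 return type is an assumption; the problem statement does not specify one.
pub fn process_data_concurrently(data: Vec<i32>, num_threads: usize) -> i64 {
    if data.is_empty() {
        return 0;
    }

    // Treat zero threads as one worker, and never use more workers than elements.
    let workers = num_threads.max(1).min(data.len());
    // Ceiling division, so every element falls into some chunk.
    let chunk_size = (data.len() + workers - 1) / workers;

    // Shared accumulator: Arc for shared ownership, Mutex for exclusive access.
    let total = Arc::new(Mutex::new(0i64));
    let mut handles = Vec::with_capacity(workers);

    for chunk in data.chunks(chunk_size) {
        let chunk = chunk.to_vec(); // owned copy that can move into the thread
        let total = Arc::clone(&total);
        handles.push(thread::spawn(move || {
            // Compute the partial sum without holding the lock.
            let partial: i64 = chunk.iter().map(|&x| i64::from(x) * i64::from(x)).sum();
            // Lock once per thread to add the partial result.
            *total.lock().unwrap() += partial;
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}
```

Each thread takes the lock only once, after computing its partial sum, which keeps contention on the Mutex low.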

Key Requirements:

  • Concurrency: Utilize multiple threads to process the data.
  • Thread Safety: Ensure that the shared accumulator is accessed and modified safely using Arc and Mutex.
  • Correctness: The final sum of squares must be accurate.
  • Efficiency: While correctness is paramount, consider the efficiency of the thread division and synchronization.

Expected Behavior:

The process_data_concurrently function should return the correct sum of squares of all integers in the input vector, calculated concurrently across the specified number of threads.

Edge Cases to Consider:

  • Empty Input Vector: Handle the case where the input vector is empty gracefully (return 0).
  • Zero Threads: If the number of threads is zero (a usize cannot be negative), process the data sequentially (return the sum of squares calculated by a single thread).
  • Large Input Vector: Consider how the thread division affects performance with very large input vectors.
  • Number of Threads Exceeding Vector Length: If the number of threads is greater than the length of the vector, each element should be processed by its own thread, and the surplus threads should receive no work (or not be spawned at all).
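
One way to reason about these edge cases is to compute the chunk lengths up front. The helper below is hypothetical (chunk_lengths is not part of the challenge) and shows a balanced split in which chunk sizes differ by at most one element.

```rust
/// Returns the length of each chunk when `len` elements are split across
/// `num_threads` workers. Hypothetical helper, for illustration only.
fn chunk_lengths(len: usize, num_threads: usize) -> Vec<usize> {
    if len == 0 {
        return Vec::new(); // empty input: no work, no threads
    }
    // Zero threads falls back to one worker; surplus workers are dropped.
    let workers = num_threads.max(1).min(len);
    let base = len / workers;
    let extra = len % workers;
    // The first `extra` workers take one additional element each.
    (0..workers)
        .map(|i| base + usize::from(i < extra))
        .collect()
}
```

Under this scheme, 5 elements over 3 threads split as 2, 2, 1, and 3 elements over 10 threads use only 3 workers with one element each.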

Examples

Example 1:

Input: vec![1, 2, 3, 4], 2
Output: 30
Explanation: (1^2 + 2^2) + (3^2 + 4^2) = 1 + 4 + 9 + 16 = 30

Example 2:

Input: vec![1, 2, 3, 4, 5], 3
Output: 55
Explanation: (1^2) + (2^2 + 3^2) + (4^2 + 5^2) = 1 + 4 + 9 + 16 + 25 = 55

Example 3:

Input: vec![], 4
Output: 0
Explanation: Empty vector, so the sum of squares is 0.

Constraints

  • The input vector can contain up to 10,000 integers.
  • The number of threads will be between 1 and 100 (inclusive).
  • All integers in the input vector will be within the range of -1000 to 1000.
  • The function must complete within 1 second.
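
These constraints have a consequence for the accumulator type. The worst case is 10,000 × 1000² = 10,000,000,000, which exceeds i32::MAX (2,147,483,647). The sketch below uses two hypothetical helper functions to contrast a checked i32 accumulation, which detects the overflow, with an i64 accumulation, which has enough range.

```rust
/// Accumulates in i32 with checked arithmetic; returns None on overflow.
/// Hypothetical helper, for illustration only.
fn sum_of_squares_i32(data: &[i32]) -> Option<i32> {
    data.iter()
        .try_fold(0i32, |acc, &x| acc.checked_add(x.checked_mul(x)?))
}

/// Accumulates in i64, which is wide enough for the stated constraints.
/// Hypothetical helper, for illustration only.
fn sum_of_squares_i64(data: &[i32]) -> i64 {
    data.iter().map(|&x| i64::from(x) * i64::from(x)).sum()
}
```

With an input of 10,000 copies of 1000, the i32 version returns None and the i64 version returns 10,000,000,000, so a 64-bit accumulator is the safer choice here.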

Notes

  • The std::thread module provides the necessary tools for creating and managing threads.
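  • Arc gives multiple threads shared ownership of the same value. On its own it grants only immutable access, which is why it is paired with Mutex here.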
  • Mutex provides mutual exclusion, ensuring that only one thread can access the shared accumulator at a time.
  • Consider using the slice method chunks to divide the work evenly among the threads.
  • Think about how to handle potential panics within the threads and ensure that the accumulator remains consistent. (While not explicitly required for this challenge, it's good practice to consider.)
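
The last note can be illustrated with a minimal sketch. It is an assumption-laden example, not a required part of the solution: the function name, the pre-chunked input, and the fn-pointer parameter `square` are all hypothetical and exist only so that a panicking worker can be simulated. Panics are detected through the Err returned by join, and a poisoned Mutex is recovered with PoisonError::into_inner.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns (sum contributed by successful workers, number of panicked workers).
/// Hypothetical helper: `square` is injected so a failing worker can be simulated.
fn sum_with_panic_tracking(chunks: Vec<Vec<i32>>, square: fn(i32) -> i64) -> (i64, usize) {
    let total = Arc::new(Mutex::new(0i64));

    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // Any panic happens here, before the lock is taken, so the
                // accumulator is never left half-updated.
                let partial: i64 = chunk.iter().map(|&x| square(x)).sum();
                // Recover the guard even if another thread poisoned the mutex.
                let mut guard = total.lock().unwrap_or_else(|p| p.into_inner());
                *guard += partial;
            })
        })
        .collect();

    // join returns Err for a thread that panicked.
    let failed = handles
        .into_iter()
        .map(|handle| handle.join())
        .filter(|result| result.is_err())
        .count();

    let sum = *total.lock().unwrap_or_else(|p| p.into_inner());
    (sum, failed)
}
```

Because each worker computes its partial sum before locking, a panicking worker contributes nothing and the accumulator stays consistent. Whether to return a partial result or report an error in that case is a design decision left to the caller.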