Random Sampling Implementation in Python
Random sampling is a fundamental technique in data science and statistics, allowing you to select a subset of data points from a larger dataset without replacement. This is useful for creating representative samples for analysis, reducing computational cost when dealing with massive datasets, or splitting data into training and test sets. Your task is to implement a function that performs random sampling from a given list.
Problem Description
You are required to implement a function called random_sample that takes a list and a sample size as input and returns a new list containing a random sample of the specified size from the original list, without replacement. The function should utilize the random module in Python to ensure randomness.
Key Requirements:
- The function must accept two arguments:
  - data: A list whose elements may be of any type.
  - sample_size: An integer representing the number of elements to sample.
- The function must return a new list containing the randomly selected elements from the input list.
- The sampling must be done without replacement, meaning an element can only appear once in the sample.
- The order of elements in the returned sample list should be randomized.
- The function should handle edge cases gracefully (see below).
Expected Behavior:
The function should return a list of sample_size elements chosen randomly from the input data list. The function must not modify the original data list.
Edge Cases to Consider:
- sample_size is 0: Return an empty list.
- sample_size is greater than the length of data: Return a copy of the entire data list (as it's impossible to sample more elements than exist).
- data is an empty list: Return an empty list.
- sample_size is negative: Raise a ValueError with a descriptive message.
- sample_size is not an integer: Raise a TypeError with a descriptive message.
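The edge cases listed here can be sketched as follows. This is one possible reading, not a reference solution. It assumes random.sample() from the standard library, rejects bool values even though bool is a subclass of int in Python, and returns the oversized case in shuffled order so that it stays consistent with the randomized-order requirement. The error messages are illustrative.

```python
import random


def random_sample(data, sample_size):
    """Return a new list of up to sample_size elements drawn without replacement."""
    # Assumption: bool is rejected even though it subclasses int.
    if isinstance(sample_size, bool) or not isinstance(sample_size, int):
        raise TypeError(
            f"sample_size must be an integer, got {type(sample_size).__name__}"
        )
    if sample_size < 0:
        raise ValueError(f"sample_size must be non-negative, got {sample_size}")

    # Clamping to the list length covers both an empty list and an
    # oversized request. Assumption: the oversized case is returned shuffled.
    k = min(sample_size, len(data))

    # random.sample builds a new list and leaves data untouched.
    return random.sample(data, k)
```

Checking the type before the value means a call such as random_sample(data, "3") fails with TypeError rather than with a confusing comparison error.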
Examples
Example 1:
Input: data = [1, 2, 3, 4, 5], sample_size = 3
Output: [3, 1, 2] (or any other combination of 3 unique elements)
Explanation: The function randomly selects 3 unique elements from the list [1, 2, 3, 4, 5]. The order may vary.
Example 2:
Input: data = ['a', 'b', 'c'], sample_size = 1
Output: ['b'] (or ['a'] or ['c'])
Explanation: The function randomly selects 1 element from the list ['a', 'b', 'c'].
Example 3:
Input: data = [10, 20, 30, 40], sample_size = 4
Output: [40, 10, 20, 30] (or any other permutation of the original list)
Explanation: Since the sample size is equal to the length of the data, the function returns a shuffled copy of the original list.
Example 4:
Input: data = [1, 2, 3], sample_size = 0
Output: []
Explanation: An empty list is returned as the sample size is 0.
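Because the outputs in these examples are random, a solution cannot be checked by comparing against one fixed list. A rough sketch of a property check is shown here instead. The helper name is_valid_sample is hypothetical, and rng.sample stands in for a call to your own random_sample.

```python
import random


def is_valid_sample(data, sample, sample_size):
    """Check the properties any correct output must have (hypothetical helper)."""
    expected_len = min(sample_size, len(data))
    if len(sample) != expected_len:
        return False
    # Match every sampled element to a distinct position in data, so that
    # duplicates in data are allowed but reusing one position is not.
    remaining = list(data)
    for item in sample:
        if item not in remaining:
            return False
        remaining.remove(item)
    return True


rng = random.Random(0)  # seeded only so the demonstration is repeatable
data = [1, 2, 3, 4, 5]
sample = rng.sample(data, 3)  # stand-in for random_sample(data, 3)
print(is_valid_sample(data, sample, 3))  # prints True
```

The check works on a copy of data, so it also leaves the original list untouched.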
Constraints
- data will be a list.
- sample_size will be an integer.
- The length of data can be any non-negative integer.
- The function must not modify the original data list.
- The function should be reasonably efficient for lists of up to 10,000 elements. While performance is not the primary focus, avoid excessively inefficient algorithms.
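To illustrate the efficiency and no-modification constraints, here is a sketch of a partial Fisher-Yates shuffle performed on a copy of the list. It is an alternative technique shown for comparison, not something the task prescribes. The name partial_shuffle_sample is hypothetical, and the sketch assumes sample_size has already been validated as a non-negative integer.

```python
import random


def partial_shuffle_sample(data, sample_size):
    """Sample without replacement via a partial Fisher-Yates shuffle (sketch)."""
    pool = list(data)        # work on a copy so data is never modified
    n = len(pool)
    k = min(sample_size, n)  # assumes sample_size is a non-negative integer
    for i in range(k):
        # Pick a uniformly random index from the not-yet-chosen tail
        # and swap that element into position i.
        j = random.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]
```

The cost is one copy of the list plus k swaps. An approach that repeatedly draws a random element and rejects duplicates with a list membership test does far more work as sample_size approaches the length of data.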
Notes
Consider using the random.sample() function from the Python random module. It is designed for random sampling without replacement, returns a new list, and leaves the input unchanged. Note that random.sample() itself raises a ValueError when the requested size is negative or larger than the input, so the oversized case must be handled before calling it. Remember to handle the other edge cases described above as well, and pay close attention to the type and value of sample_size to prevent unexpected errors.
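The short demonstration here shows how random.sample() behaves on its own, which explains why the oversized case needs separate handling. The variable names and values are illustrative only.

```python
import random

data = [10, 20, 30, 40]

# random.sample returns a new list; data itself is left unchanged.
picked = random.sample(data, 2)
print(len(picked), data)  # prints: 2 [10, 20, 30, 40]

# Asking for more elements than exist raises ValueError ...
try:
    random.sample(data, 5)
except ValueError as exc:
    print("ValueError:", exc)

# ... so clamp the size first if the whole list should be returned instead.
requested = 5
whole = random.sample(data, min(requested, len(data)))
print(sorted(whole))  # prints: [10, 20, 30, 40]
```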