Implementing a Basic Pandas Series
Pandas Series are fundamental data structures in Python for data analysis, providing a one-dimensional labeled array capable of holding any data type. This challenge asks you to implement a simplified version of a Pandas Series to understand its core functionality. Successfully completing this challenge will solidify your understanding of data structures, indexing, and basic data manipulation.
Problem Description
You are tasked with creating a SimpleSeries class that mimics the core functionality of a Pandas Series. The class should be able to:
- Initialization: Accept a list-like object (e.g., list, tuple) as input to initialize the data. Also, accept an optional list-like object for the index. If no index is provided, a default integer index (0, 1, 2, ...) should be generated.
- Indexing/Selection: Allow access to elements using integer-based indexing (like a list) and label-based indexing (using the index provided during initialization). Both single element and slice selection should be supported.
- Length: Provide a len() method that returns the number of elements in the Series.
- Data Type: The class should automatically determine the data type of the underlying data and store it.
- Error Handling: Raise an IndexError if an index is out of bounds during selection. Raise a TypeError if the input data is not list-like.
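The initialization, length, and data type requirements above can be sketched as follows. This is a minimal starting point, not a reference solution. The attribute names data, index, and dtype are assumptions, as are the choice of ValueError for a length mismatch and the rule that mixed or empty data gets dtype object. "List-like" is taken to mean list or tuple.

```python
class SimpleSeries:
    """Minimal sketch: stores data and labels in two parallel lists."""

    def __init__(self, data, index=None):
        # Assumption: "list-like" means list or tuple.
        if not isinstance(data, (list, tuple)):
            raise TypeError("data must be a list or tuple")
        if index is None:
            index = range(len(data))  # default labels 0, 1, 2, ...
        elif not isinstance(index, (list, tuple)):
            raise TypeError("index must be a list or tuple")
        # Assumption: a length mismatch is reported as ValueError.
        if len(index) != len(data):
            raise ValueError("data and index must have the same length")
        self.data = list(data)
        self.index = list(index)
        # Assumption: dtype is the single Python type shared by all
        # elements, or object when the data is mixed or empty.
        types = {type(v) for v in self.data}
        self.dtype = types.pop() if len(types) == 1 else object

    def __len__(self):
        return len(self.data)


s = SimpleSeries([1.1, 2.2, 3.3])
print(len(s), s.index, s.dtype)  # 3 [0, 1, 2] <class 'float'>
```

Indexing is left out of this sketch on purpose, since it is the main part of the challenge.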
Examples
Example 1:
Input: data = [10, 20, 30, 40, 50], index = ['a', 'b', 'c', 'd', 'e']
Operation: series['b']
Output: 20
Explanation: The Series is initialized with data [10, 20, 30, 40, 50] and index ['a', 'b', 'c', 'd', 'e']. Accessing 'b' returns the corresponding value, 20.
Example 2:
Input: data = [1.1, 2.2, 3.3], index = [1, 2, 3]
Operation: series[1:3]
Output: SimpleSeries([2.2, 3.3], [2, 3])
Explanation: The Series is initialized with data [1.1, 2.2, 3.3] and index [1, 2, 3]. A slice is positional, so series[1:3] selects positions 1 (inclusive) to 3 (exclusive) and returns a new SimpleSeries containing [2.2, 3.3] with the corresponding labels [2, 3].
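Slice selection can be sketched as a standalone helper working on two parallel lists. The name slice_series is hypothetical, and the sketch assumes that a slice always refers to positions, never to labels. String labels are used here so that positions and labels cannot be confused.

```python
def slice_series(data, index, key):
    """Return new (data, index) lists for a positional slice.

    Hypothetical helper: assumes slices always refer to positions.
    """
    if not isinstance(key, slice):
        raise TypeError("key must be a slice")
    # List slicing copies, so the result shares no storage with the inputs.
    return data[key], index[key]


data = [10, 20, 30, 40, 50]
index = ["a", "b", "c", "d", "e"]

new_data, new_index = slice_series(data, index, slice(1, 3))
print(new_data, new_index)  # [20, 30] ['b', 'c']

new_data[0] = 99            # mutating the copy...
print(data[1])              # ...leaves the original at 20
```

In a class, the same logic would sit in __getitem__ and wrap the two lists in a new SimpleSeries before returning them.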
Example 3:
Input: data = [1, 2, 3], index = ['x', 'y', 'z']
Operation: len(series)
Output: 3
Explanation: The Series is initialized with data [1, 2, 3] and index ['x', 'y', 'z']. Calling len() returns the number of elements, which is 3.
Example 4 (Edge Case):
Input: data = [10, 20], index = ['a', 'b']
Operation: series[5]
Output: IndexError: Index 5 is out of bounds.
Explanation: The Series has only the labels 'a' and 'b' and two positions. The key 5 is neither a label nor a valid position, so the access raises an IndexError.
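One possible way to resolve a single key is sketched below. The helper name lookup is hypothetical. The order (label first, then integer position) is an assumption, since the problem statement does not say which wins when integer labels and positions overlap. Raising IndexError for an unknown label is also an assumption, chosen because IndexError is the only lookup error the statement names.

```python
def lookup(data, index, key):
    """Resolve a single key: try it as a label, then as an integer position.

    Hypothetical helper; the label-first order is an assumption.
    """
    if key in index:
        return data[index.index(key)]
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(key, int) and not isinstance(key, bool):
        if -len(data) <= key < len(data):
            return data[key]  # negative positions count from the end
    raise IndexError(f"Index {key} is out of bounds.")


data, index = [10, 20], ["a", "b"]
print(lookup(data, index, "b"))   # 20
print(lookup(data, index, -1))    # 20
try:
    lookup(data, index, 5)
except IndexError as exc:
    print(exc)                    # Index 5 is out of bounds.
```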
Constraints
- The input data must be a list-like object (list, tuple).
- The input index (if provided) must also be a list-like object.
- The length of data and index should be the same if an index is provided.
- Indexing should support both positive and negative integer indices.
- The class should handle different data types (int, float, string, etc.) correctly.
- Performance is not a primary concern for this simplified implementation. Focus on correctness and clarity.
Notes
- You don't need to implement all the features of a full Pandas Series (e.g., mathematical operations, missing data handling). Focus on the core indexing and data storage aspects.
- Consider how to handle cases where the provided index contains duplicate values (for this simplified version, assume the index will be unique).
- Think about how to represent the Series internally (e.g., using Python lists or dictionaries).
- The slicing operation should return a new SimpleSeries object, not a view.
- The __len__ method should be implemented.
- The class should be named SimpleSeries.
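As one option for the internal representation, a dictionary from label to position can sit next to the data list so that label lookups do not scan the index. The sketch below uses a hypothetical helper, build_label_map. It assumes labels are hashable, and it raises ValueError on duplicates, which is an assumption because the notes only say the index can be taken as unique.

```python
def build_label_map(index):
    """Map each label to its position, rejecting duplicates.

    Hypothetical helper; raising ValueError on duplicates is an assumption.
    """
    positions = {}
    for pos, label in enumerate(index):
        if label in positions:
            raise ValueError(f"duplicate index label: {label!r}")
        positions[label] = pos
    return positions


labels = build_label_map(["x", "y", "z"])
print(labels["y"])  # 1
```

Keeping the original index list alongside this map preserves label order, which slicing needs.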