Implementing a Basic Pandas Series

A Pandas Series is a one-dimensional labeled array that can hold any data type, and it is one of the fundamental data structures for data analysis in Python. This challenge asks you to implement a simplified version of a Pandas Series to understand its core functionality. Completing it will solidify your understanding of data structures, indexing, and basic data manipulation.

Problem Description

You are tasked with creating a SimpleSeries class that mimics the core functionality of a Pandas Series. The class should be able to:

  1. Initialization: Accept a list-like object (e.g., list, tuple) as input to initialize the data. Also, accept an optional list-like object for the index. If no index is provided, a default integer index (0, 1, 2, ...) should be generated.
  2. Indexing/Selection: Allow access to elements using integer-based indexing (like a list) and label-based indexing (using the index provided during initialization). Both single element and slice selection should be supported.
  3. Length: Support the built-in len() function (by implementing __len__) so that it returns the number of elements in the Series.
  4. Data Type: The class should automatically determine the data type of the underlying data and store it.
  5. Error Handling: Raise an IndexError if an index is out of bounds during selection. Raise a TypeError if the input data is not list-like.
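
The initialization, length, and data type requirements above can be sketched as follows. This is a minimal sketch rather than a reference solution, and it makes several assumptions that the problem statement does not fix: "list-like" is taken to mean list or tuple, a length mismatch raises ValueError, the attribute names data, index, and dtype and the helper _infer_dtype are hypothetical, and mixed or empty data reports object as its type. Indexing is left out here.

```python
class SimpleSeries:
    """Minimal sketch: construction, default index, dtype, and len() only."""

    def __init__(self, data, index=None):
        # Assumption: "list-like" means list or tuple, as in the Constraints.
        if not isinstance(data, (list, tuple)):
            raise TypeError("data must be a list or tuple")
        if index is None:
            index = range(len(data))  # default integer index 0, 1, 2, ...
        elif not isinstance(index, (list, tuple)):
            raise TypeError("index must be a list or tuple")
        # Assumption: the statement does not name an exception for this case.
        if len(index) != len(data):
            raise ValueError("data and index must have the same length")
        self.data = list(data)
        self.index = list(index)
        self.dtype = self._infer_dtype(self.data)

    @staticmethod
    def _infer_dtype(values):
        # Assumption: one shared type is reported as-is; mixed or empty
        # data falls back to object.
        types = {type(v) for v in values}
        return types.pop() if len(types) == 1 else object

    def __len__(self):
        return len(self.data)


series = SimpleSeries((1.5, 2.5))
print(len(series), series.index, series.dtype)  # 2 [0, 1] <class 'float'>
```

Storing copies of the inputs as plain lists keeps the Series independent of the caller's objects, which makes later slicing straightforward.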

Examples

Example 1:

Input: data = [10, 20, 30, 40, 50], index = ['a', 'b', 'c', 'd', 'e']
Operation: series['b']
Output: 20
Explanation: The Series is initialized with data [10, 20, 30, 40, 50] and index ['a', 'b', 'c', 'd', 'e']. Accessing 'b' returns the corresponding value, 20.

Example 2:

Input: data = [1.1, 2.2, 3.3], index = [1, 2, 3]
Operation: series[1:3]
Output: SimpleSeries([2.2, 3.3], [2, 3])
Explanation: The Series is initialized with data [1.1, 2.2, 3.3] and index [1, 2, 3]. The slice is positional: it runs from position 1 (inclusive) to position 3 (exclusive) and returns a new SimpleSeries containing [2.2, 3.3] with the corresponding labels [2, 3].
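
One way to read this slice is positionally, as with a Python list. The helper below is a small sketch of that reading; positional_slice is a hypothetical name that is not part of the challenge, and it assumes the same slice is applied to the values and to the labels so that each value keeps its own label.

```python
def positional_slice(data, index, key):
    """Apply one slice to values and labels so they stay aligned."""
    if not isinstance(key, slice):
        raise TypeError("key must be a slice")
    # Slicing a list builds a new list, so the result is a copy, not a view.
    return list(data)[key], list(index)[key]


values, labels = positional_slice([1.1, 2.2, 3.3], [1, 2, 3], slice(1, 3))
print(values, labels)  # [2.2, 3.3] [2, 3]
```

Under this reading, the values at positions 1 and 2 carry the labels 2 and 3.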

Example 3:

Input: data = [1, 2, 3], index = ['x', 'y', 'z']
Operation: len(series)
Output: 3
Explanation: The Series is initialized with data [1, 2, 3] and index ['x', 'y', 'z']. Calling len(series) returns the number of elements, which is 3.

Example 4 (Edge Case):

Input: data = [10, 20], index = ['a', 'b']
Operation: series[5]
Output: IndexError: Index 5 is out of bounds.
Explanation: The Series has only two elements, labeled 'a' and 'b'. The key 5 matches neither label and is not a valid integer position, so the access raises an IndexError.
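
The lookup behind this edge case can be sketched as a standalone function. The name select is hypothetical, and two choices here are assumptions rather than requirements of the challenge: a key that matches a label takes precedence over an integer position, and a missing non-integer label raises KeyError.

```python
def select(data, index, key):
    """Return one element, trying the labels first and positions second."""
    # Assumption: a key that matches a label wins over a positional lookup.
    if key in index:
        return data[index.index(key)]
    if isinstance(key, int):
        if not -len(data) <= key < len(data):
            raise IndexError(f"Index {key} is out of bounds.")
        return data[key]
    # Assumption: the challenge does not say what a missing label raises.
    raise KeyError(key)


try:
    select([10, 20], ["a", "b"], 5)
except IndexError as exc:
    print(exc)  # Index 5 is out of bounds.
```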

Constraints

  • The input data must be a list-like object (list, tuple).
  • The input index (if provided) must also be a list-like object.
  • The length of data and index should be the same if an index is provided.
  • Indexing should support both positive and negative integer indices.
  • The class should handle different data types (int, float, string, etc.) correctly.
  • Performance is not a primary concern for this simplified implementation. Focus on correctness and clarity.
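
Support for negative integer indices can be sketched with a small helper that converts any valid position to its non-negative equivalent. The name normalize_position is hypothetical, and the error message simply follows the wording of Example 4.

```python
def normalize_position(position, length):
    """Map a possibly negative position onto the range 0..length-1."""
    if not -length <= position < length:
        raise IndexError(f"Index {position} is out of bounds.")
    # Negative positions count back from the end, as with Python lists.
    return position + length if position < 0 else position


print(normalize_position(-1, 5))  # 4
print(normalize_position(2, 5))   # 2
```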

Notes

  • You don't need to implement all the features of a full Pandas Series (e.g., mathematical operations, missing data handling). Focus on the core indexing and data storage aspects.
  • Duplicate index values need no special handling: for this simplified version, assume the index will be unique.
  • Think about how to represent the Series internally (e.g., using Python lists or dictionaries).
  • The slicing operation should return a new SimpleSeries object, not a view.
  • The __len__ method should be implemented.
  • The class should be named SimpleSeries.
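
One possible internal representation is sketched below: two lists for values and labels, plus a dictionary from label to position. This is an illustration under stated assumptions, not the expected solution. The attribute names _data, _index, and _positions are hypothetical, labels are assumed to be unique and hashable, and input validation and positional integer lookup are omitted to keep the sketch short.

```python
class SimpleSeries:
    """Sketch of one possible internal layout: two lists plus a lookup dict."""

    def __init__(self, data, index=None):
        self._data = list(data)
        self._index = list(range(len(self._data))) if index is None else list(index)
        # Assumption: labels are unique and hashable, so each maps to one position.
        self._positions = {label: i for i, label in enumerate(self._index)}

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # List slicing copies, so the result is a new object, not a view.
            return SimpleSeries(self._data[key], self._index[key])
        return self._data[self._positions[key]]  # label lookup only in this sketch

    def __repr__(self):
        return f"SimpleSeries({self._data!r}, {self._index!r})"


series = SimpleSeries([10, 20, 30], ["a", "b", "c"])
part = series[0:2]
part._data[0] = 99          # mutate the slice...
print(series["a"], part)    # 10 SimpleSeries([99, 20], ['a', 'b'])
```

Because the slice owns its own lists, changing it leaves the original Series untouched, which is the copy behavior the notes ask for.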