Count Unique Subjects per Teacher
This challenge focuses on data aggregation and analysis. You will be given a dataset representing subject assignments to teachers. The goal is to determine, for each teacher, how many distinct subjects they are assigned to teach. This is a common task in educational administration and resource management.
Problem Description
You are provided with a list of records, where each record signifies that a particular teacher teaches a specific subject. Your task is to process this data and return a summary that shows, for every teacher, the total count of unique subjects they are responsible for.
Key Requirements:
- Process a given dataset of teacher-subject assignments.
- Identify all unique subjects taught by each individual teacher.
- Count the number of these unique subjects for each teacher.
- The output should clearly associate each teacher with their count of unique subjects.
Expected Behavior:
The output should be a collection (e.g., a list of pairs, a map/dictionary) where each entry represents a teacher and their corresponding count of unique subjects. Teachers who do not appear in the input dataset should not be included in the output. If a teacher teaches the same subject multiple times (represented by multiple entries in the input), it should only be counted once towards their unique subject count.
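The deduplication rule can be sketched in Python as follows. This is a minimal illustration rather than a reference solution: the function name count_unique_subjects is hypothetical, and returning a dictionary keyed by teacher_id is only one of the output shapes allowed above. The sample records mirror Example 1.

```python
from collections import Counter


def count_unique_subjects(records):
    """Map each teacher_id to its number of distinct subjects (illustrative name)."""
    # Collapsing records into a set of (teacher_id, subject) pairs drops repeats.
    unique_pairs = {(r["teacher_id"], r["subject"]) for r in records}
    # Each remaining pair contributes exactly one subject to its teacher.
    return dict(Counter(teacher_id for teacher_id, _ in unique_pairs))


records = [
    {"teacher_id": 1, "subject": "Math"},
    {"teacher_id": 1, "subject": "Physics"},
    {"teacher_id": 2, "subject": "Chemistry"},
    {"teacher_id": 1, "subject": "Math"},
]
print(count_unique_subjects(records))
```

Because the pairs pass through a set, the key order of the resulting dictionary is not guaranteed. Teachers absent from the input never appear as keys, and an empty input yields an empty dictionary.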
Edge Cases to Consider:
- An empty input dataset.
- A dataset where all teachers teach only one subject.
- A dataset where a single teacher teaches many different subjects.
- A dataset where multiple teachers teach the same set of subjects.
Examples
Example 1:
Input:
[
{"teacher_id": 1, "subject": "Math"},
{"teacher_id": 1, "subject": "Physics"},
{"teacher_id": 2, "subject": "Chemistry"},
{"teacher_id": 1, "subject": "Math"}
]
Output:
[
{"teacher_id": 1, "unique_subjects_count": 2},
{"teacher_id": 2, "unique_subjects_count": 1}
]
Explanation: Teacher 1 is assigned "Math" and "Physics". Although "Math" appears twice, it's only counted as one unique subject. Therefore, Teacher 1 teaches 2 unique subjects. Teacher 2 is assigned "Chemistry". They teach 1 unique subject.
Example 2:
Input:
[
{"teacher_id": 10, "subject": "History"},
{"teacher_id": 20, "subject": "Art"},
{"teacher_id": 10, "subject": "Geography"},
{"teacher_id": 30, "subject": "Music"},
{"teacher_id": 20, "subject": "Drama"},
{"teacher_id": 10, "subject": "History"},
{"teacher_id": 20, "subject": "Art"}
]
Output:
[
{"teacher_id": 10, "unique_subjects_count": 2},
{"teacher_id": 20, "unique_subjects_count": 2},
{"teacher_id": 30, "unique_subjects_count": 1}
]
Explanation: Teacher 10 teaches "History" and "Geography" (2 unique subjects). Teacher 20 teaches "Art" and "Drama" (2 unique subjects). Teacher 30 teaches "Music" (1 unique subject).
Example 3 (Empty Input):
Input:
[]
Output:
[]
Explanation: With no input records, there are no teachers to report on, so the output is an empty list.
Constraints
- The input will be a list of records, where each record is an object/dictionary with two keys: teacher_id (an integer) and subject (a string).
- teacher_id will be a positive integer.
- subject will be a non-empty string.
- The number of records in the input list can range from 0 to 100,000.
- The number of unique teacher_id values can range from 0 to 10,000.
- The number of unique subject strings can range from 0 to 10,000.
- Your solution should be reasonably efficient and handle large inputs without excessive time complexity.
Notes
Consider how you might group the assignments by teacher first. Then, for each group, you'll need a way to count only the distinct subjects. A common approach involves using data structures that can efficiently store and check for the presence of unique elements.
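One way to realize this hint is a dictionary that maps each teacher_id to a set of subjects. The sketch below assumes Python; the function name summarize_teachers is hypothetical, and listing teachers in order of first appearance is an assumption, since the problem statement does not specify an output order. The input is taken from Example 2.

```python
from collections import defaultdict


def summarize_teachers(records):
    """Return one {"teacher_id", "unique_subjects_count"} entry per teacher."""
    # Group subjects by teacher; a set ignores repeated subjects automatically.
    subjects_by_teacher = defaultdict(set)
    for record in records:
        subjects_by_teacher[record["teacher_id"]].add(record["subject"])

    # Dicts preserve insertion order, so teachers appear in first-seen order
    # (an assumption here; the task does not fix an ordering).
    return [
        {"teacher_id": teacher_id, "unique_subjects_count": len(subjects)}
        for teacher_id, subjects in subjects_by_teacher.items()
    ]


example_2 = [
    {"teacher_id": 10, "subject": "History"},
    {"teacher_id": 20, "subject": "Art"},
    {"teacher_id": 10, "subject": "Geography"},
    {"teacher_id": 30, "subject": "Music"},
    {"teacher_id": 20, "subject": "Drama"},
    {"teacher_id": 10, "subject": "History"},
    {"teacher_id": 20, "subject": "Art"},
]
print(summarize_teachers(example_2))
```

Each record costs one set insertion, which is constant time on average, so the running time grows roughly linearly with the number of records. That should sit comfortably within the stated limit of 100,000 records.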