leetcode : 1280. Students and Examinations
다이어그램
erDiagram
Students {
int student_id PK
varchar student_name
}
Subjects {
varchar subject_name PK
}
Examinations {
int student_id FK
varchar subject_name FK
}
Students ||--o{ Examinations: "has"
Subjects ||--o{ Examinations: "includes"
목표
각 학생의 과목 수강 횟수 (과목 리스트에 있는 과목 전부)
문제 풀이
MySQL
-- Solution 1
with temp as (
select s.student_id, s.student_name, sub.subject_name
from students s
cross join subjects sub
),
grouped as (
select student_id, subject_name, count(*) as attended_exams
from examinations
group by student_id, subject_name
)
select t.student_id, t.student_name, t.subject_name, coalesce(g.attended_exams, 0) as attended_exams
from temp t
left join grouped g on t.student_id = g.student_id and t.subject_name = g.subject_name
order by t.student_id, t.subject_name
-- Solution 2
WITH JOINED AS (
SELECT *
FROM STUDENTS AS ST
CROSS JOIN SUBJECTS AS SUB
)
SELECT J.*, COUNT(E.STUDENT_ID) AS ATTENDED_EXAMS
FROM JOINED J
LEFT JOIN EXAMINATIONS E
ON E.STUDENT_ID = J.STUDENT_ID
AND E.SUBJECT_NAME = J.SUBJECT_NAME
GROUP BY J.STUDENT_ID, J.SUBJECT_NAME
ORDER BY J.STUDENT_ID, J.SUBJECT_NAME
- Solution 1
- cross join을 한 테이블과 시험내역을 group by한 테이블을 left join한다.
- 학생 테이블은 모든 학생 정보를 담고 있고, 시험 테이블에서는 일부 학생만 있는데 결과 쿼리에서는 전체 학생을 원해서 cross join 이후, left join을 한다.
- Solution 2
- .
Pandas
# SOlution 1
def students_and_examinations(students: pd.DataFrame, subjects: pd.DataFrame, examinations: pd.DataFrame) -> pd.DataFrame:
temp = pd.MultiIndex.from_product([students['student_id'], subjects['subject_name']], names=['student_id', 'subject_name']).to_frame(index=False)
temp = temp.merge(students, on='student_id')
grouped = examinations.groupby(['student_id', 'subject_name']).size().reset_index(name='attended_exams')
answer = temp.merge(grouped, on=['student_id', 'subject_name'], how='left')
answer['attended_exams'].fillna(0, inplace = True)
answer.sort_values(by=['student_id', 'subject_name'], inplace=True)
return answer[['student_id','student_name','subject_name','attended_exams']]
# SOlution 2
def students_and_examinations(students: pd.DataFrame, subjects: pd.DataFrame, examinations: pd.DataFrame) -> pd.DataFrame:
joined = pd.merge(students, subjects, how='cross')
grouped = examinations.groupby(['student_id','subject_name']).agg(
attended_exams = ('subject_name','size')
).reset_index()
answer = pd.merge(joined, grouped, on=['student_id','subject_name'], how='left')
answer['attended_exams'] = answer['attended_exams'].fillna(0)
return answer.sort_values(by=['student_id','subject_name'])
- Solution 1
- chatgpt한테 물어보니까 multiindex 써야한다는데, 그냥 how = 'cross'가 있더라...
- Solution 2
- cross join + group + merge
코멘트
- cross join 했는데도 테이블 사이즈가 크지 않은거같아서 SQL기준 상위 15퍼정도 속도 나왔음.
- 애초에 테이블을 보면 정규화가 잘 된 케이스라서 자연스럽게 구하는 형태를 보고 cross join 방향으로 갈 수 있다.
- 나쁘다고만 생각하지 말기.
'Data Analysis > Query' 카테고리의 다른 글
leetcode : 1341. Movie Rating (0) | 2025.01.26 |
---|---|
leetcode : 1327. List the Products Ordered in a Period (0) | 2025.01.23 |
leetcode : 1321. Restaurant Growth (0) | 2025.01.22 |
leetcode : 1251. Average Selling Price (0) | 2025.01.20 |
leetcode : 1211. Queries Quality and Percentage (0) | 2025.01.19 |
leetcode : 1204. Last Person to Fit in the Bus (0) | 2025.01.18 |
leetcode : 1193. Monthly Transactions I (0) | 2025.01.17 |
댓글