leetcode : 1341. Movie Rating
다이어그램
erDiagram
Movies {
int movie_id PK
varchar title
}
Users {
int user_id PK
varchar name UK
}
MovieRating {
int movie_id PK
int user_id PK
int rating
date created_at
}
MovieRating }|--|| Movies : "movie_id"
MovieRating }|--|| Users : "user_id"
목표
평점 가장 많이 매긴 사람 + 평점 가장 높은 영화 두 가지를 구하면 된다.
문제 풀이
MySQL
with feb as (
select *, avg(rating) as score
from movierating
where left(created_at,7) = '2020-02'
group by movie_id
),
rated as (
select *, count(*) as cnt
from movierating
group by user_id
),
name as (
select name as results
from users u
join rated r on r.user_id = u.user_id
where cnt = (select max(cnt) from rated)
order by name
limit 1
),
movie_name as (
select m.title as results
from feb f
join movies m on f.movie_id = m.movie_id
where score = (select max(score) from feb)
order by title
limit 1
)
select * from name
union all
select * from movie_name
- Solution 1
- CTE에 결과를 마구마구 저장하기.
- 평점같은 경우는 movie id로, 평점 매긴 사람은 user_id로 group by
- 각 테이블에서 서브쿼리로 최대 평가 수, 최대 평점을 가진 것을 가져온다.
- 컬럼명을 results로 맞춰주고 union all
- test case 18에 이름/영화이름이 동일한 경우가 있어서 중복허용 안하면 사라진다.
- Solution 2
WITH RATED_COUNT AS ( SELECT U.NAME AS results, ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, U.NAME) AS RN FROM MOVIERATING MR JOIN USERS U ON U.USER_ID = MR.USER_ID GROUP BY MR.USER_ID, U.NAME ),
RATED_SCORE AS (
SELECT M.TITLE AS results
FROM MOVIERATING MR
JOIN MOVIES M ON MR.MOVIE_ID = M.MOVIE_ID
WHERE MR.CREATED_AT BETWEEN '2020-02-01' AND '2020-02-29'
GROUP BY MR.MOVIE_ID
ORDER BY AVG(RATING) DESC, M.TITLE
)
SELECT results
FROM RATED_COUNT
WHERE RN = 1
UNION ALL
SELECT results
FROM RATED_SCORE
LIMIT 2
* Solution 2
* ROW NUMBER로 리뷰개수 1등을 가져온다.
* SCORE 집계 함수 테이블과 UNION을 해주고, LIMIT 2 해주면 이름/영화 1등이 나온다.
### **Pandas**
```python
def movie_rating(movies: pd.DataFrame, users: pd.DataFrame, movie_rating: pd.DataFrame) -> pd.DataFrame:
user_grouped = movie_rating.groupby('user_id').agg(
cnt = ('user_id','count')
).reset_index()
avg_score = movie_rating[movie_rating['created_at'].between('2020-02-01','2020-02-29')].groupby('movie_id').agg(
score = ('rating','mean')
).reset_index()
max_name = pd.merge(user_grouped, users)
max_movie = pd.merge(avg_score, movies)
max_name.sort_values(['cnt','name'], ascending=[0,1], inplace=True)
max_movie.sort_values(['score','title'], ascending=[0,1], inplace=True)
answer = pd.DataFrame({'results': [max_name['name'].values[0], max_movie['title'].values[0]]})
return answer
- Solution 1
- 마찬가지로 groupby + agg로 집계하기.
- merge를 통해 join해주고
- sort values에서 다중 조건으로 정렬하기
def movie_rating(movies: pd.DataFrame, users: pd.DataFrame, movie_rating: pd.DataFrame) -> pd.DataFrame:
rated_count = movie_rating.groupby('user_id').agg(
cnt = ('user_id','size')
).reset_index()
cond_feb = movie_rating['created_at'].between('2020-02-01','2020-02-29')
rated_score = movie_rating[cond_feb].groupby('movie_id').agg(
avg_score = ('rating','mean')
).reset_index()
top_user = pd.merge(rated_count, users).sort_values(by=['cnt','name'], ascending=[False, True]).head(1)
top_movie = pd.merge(rated_score, movies).sort_values(by=['avg_score','title'], ascending=[False, True]).head(1)
return pd.DataFrame({'results':[top_user['name'].values[0], top_movie['title'].values[0] ]})
- Solution 2
- 비슷한 방식으로 풀이.
- head로 맨 위 결과만 가져온다.
- vaelus에서 [0]을 안붙이면 np array라서 답은 같아보여도 오답으로 나온다.
코멘트
- .
'Data Analysis > Query' 카테고리의 다른 글
leetcode : 1407. Top Travellers (0) | 2025.02.01 |
---|---|
leetcode : 1393. Capital Gain/Loss (0) | 2025.01.31 |
leetcode : 1378. Replace Employee ID With The Unique Identifier (0) | 2025.01.31 |
leetcode : 1327. List the Products Ordered in a Period (0) | 2025.01.23 |
leetcode : 1321. Restaurant Growth (0) | 2025.01.22 |
leetcode : 1280. Students and Examinations (0) | 2025.01.21 |
leetcode : 1251. Average Selling Price (0) | 2025.01.20 |
댓글