[Pandas] 리트코드: 184. Department Highest Salary
Department Highest Salary - LeetCode
문제
Table: Employee
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| id | int |
| name | varchar |
| salary | int |
| departmentId | int |
+--------------+---------+
id is the primary key (column with unique values) for this table.
departmentId is a foreign key (reference columns) of the ID from the Department table.
Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department.
Table: Department
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| name | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table. It is guaranteed that department name is not NULL.
Each row of this table indicates the ID of a department and its name.
Write a solution to find employees who have the highest salary in each of the departments.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Employee table:
+----+-------+--------+--------------+
| id | name | salary | departmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+
Department table:
+----+-------+
| id | name |
+----+-------+
| 1 | IT |
| 2 | Sales |
+----+-------+
Output:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
| IT | Max | 90000 |
+------------+----------+--------+
Explanation: Max and Jim both have the highest salary in the IT department and Henry has the highest salary in the Sales department.
문제풀이
def department_highest_salary(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame:
employee['rank'] = employee.groupby('departmentId')['salary'].rank(method='dense', ascending=False)
high_salary_employee = employee[employee['rank'] == 1]
answer = pd.merge(high_salary_employee, department,
left_on='departmentId', right_on='id',
suffixes=('_high_salary_employee', '_department'))
answer.rename(columns = {'name_department':'Department','name_high_salary_employee':'Employee', 'salary':'Salary'}, inplace=True)
return answer[['Department','Employee','Salary']]
- groupby 이후 rank로 그룹 별 급여 순위를 매겨준다.
- 1위만 먼저 뽑는다. (조인 이후에 1위를 뽑으면 연산이 조금 더 걸릴 듯?)
- merge를 통해서 join을 한다. 여기선 colunms 이름이 다르니 on 대신 left on, right on을 사용해야한다.
- suffixes를 통해 각 테이블이 어떤 테이블에서 왔는지 명시해준다. (안 쓸 경우 _x, _y가 붙는다.)
- rename을 통해 컬럼명을 바꿔주고 정답 출력하기
코멘트
- agg같은 집계함수를 통해 푸는게 정석이긴 한데 rank가 더 편한거같음
'Data Analysis > Query' 카테고리의 다른 글
리트코드 : 181. Employees Earning More Than Their Managers (0) | 2024.03.27 |
---|---|
리트코드 : 196. Delete Duplicate Emails (0) | 2024.03.26 |
185. Department Top Three Salaries (0) | 2024.03.26 |
리트코드 : 180. Consecutive Numbers (0) | 2024.03.25 |
리트코드 : 178. Rank Scores (0) | 2024.03.25 |
리트코드 : 177. Nth Highest Salary (0) | 2024.03.24 |
리트코드 : 176. Second Highest Salary (0) | 2024.03.24 |
댓글