본문 바로가기
Data Analysis/Query

리트코드 : 1907. Count Salary Categories

by 베짱이28호 2024. 9. 15.

리트코드 : 1907. Count Salary Categories


문제

Table: Accounts

+-------------+------+
| Column Name | Type |
+-------------+------+
| account_id  | int  |
| income      | int  |
+-------------+------+
account_id is the primary key (column with unique values) for this table.
Each row contains information about the monthly income for one bank account.


Write a solution to calculate the number of bank accounts for each salary category. The salary categories are:

"Low Salary": All the salaries strictly less than $20000.
"Average Salary": All the salaries in the inclusive range [$20000, $50000].
"High Salary": All the salaries strictly greater than $50000.
The result table must contain all three categories. If there are no accounts in a category, return 0.

Return the result table in any order.

The result format is in the following example.



Example 1:

Input: 
Accounts table:
+------------+--------+
| account_id | income |
+------------+--------+
| 3          | 108939 |
| 2          | 12747  |
| 8          | 87709  |
| 6          | 91796  |
+------------+--------+
Output: 
+----------------+----------------+
| category       | accounts_count |
+----------------+----------------+
| Low Salary     | 1              |
| Average Salary | 0              |
| High Salary    | 3              |
+----------------+----------------+
Explanation: 
Low Salary: Account 2.
Average Salary: No accounts.
High Salary: Accounts 3, 6, and 8.
  • 각 소득그룹 별 몇 명 있는지 확인하기기

문제 풀이

MySQL

WITH TEMP2 AS (
    SELECT *,
           CASE
               WHEN INCOME > 50000 THEN 'High Salary'
               WHEN INCOME < 20000 THEN 'Low Salary'
               ELSE 'Average Salary'
           END AS SALARY_CATEGORY
    FROM (
        SELECT * FROM ACCOUNTS
        UNION ALL
        SELECT 0,0
        UNION ALL
        SELECT 0, 20000
        UNION ALL
        SELECT 0, 50001
    ) AS TEMP1
)

SELECT SALARY_CATEGORY AS 'category', COUNT(*)-1 AS 'accounts_count'
FROM TEMP2
GROUP BY SALARY_CATEGORY
  • 소둑 구간 표본이 없는 경우 GROUP BY에 없어서 더미 데이터를 추가한다.

Pandas

import pandas as pd

def count_salary_categories(accounts: pd.DataFrame) -> pd.DataFrame:

    accounts = pd.concat([accounts,
                        pd.DataFrame({'account_id': [0,0,0], 'income': [0,20000,50001]})],
                       ignore_index=True)

    accounts['category'] = pd.cut(accounts['income'], 
                            bins=[-np.inf, 20000, 50001, np.inf],
                            labels=['Low Salary', 'Average Salary', 'High Salary'], right=False)

    grouped = accounts.groupby('category').agg(
        accounts_count=('category', lambda x: x.size-1)
    ).reset_index()
    return grouped
  • 같은 방법으로 풀이.

코멘트

  • 더미데이터고 뭐고 그냥 컬럼 설정 미리 할당해놓고 count로 하는게 제일 빠른 듯

댓글