리트코드 : 1661. Average Time of Process per Machine

[link]

문제

Table: Activity

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| machine_id     | int     |
| process_id     | int     |
| activity_type  | enum    |
| timestamp      | float   |
+----------------+---------+
The table shows the user activities for a factory website.
(machine_id, process_id, activity_type) is the primary key (combination of columns with unique values) of this table.
machine_id is the ID of a machine.
process_id is the ID of a process running on the machine with ID machine_id.
activity_type is an ENUM (category) of type ('start', 'end').
timestamp is a float representing the current time in seconds.
'start' means the machine starts the process at the given timestamp and 'end' means the machine ends the process at the given timestamp.
The 'start' timestamp will always be before the 'end' timestamp for every (machine_id, process_id) pair.


There is a factory website that has several machines each running the same number of processes. Write a solution to find the average time each machine takes to complete a process.

The time to complete a process is the 'end' timestamp minus the 'start' timestamp. The average time is calculated by the total time to complete every process on the machine divided by the number of processes that were run.

The resulting table should have the machine_id along with the average time as processing_time, which should be rounded to 3 decimal places.

Return the result table in any order.

The result format is in the following example.



Example 1:

Input: 
Activity table:
+------------+------------+---------------+-----------+
| machine_id | process_id | activity_type | timestamp |
+------------+------------+---------------+-----------+
| 0          | 0          | start         | 0.712     |
| 0          | 0          | end           | 1.520     |
| 0          | 1          | start         | 3.140     |
| 0          | 1          | end           | 4.120     |
| 1          | 0          | start         | 0.550     |
| 1          | 0          | end           | 1.550     |
| 1          | 1          | start         | 0.430     |
| 1          | 1          | end           | 1.420     |
| 2          | 0          | start         | 4.100     |
| 2          | 0          | end           | 4.512     |
| 2          | 1          | start         | 2.500     |
| 2          | 1          | end           | 5.000     |
+------------+------------+---------------+-----------+
Output: 
+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0          | 0.894           |
| 1          | 0.995           |
| 2          | 1.456           |
+------------+-----------------+
Explanation: 
There are 3 machines running 2 processes each.
Machine 0's average time is ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
Machine 1's average time is ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
Machine 2's average time is ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456

머신 별 가동 시간의 평균

문제 풀이

MySQL

WITH START AS (
    SELECT *, SUM(TIMESTAMP) AS SS, COUNT(*) AS SCNT
    FROM ACTIVITY
    WHERE ACTIVITY_TYPE = 'START'
    GROUP BY MACHINE_ID, ACTIVITY_TYPE
),
END AS (
    SELECT *, SUM(TIMESTAMP) AS ES,  COUNT(*) AS ECNT
    FROM ACTIVITY
    WHERE ACTIVITY_TYPE = 'END'
    GROUP BY MACHINE_ID, ACTIVITY_TYPE
)

SELECT S.MACHINE_ID, ROUND((ES-SS)/SCNT,3) AS PROCESSING_TIME
FROM START S
JOIN END E ON S.MACHINE_ID = E.MACHINE_ID

좀 이상하게 짠거 같지만...
시작 시간, 종료 시간 별로 TABLE을 나눈 후, GROUP BY로 시간 합을 구한다.
JOIN 후, (끝난 시간 합 - 시작 시간)/그룹 별 개수에 반올림해주면 된다.

Pandas

import pandas as pd

def get_average_time(activity: pd.DataFrame) -> pd.DataFrame:
    activity['log'] = np.where(activity['activity_type']=='start',-activity['timestamp'],activity['timestamp'])
    grouped = activity.groupby('machine_id').agg(
        half_processing_time = ('log', 'mean')
    ).reset_index()
    grouped['processing_time'] = round(2*grouped['half_processing_time'],3)
    return grouped[['machine_id','processing_time']]

다시 보니, start는 음수 붙여서 모두 빼고, end에서는 합으로 모두 더한다.
행 개수만 2배가 됐기 때문에 조정해주면 된다.
np.where로 컬럼 하나 만들어주고 group by + mean으로 값 할당해서 풀이

코멘트

'Data Analysis > Query' 카테고리의 다른 글

리트코드 : Daily Leads and Partners (0)	2024.09.05
리트코드 : 1667. Fix Names in a Table (0)	2024.09.03
리트코드 : 1683. Invalid Tweets (0)	2024.09.03
리트코드 : 1484. Group Sold Products By The Date (0)	2024.09.02
리트코드 : 1164. Product Price at a Given Date (0)	2024.09.02
리트코드 : 1633. Percentage of Users Attended a Contest (0)	2024.08.31
리트코드 : 1587. Bank Account Summary II (0)	2024.08.31

베짱이의 작업공간

리트코드 : 1661. Average Time of Process per Machine

리트코드 : 1661. Average Time of Process per Machine

문제

문제 풀이

MySQL

Pandas

코멘트

'Data Analysis > Query' 카테고리의 다른 글

댓글

티스토리툴바

리트코드 : 1661. Average Time of Process per Machine

리트코드 : 1661. Average Time of Process per Machine

문제

문제 풀이

MySQL

Pandas

코멘트

'Data Analysis > Query' 카테고리의 다른 글

관련글

댓글

티스토리툴바