본문 바로가기
Data Science/SQL

[SQL/오답] 다중 조인 필터링과 조건부 집계 'IF(조건문,1,0)'로 취소율 계산하기 (LeetCode262 Hard)

by 에르모사 쩐뉴 2026. 2. 5.

[리드코드] Hard|262. Trips and Users;

https://leetcode.com/problems/trips-and-users/description/

1. Problem

2013-10-01부터 2013-10-03 사이의 일일 택시 호출 취소율을 구해야 한다.

  • 핵심 조건:
    1. 차단되지 않은(Unbanned) 유저의 기록만 포함할 것 (클라이언트와 드라이버 둘 다 'No'여야 함).
    2. 취소율 = (취소된 호출 수) / (차단되지 않은 유저의 전체 호출 수).
    3. 결과를 소수점 둘째 자리에서 반올림할 것.

2. Solution: 이중 조인을 이용한 클린 필터링

Users 테이블을 두 번 조인하여 각각 클라이언트와 드라이버의 상태를 검증한다.

3. Takeaway: 왜 이 풀이가 객관적으로 훌륭한가?

  • JOIN을 통한 사전 필터링: WHERE 절에서 IN (SELECT users_id FROM Users WHERE banned = 'No')를 두 번 쓰는 것보다, 조인 조건(ON)에 banned = 'No'를 직접 명시함으로써 데이터 결합 단계에서 불필요한 행을 미리 제거했다. 이는 성능과 가독성 면에서 매우 효율적인 선택이다.
  • SUM(IF(...)) vs AVG(IF(...)): 취소율을 구할 때 SUM(취소건) / COUNT(전체건) 방식을 사용하셨는데, 이는 비율 계산의 원리를 가장 명확하게 보여준다. (MySQL이라면 AVG(status != 'completed')로 더 축약할 수도 있다.)
  • 도메인 복잡성 해결: '클라이언트와 드라이버 모두'라는 까다로운 비즈니스 로직을 테이블 두 개를 각각 다른 별칭(u1, u2)으로 불러와 해결한 점이 Hard 문제를 풀이하는 핵심 포인트다.

Table: Trips

+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| id          | int      |
| client_id   | int      |
| driver_id   | int      |
| city_id     | int      |
| status      | enum     |
| request_at  | varchar  |     
+-------------+----------+
id is the primary key (column with unique values) for this table.
The table holds all taxi trips. Each trip has a unique id, while client_id and driver_id are foreign keys to the users_id at the Users table.
Status is an ENUM (category) type of ('completed', 'cancelled_by_driver', 'cancelled_by_client').

Table: Users

+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| users_id    | int      |
| banned      | enum     |
| role        | enum     |
+-------------+----------+
users_id is the primary key (column with unique values) for this table.
The table holds all users. Each user has a unique users_id, and role is an ENUM type of ('client', 'driver', 'partner').
banned is an ENUM (category) type of ('Yes', 'No').

The cancellation rate is computed by dividing the number of canceled (by client or driver) requests with unbanned users by the total number of requests with unbanned users on that day.

Write a solution to find the cancellation rate of requests with unbanned users (both client and driver must not be banned) each day between "2013-10-01" and "2013-10-03" with at least one trip. Round Cancellation Rate to two decimal points.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Trips table:
+----+-----------+-----------+---------+---------------------+------------+
| id | client_id | driver_id | city_id | status              | request_at |
+----+-----------+-----------+---------+---------------------+------------+
| 1  | 1         | 10        | 1       | completed           | 2013-10-01 |
| 2  | 2         | 11        | 1       | cancelled_by_driver | 2013-10-01 |
| 3  | 3         | 12        | 6       | completed           | 2013-10-01 |
| 4  | 4         | 13        | 6       | cancelled_by_client | 2013-10-01 |
| 5  | 1         | 10        | 1       | completed           | 2013-10-02 |
| 6  | 2         | 11        | 6       | completed           | 2013-10-02 |
| 7  | 3         | 12        | 6       | completed           | 2013-10-02 |
| 8  | 2         | 12        | 12      | completed           | 2013-10-03 |
| 9  | 3         | 10        | 12      | completed           | 2013-10-03 |
| 10 | 4         | 13        | 12      | cancelled_by_driver | 2013-10-03 |
+----+-----------+-----------+---------+---------------------+------------+
Users table:
+----------+--------+--------+
| users_id | banned | role   |
+----------+--------+--------+
| 1        | No     | client |
| 2        | Yes    | client |
| 3        | No     | client |
| 4        | No     | client |
| 10       | No     | driver |
| 11       | No     | driver |
| 12       | No     | driver |
| 13       | No     | driver |
+----------+--------+--------+
Output: 
+------------+-------------------+
| Day        | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33              |
| 2013-10-02 | 0.00              |
| 2013-10-03 | 0.50              |
+------------+-------------------+
Explanation: 
On 2013-10-01:
  - There were 4 requests in total, 2 of which were canceled.
  - However, the request with Id=2 was made by a banned client (User_Id=2), so it is ignored in the calculation.
  - Hence there are 3 unbanned requests in total, 1 of which was canceled.
  - The Cancellation Rate is (1 / 3) = 0.33
On 2013-10-02:
  - There were 3 requests in total, 0 of which were canceled.
  - The request with Id=6 was made by a banned client, so it is ignored.
  - Hence there are 2 unbanned requests in total, 0 of which were canceled.
  - The Cancellation Rate is (0 / 2) = 0.00
On 2013-10-03:
  - There were 3 requests in total, 1 of which was canceled.
  - The request with Id=8 was made by a banned client, so it is ignored.
  - Hence there are 2 unbanned request in total, 1 of which were canceled.
  - The Cancellation Rate is (1 / 2) = 0.50

 

정답 쿼리

SELECT 
    t.request_at AS Day,
    ROUND(SUM(IF(t.status != 'completed',1,0)) / COUNT(*), 2) AS 'Cancellation Rate'
FROM Trips t
JOIN Users u1 ON t.client_id = u1.users_id AND u1.banned = 'No'
JOIN Users u2 ON t.driver_id = u2.users_id AND u2.banned = 'No'
WHERE t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY t.request_at;