通知 网站从因情语写改为晴雨,这个网站的模板也从calmlog_ex改为 whimurmur

leetcode数据库 系列题 报告的记录

364人浏览 / 0人评论 / | 作者:因情语写  | 分类: 数据库  | 标签: 数据库  /  leetcode  | 

作者:因情语写

链接:https://www.proprogrammar.com/article/682

声明:请尊重原作者的劳动,如需转载请注明出处


1113. 报告的记录

动作表:Actions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| post_id       | int     |
| action_date   | date    | 
| action        | enum    |
| extra         | varchar |
+---------------+---------+
此表没有主键,所以可能会有重复的行。
action 字段是 ENUM 类型的,包含:('view', 'like', 'reaction', 'comment', 'report', 'share')
extra 字段是可选的信息(可能为 null),其中的信息例如有:1.报告理由(a reason for report) 2.反应类型(a type of reaction)

编写一条SQL,查询每种 报告理由(report reason)在昨天的报告数量。假设今天是 2019-07-05

查询及结果的格式示例:

Actions table:
+---------+---------+-------------+--------+--------+
| user_id | post_id | action_date | action | extra  |
+---------+---------+-------------+--------+--------+
| 1       | 1       | 2019-07-01  | view   | null   |
| 1       | 1       | 2019-07-01  | like   | null   |
| 1       | 1       | 2019-07-01  | share  | null   |
| 2       | 4       | 2019-07-04  | view   | null   |
| 2       | 4       | 2019-07-04  | report | spam   |
| 3       | 4       | 2019-07-04  | view   | null   |
| 3       | 4       | 2019-07-04  | report | spam   |
| 4       | 3       | 2019-07-02  | view   | null   |
| 4       | 3       | 2019-07-02  | report | spam   |
| 5       | 2       | 2019-07-04  | view   | null   |
| 5       | 2       | 2019-07-04  | report | racism |
| 5       | 5       | 2019-07-04  | view   | null   |
| 5       | 5       | 2019-07-04  | report | racism |
+---------+---------+-------------+--------+--------+

Result table:
+---------------+--------------+
| report_reason | report_count |
+---------------+--------------+
| spam          | 1            |
| racism        | 2            |
+---------------+--------------+ 
注意,我们只关心报告数量非零的结果。
select extra report_reason, count(distinct post_id) report_count
from Actions
where action = 'report' and extra is not null and date_format(action_date, '%Y-%m-%d') = '2019-07-04'
group by extra

    学一下is not null,date_format,group+count+distinct

1132. 报告的记录 II

动作表: Actions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| post_id       | int     |
| action_date   | date    |
| action        | enum    |
| extra         | varchar |
+---------------+---------+
这张表没有主键,并有可能存在重复的行。
action 列的类型是 ENUM,可能的值为 ('view', 'like', 'reaction', 'comment', 'report', 'share')。
extra 列拥有一些可选信息,例如:报告理由(a reason for report)或反应类型(a type of reaction)等。

移除表: Removals

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| post_id       | int     |
| remove_date   | date    | 
+---------------+---------+
这张表的主键是 post_id。
这张表的每一行表示一个被移除的帖子,原因可能是由于被举报或被管理员审查。

编写一段 SQL 来查找:在被报告为垃圾广告的帖子中,被移除的帖子的每日平均占比,四舍五入到小数点后 2 位

查询结果的格式如下:

Actions table:
+---------+---------+-------------+--------+--------+
| user_id | post_id | action_date | action | extra  |
+---------+---------+-------------+--------+--------+
| 1       | 1       | 2019-07-01  | view   | null   |
| 1       | 1       | 2019-07-01  | like   | null   |
| 1       | 1       | 2019-07-01  | share  | null   |
| 2       | 2       | 2019-07-04  | view   | null   |
| 2       | 2       | 2019-07-04  | report | spam   |
| 3       | 4       | 2019-07-04  | view   | null   |
| 3       | 4       | 2019-07-04  | report | spam   |
| 4       | 3       | 2019-07-02  | view   | null   |
| 4       | 3       | 2019-07-02  | report | spam   |
| 5       | 2       | 2019-07-03  | view   | null   |
| 5       | 2       | 2019-07-03  | report | racism |
| 5       | 5       | 2019-07-03  | view   | null   |
| 5       | 5       | 2019-07-03  | report | racism |
+---------+---------+-------------+--------+--------+

Removals table:
+---------+-------------+
| post_id | remove_date |
+---------+-------------+
| 2       | 2019-07-20  |
| 3       | 2019-07-18  |
+---------+-------------+

Result table:
+-----------------------+
| average_daily_percent |
+-----------------------+
| 75.00                 |
+-----------------------+
2019-07-04 的垃圾广告移除率是 50%,因为有两张帖子被报告为垃圾广告,但只有一个得到移除。
2019-07-02 的垃圾广告移除率是 100%,因为有一张帖子被举报为垃圾广告并得到移除。
其余几天没有收到垃圾广告的举报,因此平均值为:(50 + 100) / 2 = 75%
注意,输出仅需要一个平均值即可,我们并不关注移除操作的日期。
select round(avg(p.per) * 100, 2) average_daily_percent
from(
    select count(distinct spam.post_id) / count(distinct s.post_id) per
    from Actions s left join (
        select distinct a.post_id
        from Actions a, Removals r
        where a.extra = 'spam' and a.post_id = r.post_id
    ) spam on s.post_id = spam.post_id
    where s.extra = 'spam'
    group by s.action_date
) p

先找出移除的垃圾贴子,再找出垃圾贴子,求每天的移除比率,再求平均比率,思路很清楚,学一下distinct, left join, group by + count + distinct, round用法,

再看另一种做法

SELECT round(SUM(delCount / spamCount * 100)  / COUNT(DISTINCT action_date), 2) AS average_daily_percent
FROM (
	SELECT action_date, COUNT(distinct a.post_id) AS spamCount, count(distinct b.post_id) AS delCount
	FROM Actions a
		LEFT JOIN Removals b ON a.post_id = b.post_id
    where a.extra = 'spam'
	GROUP BY a.action_date
) a

    上面是三层select嵌套,这里是两层,充分利用了left join的性质(右边可能不存在为null),从而一次连接从连接的左右求出两种想要的结果


亲爱的读者:有时间可以点赞评论一下

点赞(0) 打赏

全部评论

还没有评论!
广告位-帮帮忙点下广告