1. I used to compute the overall PV and UV together with the per-type PV and UV like this:
SELECT a.type, a.pv, a.uv
FROM (
  SELECT type,
         COUNT(1) AS pv,
         COUNT(DISTINCT uid) AS uv
  FROM t1
  WHERE dt = '201410129'
    AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
  GROUP BY type
  UNION ALL
  SELECT 'all' AS type,
         COUNT(1) AS pv,
         COUNT(DISTINCT uid) AS uv
  FROM t1
  WHERE dt = '201410129'
    AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
) a
Note: DISTINCT is convenient to write, but COUNT(DISTINCT ...) is really inefficient on large data. My advice is to avoid it whenever you can.
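The usual way to drop DISTINCT is to deduplicate with GROUP BY in a subquery and then count the rows that remain. The sketch below shows the pattern on the same table t1, with the WHERE filter left out for brevity. The performance explanation is an assumption rather than something the measurements here establish: in Hive, a COUNT(DISTINCT ...) with no GROUP BY is commonly reported to be handled by a single reducer, while the GROUP BY form can be spread across many.

```sql
-- Before: all uid values are funneled into one distinct count.
SELECT COUNT(DISTINCT uid) AS uv
FROM t1;

-- After: GROUP BY removes duplicate uids first, then the outer query counts rows.
-- COUNT(uid) skips NULL uids, just as COUNT(DISTINCT uid) does.
SELECT COUNT(uid) AS uv
FROM (
  SELECT uid
  FROM t1
  GROUP BY uid
) u;
```

Both statements should return the same number; only the execution plan differs.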
2. The statement can then be rewritten as:
SELECT a.type, SUM(pv), COUNT(uid)
FROM (
  SELECT type,
         COUNT(1) AS pv,
         uid
  FROM t1
  WHERE dt = '201410129'
    AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
  GROUP BY uid, type
  UNION ALL
  SELECT 'all' AS type,
         COUNT(1) AS pv,
         uid
  FROM t1
  WHERE dt = '201410129'
    AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
  GROUP BY uid
) a
GROUP BY type
This is somewhat faster, and I kept using it for quite a while, but it never felt right: it did not seem to take real advantage of UNION ALL.
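3. Only today did I realize that the GROUP BY should not go inside the UNION ALL branches. It seriously hurts performance, and written as above the query also launches more jobs. So I changed it right away: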
SELECT type, SUM(pv), COUNT(uid)
FROM (
  -- The alias on SUM(pv) is needed so that the outer SUM(pv) can resolve the column.
  SELECT a.type, SUM(pv) AS pv, uid
  FROM (
    SELECT type, 1 AS pv, uid
    FROM t1
    WHERE dt = '201410129'
      AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
    UNION ALL
    SELECT 'all' AS type, 1 AS pv, uid
    FROM t1
    WHERE dt = '201410129'
      AND req_url LIKE 'mbloglist?domain=100808&ajwvr=6%'
  ) a
  GROUP BY uid, type
) b
GROUP BY type
In my tests, this version was indeed much faster.
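To see that the step 3 shape still produces the right numbers, it can be run against a few hand-made rows. This is only a sketch: the CTE name sample_rows, the type values 'news' and 'video', and the uids are made up for illustration, the dt and req_url filters are dropped, and it assumes a Hive version that accepts CTEs and SELECT without FROM.

```sql
-- Hypothetical toy data: u1 views 'news' twice and 'video' once, u2 views 'news' once.
WITH sample_rows AS (
  SELECT 'news' AS type, 'u1' AS uid
  UNION ALL SELECT 'news', 'u1'
  UNION ALL SELECT 'news', 'u2'
  UNION ALL SELECT 'video', 'u1'
)
SELECT type, SUM(pv) AS pv, COUNT(uid) AS uv
FROM (
  -- One row per (type, uid), carrying that user's page views for the type.
  SELECT type, uid, SUM(pv) AS pv
  FROM (
    SELECT type, 1 AS pv, uid FROM sample_rows
    UNION ALL
    SELECT 'all' AS type, 1 AS pv, uid FROM sample_rows
  ) a
  GROUP BY type, uid
) b
GROUP BY type
ORDER BY type;
-- Expected rows: ('all', 4, 2), ('news', 3, 2), ('video', 1, 1)
```

The inner GROUP BY leaves one row per user and type, so counting rows in the outer query gives UV, while summing the per-user counts gives PV.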