These are all of the queries which we have used for mailing list statistics in the monthly report:
Note: last_month is a parameter set once in the master report representing the month and year of the previous month, in the format 'YYYYMM'. For example, in September 2011, lastmonth is '201108'.
SELECT
if(messages.mailing_list_url like '%-community%' or
messages.mailing_list_url like '%meego-dev%' or
messages.mailing_list_url like '%-sdk%' or
messages.mailing_list_url like '%-packaging%' or
messages.mailing_list_url like '%-qa%',
substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),
'Other') as list,
year(first_date) AS y,
monthname(first_date) AS mon,
month(first_date) AS m,
date_format(first_date, '%b %Y') as monthstr,
date_format(first_date,'%Y%m') as monthnum,
count(*) AS posts
FROM
`messages`
WHERE
month(first_date)>=MONTH(DATE_SUB(NOW(), INTERVAL 4 MONTH)) and
year(first_date)>=YEAR(DATE_SUB(NOW(), INTERVAL 4 MONTH)) and
mailing_list_url not like '%meego-commits%' and
date_format(first_date,'%Y%m') <= date_format(${last_month},'%Y%m')
GROUP BY
list,
y,m
ORDER BY
list ASC,
monthnum ASC
This query is unused in the report, I think. Left in for posterity.
SELECT
substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,
year(first_date) AS y,
monthname(first_date) AS mon,
month(first_date) AS m,
date_format(first_date, '%M %Y') as monthstr,
date_format(first_date,'%Y%m') as monthnum,
count(*) AS c
FROM
`messages`
WHERE
year(first_date) > 1979 and
mailing_list_url not like '%meego-commits%'
GROUP BY
`messages`.`mailing_list_url`,
y,m
ORDER BY
list ASC,
monthnum ASC,
c ASC
Provides the number of active mailing list participants - sending more than one email per month.
select
monthname(${last_month}) as monthname,
year(${last_month}) as y,
count(*) as c
from
( select
p.email_address as member,
count(*) AS c
from messages as m,
messages_people as p
where
m.message_id=p.message_ID and
month(m.first_date)=month(${last_month}) and
year(m.first_date)=year(${last_month}) and
p.email_address NOT LIKE '%no_reply@%'
group by p.email_address
)
as repeat_posters
where c>=2;
SELECT
monthname(${last_month}) as monthname,
count(*) AS message_count
FROM
`messages`
WHERE
month(first_date) = month(${last_month}) and
year(first_date) = year(${last_month}) and
mailing_list_url not like '%meego-commits%'
select
subject,
count(*) as c
from messages
where year(first_date)=year(${last_month})
and month(first_date)=month(${last_month})
group by subject
order by c desc
limit 10;
select
p.email_address,
count(*) as c
from
messages as m,
messages_people as p
where
m.message_id=p.message_ID
and month(m.first_date)=month(${last_month})
and year(m.first_date)=year(${last_month})
and p.email_address NOT LIKE '%no_reply@%'
group by p.email_address
order by c desc
limit 10;
This query is used to graph the evolution of each mailing list in a graph. The top 5 lists are taken out, and the others are grouped together.
select
if(messages.mailing_list_url like '%-community%' or
messages.mailing_list_url like '%meego-dev%' or
messages.mailing_list_url like '%-sdk%' or
messages.mailing_list_url like '%-packaging%' or
messages.mailing_list_url like '%-qa%',
substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),
'Other') as list,
year(first_date) AS y,
monthname(first_date) AS mon,
month(first_date) AS m,
date_format(first_date, '%b %Y') as monthstr,
date_format(first_date,'%Y%m') as monthnum,
count(*) AS c
FROM
`messages`
WHERE
year(first_date) > 1979 and
date_format(first_date,'%Y%m') <= date_format(${last_month},'%Y%m') and
messages.mailing_list_url not like '%meego-commit%'
GROUP BY
list,y,m
ORDER BY
monthnum ASC,
list asc,
c asc;
I should really have done a "top N posters last month" and made N a parameter. This was the quick copy & paste way to go.
select
p.email_address,
count(*) as c
from
messages as m,
messages_people as p
where
m.message_id=p.message_ID
and month(m.first_date)=month(${last_month})
and year(m.first_date)=year(${last_month})
and p.email_address NOT LIKE '%no_reply@%'
group by p.email_address
order by c desc
limit 20;
select
count(distinct p.email_address) as member_count,
monthname(${last_month}) AS monthname
from
messages as m,
messages_people as p
where
m.message_id=p.message_ID
and month(m.first_date)=month(${last_month})
and year(m.first_date)=year(${last_month})
and p.email_address NOT LIKE '%no_reply@%';
All mailing lists. The graph was far too cluttered.
SELECT
substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,
year(first_date) AS y,
monthname(first_date) AS mon,
month(first_date) AS m,
date_format(first_date, '%b %Y') as monthstr,
date_format(first_date,'%Y%m') as monthnum,
count(*) AS c
FROM
`messages`
WHERE
year(first_date) > 1979 and
mailing_list_url not like '%meego-commits%' and
date_format(first_date,'%Y%m') <= date_format(${last_month},'%Y%m')
GROUP BY
list,
y,m
ORDER BY
monthnum ASC,
list ASC,
c ASC;