For each of the services we gather data for, here's a guide to getting that data:
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using MLStats.
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch has been submitted upstream.
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with
setup.py install --prefix=/install/path
you will need to "prime the pump" by downloading and importing the archives of all of the MeeGo mailing lists.
The format of the mlstats command line is:
/path/to/mlstats --db-user=<username> --db-password=<password> http://lists.meego.com/pipermail/meego-announce/
The command line option --no-report suppresses the creation of a report after the import, which is useful for a cron job.
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is here.
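A single invocation covering several lists might be assembled as in the sketch below. It is a dry run that only prints the command. The three list names are picked for illustration, USER and PASSWORD are placeholders, and the variable names are assumptions rather than anything mlstats requires.

```shell
# Assumption: base URL as in the example above; list names are an illustrative subset.
base_url="http://lists.meego.com/pipermail"

# Build one archive URL per list name, separated by spaces
urls=""
for list in meego-announce meego-dev meego-community; do
    urls="${urls:+$urls }${base_url}/${list}/"
done

# Dry run: print the single mlstats command rather than executing it
cmd="/path/to/mlstats --db-user=USER --db-password=PASSWORD $urls"
echo "$cmd"
```

To import every list, a loop with one invocation per list does the same job: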
# Import each list archive in turn, appending mlstats output to a log file
for list in meego-adaptation-intel-automotive meego-announce meego-architecture \
            meego-commits meego-community meego-dev meego-distribution-tools \
            meego-events meego-handset meego-il10n meego-inputmethods meego-it \
            meego-ivi meego-kernel meego-packaging meego-pm meego-porting \
            meego-python meego-qa meego-releases meego-sdk meego-security \
            meego-security-discussion meego-touch-dev meego-tv
do
    /path/to/mlstats --no-report --db-user=<username> --db-password=<password> http://lists.meego.com/pipermail/${list} >> /tmp/output.txt
done
This should be run every night through cron.
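One way to set this up is a small wrapper script that cron calls with the list names as arguments. The sketch below is an assumption-laden illustration: the script name mlstats-nightly.sh, the function name import_lists, the environment variables MLSTATS, LOG, DB_USER and DB_PASSWORD, and the 02:30 schedule are all hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /path/to/mlstats-nightly.sh.
# MLSTATS and LOG can be overridden from the environment (assumed names).
MLSTATS="${MLSTATS:-/path/to/mlstats}"
LOG="${LOG:-/tmp/output.txt}"

# Import each list named on the command line, one mlstats run per list.
import_lists() {
    for list in "$@"; do
        # Append stdout and stderr so failures also show up in the log
        "$MLSTATS" --no-report --db-user="$DB_USER" --db-password="$DB_PASSWORD" \
            "http://lists.meego.com/pipermail/${list}" >> "$LOG" 2>&1
    done
}

# Only run the import when list names were passed as arguments
if [ "$#" -gt 0 ]; then
    import_lists "$@"
fi

# Example crontab entry (every night at 02:30; the time is arbitrary):
# 30 2 * * * DB_USER=<username> DB_PASSWORD=<password> /path/to/mlstats-nightly.sh meego-announce meego-dev
```

Capturing stderr in the log is a choice made for this sketch; the loop above records standard output only.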
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:
SELECT
    `messages`.`mailing_list_url` AS list,
    YEAR(first_date) AS y,
    MONTHNAME(first_date) AS mon,
    MONTH(first_date) AS m,
    DATE_FORMAT(first_date, '%M %Y') AS monthstr,
    DATE_FORMAT(first_date, '%Y%m') AS monthnum,
    COUNT(*) AS c
FROM
    `messages`
WHERE
    YEAR(first_date) > 1979 AND
    mailing_list_url NOT LIKE '%meego-commits%' AND
    first_date < '2011-05-01'
GROUP BY
    `messages`.`mailing_list_url`,
    y, m
ORDER BY
    monthnum ASC,
    list ASC,
    c ASC
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers (those dated before 1980), filter meego-commits out of our analysis, count only messages dated before 1 May 2011, and group the messages by mailing list and month, to give the following graph as the final result (how to create a report will follow later):
[[File:Mlgraph.png]]
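The reason for carrying both a sort key and a label can be seen in a small shell sketch. The sample rows are made up for illustration and are not output from the query; the point is only that a YYYYMM key sorts chronologically while month names sort alphabetically.

```shell
# Hypothetical sample rows in the form "monthnum|monthstr" (not real query output)
rows='201102|February 2011
201012|December 2010
201101|January 2011'

# Sorting on the numeric YYYYMM key gives chronological order
by_monthnum=$(printf '%s\n' "$rows" | LC_ALL=C sort -t '|' -k1,1n | cut -d '|' -f2)

# Sorting on the label alone gives alphabetical order, which scrambles the timeline
by_monthstr=$(printf '%s\n' "$rows" | cut -d '|' -f2 | LC_ALL=C sort)

printf 'ordered by monthnum:\n%s\n\nordered by monthstr:\n%s\n' "$by_monthnum" "$by_monthstr"
```

With the key, December 2010 comes before January 2011 and February 2011; with the label alone, February 2011 lands ahead of January 2011.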