Metrics/Gathering data

From MeeGo wiki

Revision as of 12:12, 14 June 2011

For each of the services we gather data for, here's a guide to getting that data:

Mailing lists

MLStats

The archives of Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using MLStats.

The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.

We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch has been submitted upstream.

Once you have downloaded and unpacked MLStats 0.4, applied the patch mentioned above, and installed it with
setup.py install --prefix=/install/path
you will need to "prime the pump" by downloading and importing the archives of all of the MeeGo mailing lists.

The format of the mlstats command line is:

/path/to/mlstats --db-user=<username> --db-password=<password> http://lists.meego.com/pipermail/meego-announce/
The command line option
--no-report
suppresses the creation of a report after the import, useful for a cron job.

You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is here.

for list in meego-adaptation-intel-automotive meego-announce meego-architecture \
            meego-commits meego-community meego-dev meego-distribution-tools \
            meego-events meego-handset meego-il10n meego-inputmethods meego-it \
            meego-ivi meego-kernel meego-packaging meego-pm meego-porting \
            meego-python meego-qa meego-releases meego-sdk meego-security \
            meego-security-discussion meego-touch-dev meego-tv
do
  # Import one list archive per invocation; append the output to a log file.
  /path/to/mlstats --no-report --db-user=<username> --db-password=<password> "http://lists.meego.com/pipermail/${list}/" >> /tmp/output.txt
done

This should be run every night through cron.
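
Since several archives can be listed together on one command line, the nightly job could also build a single mlstats invocation. The following is only a minimal sketch under stated assumptions: the three lists are a sample of the full set, USERNAME and PASSWORD are placeholders, and the script name and schedule in the crontab comment are hypothetical. It prints the command rather than running it, so it can be checked before going into cron.

```shell
#!/bin/sh
# Sketch: build one mlstats command line covering several list archives.
# The base URL comes from the examples above; the short list here is only
# a sample of the full set of MeeGo lists.
base_url="http://lists.meego.com/pipermail"
lists="meego-announce meego-community meego-dev"

urls=""
for list in $lists; do
  # Keep the trailing slash used in the single-list example.
  urls="${urls:+$urls }$base_url/$list/"
done

# Dry run: print the command instead of executing it. Replace USERNAME and
# PASSWORD, drop the leading "echo", and point /path/to/mlstats at the real
# install location to run the import. $urls is left unquoted on purpose so
# that each URL becomes a separate argument.
echo /path/to/mlstats --no-report --db-user=USERNAME --db-password=PASSWORD $urls

# Hypothetical crontab entry running this script at 02:30 every night
# (the script path and log file are assumptions):
# 30 2 * * * /path/to/update-mlstats.sh >> /tmp/output.txt 2>&1
```

The original loop, with one invocation per list, has the advantage that a failure on one archive does not stop the others from being imported.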

Mailing list graph over time

The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:

SELECT
     `messages`.`mailing_list_url` AS list,
     year(first_date) AS y,
     monthname(first_date) AS mon,
     month(first_date) AS m,
     date_format(first_date, '%M %Y') AS monthstr,  -- X axis label, e.g. 'April 2011'
     date_format(first_date, '%Y%m') AS monthnum,   -- sortable key, e.g. '201104'
     count(*) AS c
FROM
     `messages`
WHERE
     year(first_date) > 1979                        -- drop messages with bad Date headers
     AND mailing_list_url NOT LIKE '%meego-commits%'
     AND first_date < '2011-05-01'                  -- only count up to the end of April 2011
GROUP BY
     `messages`.`mailing_list_url`,
     y, m
ORDER BY
     monthnum ASC,
     list ASC,
     c ASC

We use $monthnum (for example 201104) to order the results over time, and $monthstr (for example April 2011) as a more meaningful X axis label. The year(first_date) > 1979 condition eliminates all emails with bad Date headers, the mailing_list_url condition filters meego-commits out of our analysis, and the messages are grouped by mailing list and by month, to give the following graph as the final result (how to create a report will follow later):

Mlgraph.png
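
To see why the query sorts on $monthnum rather than on $monthstr, here is a small illustration using standard shell tools. It is only a sketch: the three rows and their counts are made-up sample data in the shape "monthnum, monthstr, count", not output from the real database.

```shell
#!/bin/sh
# Hypothetical sample rows in the shape "monthnum<TAB>monthstr<TAB>count".
tab=$(printf '\t')
rows=$(printf '%s\t%s\t%s\n' \
  201101 'January 2011' 5 \
  201012 'December 2010' 3 \
  201104 'April 2011' 7)

# Sorting on the YYYYMM key gives chronological order.
by_num=$(printf '%s\n' "$rows" | LC_ALL=C sort -t "$tab" -k1,1 | cut -f2)

# Sorting on the label gives alphabetical order, which scrambles the months.
by_str=$(printf '%s\n' "$rows" | LC_ALL=C sort -t "$tab" -k2,2 | cut -f2)

printf 'ordered by monthnum:\n%s\n\n' "$by_num"
printf 'ordered by monthstr:\n%s\n' "$by_str"
```

With these sample rows, the first listing runs December 2010, January 2011, April 2011, while the second starts with April 2011. This is why the label is kept for display only.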
