<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://wiki.meego.com/skins/common/feed.css?270"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://wiki.meego.com/index.php?title=Special:Contributions/Dneary&amp;feed=atom&amp;limit=50&amp;target=Dneary&amp;year=&amp;month=</id>
		<title>MeeGo wiki - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="http://wiki.meego.com/index.php?title=Special:Contributions/Dneary&amp;feed=atom&amp;limit=50&amp;target=Dneary&amp;year=&amp;month="/>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Special:Contributions/Dneary"/>
		<updated>2013-05-22T05:23:54Z</updated>
		<subtitle>From MeeGo wiki</subtitle>
		<generator>MediaWiki 1.16.2</generator>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-10-18T09:35:17Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly: depending on the metric, that could be in real time or automatically on a fixed schedule.&lt;br /&gt;
&lt;br /&gt;
The dashboard tracks the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports, for use and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]].&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using a Business Intelligence platform that includes ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Download the monthly summary report as a Pentaho report file here: [[File:Meego metrics summary.prpt]]&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, I prefer to use MySQL, both because it is standard &amp;amp; well understood and because it allows consolidation of databases under one DB server. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security-related or confidential information (mainly Bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
* [[../Creating a report]]: Given data in a database, how do we generate a report in Pentaho, and deploy it to the dashboard?&lt;br /&gt;
&lt;br /&gt;
* [[../Mailing list queries]]: SQL queries against the MLStats database&lt;br /&gt;
* [[../MediaWiki queries]]: SQL queries against a &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; database&lt;br /&gt;
* [[../IRC queries]]: SQL queries against the superseriousstats database&lt;br /&gt;
* [[../Forum queries]]: SQL queries against the forum database&lt;br /&gt;
* [[../Bugzilla queries]]: SQL queries against the Bugzilla database&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, we have access to the database server. This applies to MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we integrate the CSV files currently being exported, which provide the basic analytics we need. A cron job with mysqlimport is sufficient.&lt;br /&gt;
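&lt;br /&gt;
As a rough sketch of that import step, the following Python loads one exported CSV into a forum_posts table. It uses the standard-library sqlite3 module purely as a stand-in for MySQL so that it runs anywhere; the CSV layout (a header row with month, year, forum and posts columns) and the sample rows are assumptions, not the real export format.&lt;br /&gt;
&lt;br /&gt;
```python
import csv
import io
import sqlite3

# Hypothetical sample of a monthly forum export; the real column
# layout may differ.
SAMPLE_CSV = """month,year,forum,posts
9,2011,General,120
9,2011,Development,45
"""


def load_forum_posts(conn, csv_text):
    """Insert rows from csv_text into forum_posts; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forum_posts "
        "(month INTEGER, year INTEGER, forum TEXT, posts INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (int(r["month"]), int(r["year"]), r["forum"], int(r["posts"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO forum_posts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# sqlite3 in-memory database used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
loaded = load_forum_posts(conn, SAMPLE_CSV)
total = conn.execute("SELECT SUM(posts) FROM forum_posts").fetchone()[0]
print(loaded, total)
```
&lt;br /&gt;
In production, mysqlimport called from cron does the same job without any custom code; the sketch only illustrates the shape of the data being loaded.&lt;br /&gt;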
&lt;br /&gt;
Individual mailing lists are parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS, and how to get code metrics from the commit mailing list or git. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data in report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
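&lt;br /&gt;
To make the "active participant" definition concrete, here is a minimal Python sketch that counts posts per sender for one month and keeps those at or above the threshold. The (sender, month) pairs are made-up sample data standing in for rows read from the mlstats database, and the function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter

# Hypothetical (sender, month) pairs, standing in for rows read from
# the mlstats database; "month" is a YYYYMM string.
posts = [
    ("alice@example.com", "201109"),
    ("alice@example.com", "201109"),
    ("bob@example.com", "201109"),
    ("carol@example.com", "201109"),
    ("carol@example.com", "201109"),
    ("carol@example.com", "201109"),
    ("bob@example.com", "201108"),
]


def active_participants(rows, month, threshold=2):
    """Return the senders with at least `threshold` posts in `month`."""
    counts = Counter(sender for sender, m in rows if m == month)
    return sorted(s for s, n in counts.items() if n >= threshold)


active = active_participants(posts, "201109")
# The top poster for the month falls out of the same per-sender counts.
top_posters = Counter(s for s, m in posts if m == "201109").most_common(1)
print(active, top_posters)
```
&lt;br /&gt;
The same threshold rule applies to the forum (2+ posts) and Bugzilla (2+ comments) definitions of an active contributor.&lt;br /&gt;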
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject,year(first_date) as y, monthname(first_date),count(*) as c from messages group by subject, month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address,year(m.first_date) as y, monthname(m.first_date),count(*) as c from messages as m,messages_people as p where m.message_id=p.message_ID group by p.email_address, month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above group by month only, so the same month in different years is merged; add &amp;lt;code&amp;gt;year(first_date)&amp;lt;/code&amp;gt; to the &amp;lt;code&amp;gt;group by&amp;lt;/code&amp;gt; clause to get separate totals per year.&lt;br /&gt;
*: The easiest way is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
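&lt;br /&gt;
A month restriction can also be written as a half-open date range, &amp;lt;code&amp;gt;first_date &amp;gt;= start and first_date &amp;lt; end&amp;lt;/code&amp;gt;, which avoids applying a function to the column and so lets MySQL use an index on it. The Python helper below is a small sketch of how the two bounds could be computed before being passed to the query as parameters; the function name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month)."""
    start = date(year, month, 1)
    # December rolls over into January of the following year.
    if month == 12:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, month + 1, 1)
    return start, end


start, end = month_bounds(2010, 3)
print(start.isoformat(), end.isoformat())
```
&lt;br /&gt;
For March 2010 this gives 2010-03-01 as the inclusive lower bound and 2010-04-01 as the exclusive upper bound.&lt;br /&gt;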
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
The approach is to use a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object, each Patch has an Author, and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed-off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers they have worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
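&lt;br /&gt;
The relationships described above can be sketched as plain Python dataclasses. This is an illustration of the shape of the data only: the class layout, attribute names and the record_patch helper are hypothetical and are not gitdm's actual code.&lt;br /&gt;
&lt;br /&gt;
```python
from dataclasses import dataclass, field


# eq=False keeps identity-based equality, which is safer for objects
# that reference each other in cycles. All names are illustrative.
@dataclass(eq=False)
class Employer:
    name: str
    hackers: list = field(default_factory=list)  # Hackers who worked here


@dataclass(eq=False)
class Hacker:
    name: str
    emails: list = field(default_factory=list)     # addresses used to commit
    employers: list = field(default_factory=list)  # Employers worked for
    patches: list = field(default_factory=list)    # Patches authored
    reviewed: list = field(default_factory=list)
    reported: list = field(default_factory=list)
    signed_off: list = field(default_factory=list)
    tested: list = field(default_factory=list)


@dataclass(eq=False)
class Patch:
    commit_id: str
    author: Hacker
    employer: Employer  # author's employer at the time of the patch
    reviewers: list = field(default_factory=list)
    reporters: list = field(default_factory=list)
    signoffs: list = field(default_factory=list)
    testers: list = field(default_factory=list)


def record_patch(commit_id, author, employer, reviewers=()):
    """Create a Patch and update the back-references on each object."""
    patch = Patch(commit_id, author, employer, list(reviewers))
    author.patches.append(patch)
    if employer not in author.employers:
        author.employers.append(employer)
    if author not in employer.hackers:
        employer.hackers.append(author)
    for reviewer in reviewers:
        reviewer.reviewed.append(patch)
    return patch


acme = Employer("Example Corp")
alice = Hacker("Alice", emails=["alice@example.com"])
bob = Hacker("Bob")
p = record_patch("abc123", alice, acme, reviewers=[bob])
print(len(alice.patches), len(bob.reviewed), acme.hackers[0].name)
```
&lt;br /&gt;
In a SQL schema, each list would typically become a link table (for example, one row per Hacker and Patch pair for reviews), which is where the redundancy mentioned above needs care.&lt;br /&gt;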
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Business intelligence lexicography ==&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from them&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well understood format, and loading it into your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
The community dashboard project uses a business reporting engine to query the gathered data and present it in a report.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:Meego_metrics_summary.prpt</id>
		<title>File:Meego metrics summary.prpt</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:Meego_metrics_summary.prpt"/>
				<updated>2011-10-18T09:27:20Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: MeeGo metrics monthly summary statistics. All data is for the previous calendar month. Statistics are grouped to allow qualitative and quantitative analysis to be done afterwards.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;MeeGo metrics monthly summary statistics. All data is for the previous calendar month. Statistics are grouped to allow qualitative and quantitative analysis to be done afterwards.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Bugzilla_queries</id>
		<title>Metrics/Bugzilla queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Bugzilla_queries"/>
				<updated>2011-10-14T18:34:46Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Created page with &amp;quot; === BZ bugs opened and resolved month by month ===  &amp;lt;pre&amp;gt; &amp;lt;nowiki&amp;gt; select  	opened.monthnum as monthnum, 	opened.monthstr as monthstr, 	opened_count, 	resolved_count from ( 	SEL...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=== BZ bugs opened and resolved month by month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	opened.monthnum as monthnum,&lt;br /&gt;
	opened.monthstr as monthstr,&lt;br /&gt;
	opened_count,&lt;br /&gt;
	resolved_count&lt;br /&gt;
from&lt;br /&gt;
(&lt;br /&gt;
	SELECT&lt;br /&gt;
		date_format(creation_ts,'%Y%m') as monthnum, &lt;br /&gt;
		date_format(creation_ts,'%b %y') as monthstr, &lt;br /&gt;
		count(bugs.bug_id) as opened_count&lt;br /&gt;
	FROM bugs &lt;br /&gt;
	GROUP BY monthnum&lt;br /&gt;
) AS opened&lt;br /&gt;
INNER JOIN&lt;br /&gt;
(&lt;br /&gt;
	SELECT &lt;br /&gt;
		date_format(bug_when, '%Y%m') AS monthnum, &lt;br /&gt;
		date_format(bug_when, '%b %y') AS monthstr, &lt;br /&gt;
		COUNT(*) as resolved_count &lt;br /&gt;
	FROM bugs_activity&lt;br /&gt;
	WHERE &lt;br /&gt;
		fieldid=9 AND &lt;br /&gt;
		added='RESOLVED' &lt;br /&gt;
	GROUP BY monthnum&lt;br /&gt;
) AS resolved&lt;br /&gt;
ON &lt;br /&gt;
	opened.monthnum = resolved.monthnum&lt;br /&gt;
where opened.monthnum &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
ORDER BY opened.monthnum;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== BZ active users last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
	count(distinct userid)&lt;br /&gt;
from&lt;br /&gt;
(&lt;br /&gt;
	select &lt;br /&gt;
		ba.who as userid,&lt;br /&gt;
		ba.bug_when as action_date &lt;br /&gt;
	from bugs_activity ba &lt;br /&gt;
	where  &lt;br /&gt;
		date_format(ba.bug_when,'%Y%m')=date_format(${last_month},'%Y%m') and&lt;br /&gt;
		ba.fieldid in (2,4,5,6,7,8,9,10,11,12,13,14,15,16,18,19,30,35,36,37,38,40,41,42,47,55,56,57,58) &lt;br /&gt;
	group by action_date,userid &lt;br /&gt;
	union all &lt;br /&gt;
		select &lt;br /&gt;
			b.reporter,&lt;br /&gt;
			b.creation_ts &lt;br /&gt;
		from bugs b &lt;br /&gt;
		where &lt;br /&gt;
			date_format(b.creation_ts,'%Y%m')=date_format(${last_month},'%Y%m')&lt;br /&gt;
) as filtered_actions;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== BZ active users month by month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
	date_format(action_date,'%Y%m') as monthnum,&lt;br /&gt;
	date_format(action_date,'%b %y') as monthstr,&lt;br /&gt;
	count(distinct userid)&lt;br /&gt;
from&lt;br /&gt;
(&lt;br /&gt;
	select &lt;br /&gt;
		ba.who as userid,&lt;br /&gt;
		ba.bug_when as action_date &lt;br /&gt;
	from bugs_activity ba &lt;br /&gt;
	where  &lt;br /&gt;
		ba.fieldid in (2,4,5,6,7,8,9,10,11,12,13,14,15,16,18,19,30,35,36,37,38,40,41,42,47,55,56,57,58) &lt;br /&gt;
	group by action_date,userid &lt;br /&gt;
	union all &lt;br /&gt;
		select &lt;br /&gt;
			b.reporter,&lt;br /&gt;
			b.creation_ts &lt;br /&gt;
		from bugs b &lt;br /&gt;
) as filtered_actions &lt;br /&gt;
where date_format(action_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
group by monthnum&lt;br /&gt;
order by monthnum;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== BZ top 20 bug reporters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT &lt;br /&gt;
	profiles.realname AS username,&lt;br /&gt;
	COUNT(bugs.bug_id) AS bugcount &lt;br /&gt;
FROM bugs      &lt;br /&gt;
INNER JOIN profiles &lt;br /&gt;
ON &lt;br /&gt;
	bugs.reporter = profiles.userid &lt;br /&gt;
WHERE date_format(creation_ts,'%Y%m')=date_format(${last_month},'%Y%m') &lt;br /&gt;
GROUP BY username &lt;br /&gt;
ORDER BY bugcount DESC &lt;br /&gt;
LIMIT 20;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== BZ top 20 active users last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	p.realname, &lt;br /&gt;
	count(*) as activity_count &lt;br /&gt;
from &lt;br /&gt;
(&lt;br /&gt;
	select &lt;br /&gt;
		ba.bug_id as bug_id,&lt;br /&gt;
		ba.who as userid,&lt;br /&gt;
		ba.bug_when as date &lt;br /&gt;
	from bugs_activity ba &lt;br /&gt;
	where  &lt;br /&gt;
		date_format(ba.bug_when,'%Y%m')=date_format(${last_month},'%Y%m') and &lt;br /&gt;
		ba.fieldid in (2,4,5,6,7,8,9,10,11,12,13,14,15,16,18,19,30,35,36,37,38,40,41,42,47,55,56,57,58) &lt;br /&gt;
	group by ba.bug_when,ba.who &lt;br /&gt;
	union all &lt;br /&gt;
		select &lt;br /&gt;
			b.bug_id, &lt;br /&gt;
			b.reporter,&lt;br /&gt;
			b.creation_ts &lt;br /&gt;
		from bugs b &lt;br /&gt;
		where &lt;br /&gt;
			date_format(b.creation_ts,'%Y%m')=date_format(${last_month},'%Y%m')&lt;br /&gt;
	order by bug_id&lt;br /&gt;
) as filtered_actions &lt;br /&gt;
inner join profiles p &lt;br /&gt;
on &lt;br /&gt;
(&lt;br /&gt;
	p.userid=filtered_actions.userid&lt;br /&gt;
) &lt;br /&gt;
group by filtered_actions.userid &lt;br /&gt;
order by activity_count desc &lt;br /&gt;
limit 20;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Forum_queries</id>
		<title>Metrics/Forum queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Forum_queries"/>
				<updated>2011-10-14T18:32:48Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Created page with &amp;quot; The forum database structure could not be simpler, it mirrors precisely [http://forum.meego.com/stats/ the CVS files which are generated every month].  === Forum posts per forum...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
The forum database structure could not be simpler: it mirrors precisely [http://forum.meego.com/stats/ the CSV files which are generated every month].&lt;br /&gt;
&lt;br /&gt;
=== Forum posts per forum by month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_posts`.`month` as month,&lt;br /&gt;
     `forum_posts`.`year` as year,&lt;br /&gt;
     date_format(date_add( makedate(year, 1), interval month-1 MONTH), '%b %Y') as monthstr, &lt;br /&gt;
     concat(cast(year as char), LPAD(cast(month as char), 2, '0')) as monthnum,&lt;br /&gt;
     `forum_posts`.`forum` as forum,&lt;br /&gt;
     `forum_posts`.`posts` as posts&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_posts`&lt;br /&gt;
ORDER BY&lt;br /&gt;
     `forum_posts`.`forum` ASC,&lt;br /&gt;
     `forum_posts`.`year` ASC,&lt;br /&gt;
     `forum_posts`.`month` ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum top 20 posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_top_posters`.`month`,&lt;br /&gt;
     `forum_top_posters`.`year`,&lt;br /&gt;
     `forum_top_posters`.`rank`,&lt;br /&gt;
     `forum_top_posters`.`member`,&lt;br /&gt;
     `forum_top_posters`.`posts`,&lt;br /&gt;
     monthname(${last_month}) AS monthname&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_top_posters`&lt;br /&gt;
WHERE&lt;br /&gt;
     month = month(${last_month})&lt;br /&gt;
 AND year = year(${last_month})&lt;br /&gt;
ORDER BY `forum_top_posters`.`rank` ASC&lt;br /&gt;
 LIMIT 20&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum most viewed threads last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     title,&lt;br /&gt;
     views&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_most_viewed_threads`&lt;br /&gt;
WHERE&lt;br /&gt;
     month=MONTH(${last_month})&lt;br /&gt;
     AND&lt;br /&gt;
     year=YEAR(${last_month})&lt;br /&gt;
ORDER BY&lt;br /&gt;
     views DESC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum posts per forum cumulative ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_cumulative_posts`.`month`,&lt;br /&gt;
     `forum_cumulative_posts`.`year`,&lt;br /&gt;
     `forum_cumulative_posts`.`forum`,&lt;br /&gt;
     `forum_cumulative_posts`.`posts`&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_cumulative_posts`&lt;br /&gt;
ORDER BY&lt;br /&gt;
`forum_cumulative_posts`.`forum`,&lt;br /&gt;
     `forum_cumulative_posts`.`year` ASC,&lt;br /&gt;
     `forum_cumulative_posts`.`month`&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum top thanks last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_top_thanked`.`month`,&lt;br /&gt;
     `forum_top_thanked`.`year`,&lt;br /&gt;
     `forum_top_thanked`.`rank`,&lt;br /&gt;
     `forum_top_thanked`.`member`,&lt;br /&gt;
     `forum_top_thanked`.`thanks`&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_top_thanked`&lt;br /&gt;
 WHERE&lt;br /&gt;
     month = month(${last_month})&lt;br /&gt;
 AND year = year(${last_month})&lt;br /&gt;
order by rank&lt;br /&gt;
limit 20&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Forum threads per forum cumulative ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_cumulative_threads`.`month`,&lt;br /&gt;
     `forum_cumulative_threads`.`year`,&lt;br /&gt;
     date_format(date_add( makedate(year, 1), interval month-1 MONTH), '%b %Y') as monthstr, &lt;br /&gt;
     concat(cast(year as char), LPAD(cast(month as char), 2, '0')) as monthnum,&lt;br /&gt;
     `forum_cumulative_threads`.`forum`,&lt;br /&gt;
     `forum_cumulative_threads`.`threads`&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_cumulative_threads`&lt;br /&gt;
ORDER BY&lt;br /&gt;
     `forum_cumulative_threads`.`forum` ASC,&lt;br /&gt;
     `forum_cumulative_threads`.`year` ASC,&lt;br /&gt;
     `forum_cumulative_threads`.`month` ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Forum top 10 posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_top_posters`.`month`,&lt;br /&gt;
     `forum_top_posters`.`year`,&lt;br /&gt;
     `forum_top_posters`.`rank`,&lt;br /&gt;
     `forum_top_posters`.`member`,&lt;br /&gt;
     `forum_top_posters`.`posts`,&lt;br /&gt;
     monthname(${last_month}) AS monthname&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_top_posters`&lt;br /&gt;
WHERE&lt;br /&gt;
     month = month(${last_month})&lt;br /&gt;
 AND year = year(${last_month})&lt;br /&gt;
ORDER BY `forum_top_posters`.`rank` ASC&lt;br /&gt;
 LIMIT 10&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Forum posts per forum last 4 months ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_posts`.`month` as month,&lt;br /&gt;
     `forum_posts`.`year` as year,&lt;br /&gt;
     date_format(date_add( makedate(year, 1), interval month-1 MONTH), '%b %Y') as monthstr, &lt;br /&gt;
     concat(cast(year as char), LPAD(cast(month as char), 2, '0')) as monthnum,&lt;br /&gt;
     `forum_posts`.`forum` as forum,&lt;br /&gt;
     `forum_posts`.`posts` as posts&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_posts`&lt;br /&gt;
where&lt;br /&gt;
	-- compare a combined year-month key so the window also works across year boundaries&lt;br /&gt;
	(year * 100 + month) &amp;gt;= CAST(DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 4 MONTH), '%Y%m') AS UNSIGNED)&lt;br /&gt;
ORDER BY&lt;br /&gt;
     `forum_posts`.`forum` ASC,&lt;br /&gt;
     monthnum ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Forum repeat posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     count(`forum_top_posters`.`member`) as active_count,&lt;br /&gt;
     monthname(${last_month}) AS monthname&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_top_posters`&lt;br /&gt;
WHERE&lt;br /&gt;
     month = month(${last_month})&lt;br /&gt;
 AND year = year(${last_month})&lt;br /&gt;
 AND `forum_top_posters`.`posts`&amp;gt;=2&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     count(`forum_top_posters`.`member`) as poster_count,&lt;br /&gt;
     monthname(${last_month}) AS monthname&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_top_posters`&lt;br /&gt;
WHERE&lt;br /&gt;
     month = month(${last_month})&lt;br /&gt;
 AND year = year(${last_month})&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum total posts last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
     sum(posts) as total_posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_posts&lt;br /&gt;
WHERE&lt;br /&gt;
     forum_posts.month = MONTH(${last_month})&lt;br /&gt;
AND&lt;br /&gt;
     forum_posts.year = YEAR(${last_month})&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Forum hottest threads last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `forum_hottest_threads`.`month`,&lt;br /&gt;
     `forum_hottest_threads`.`year`,&lt;br /&gt;
     `forum_hottest_threads`.`rank`,&lt;br /&gt;
     `forum_hottest_threads`.`title`,&lt;br /&gt;
     `forum_hottest_threads`.`posts`&lt;br /&gt;
FROM&lt;br /&gt;
     `forum_hottest_threads`&lt;br /&gt;
WHERE&lt;br /&gt;
     `forum_hottest_threads`.`month`=month(${last_month})&lt;br /&gt;
     AND&lt;br /&gt;
     `forum_hottest_threads`.`year`=year(${last_month})&lt;br /&gt;
ORDER BY&lt;br /&gt;
     `forum_hottest_threads`.`rank` ASC&lt;br /&gt;
LIMIT 10&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/IRC_queries</id>
		<title>Metrics/IRC queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/IRC_queries"/>
				<updated>2011-10-13T15:59:24Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Created page with &amp;quot; For IRC, we use [https://github.com/tommyrot/superseriousstats/wiki superseriousstats] with a daily cron job. The database stores aggregated data. Its schema is a little funky, ...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
For IRC, we use [https://github.com/tommyrot/superseriousstats/wiki superseriousstats] with a daily cron job. The database stores aggregated data. Its schema is a little funky, but effective enough. As before, last_month is a parameter containing a date one month ago.&lt;br /&gt;
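&lt;br /&gt;
As a rough illustration of what the last_month parameter holds, here is a minimal Python sketch that computes a date one month ago and the 'YYYY-MM' key that the last-month queries compare against. The function names are hypothetical and are not part of the dashboard; the clamping to the last day of a shorter month is an assumption that mirrors how MySQL's DATE_SUB behaves.&lt;br /&gt;
&lt;br /&gt;
```python
import datetime


def one_month_ago(today):
    """Return the date one month before `today`, clamped to the month's last day."""
    if today.month == 1:
        year, month = today.year - 1, 12
    else:
        year, month = today.year, today.month - 1
    # The last day of the target month is the day before the first of the month after it.
    first_of_next = datetime.date(year + month // 12, month % 12 + 1, 1)
    last_day = (first_of_next - datetime.timedelta(days=1)).day
    return datetime.date(year, month, min(today.day, last_day))


def month_key(day):
    """Format a date as 'YYYY-MM', the form used by date_format(..., '%Y-%m')."""
    return day.strftime("%Y-%m")


last_month = one_month_ago(datetime.date(2011, 10, 13))
print(last_month.isoformat(), month_key(last_month))
```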
&lt;br /&gt;
=== Hour by hour (all time) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
  from `channel`&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Top participants ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select  `q_lines`.`ruid`,&lt;br /&gt;
         `csnick` as nick,&lt;br /&gt;
         `l_total` as total,&lt;br /&gt;
         `l_night` as night,&lt;br /&gt;
         `l_morning` as morning, `l_afternoon` as afternoon,&lt;br /&gt;
         `l_evening` as evening,&lt;br /&gt;
         `quote` from `q_lines` &lt;br /&gt;
   join `user_details`&lt;br /&gt;
     on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
   join `user_status`&lt;br /&gt;
     on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
     where `status` != 3&lt;br /&gt;
   order by `l_total` desc,&lt;br /&gt;
             `q_lines`.`ruid` asc&lt;br /&gt;
   limit 20&lt;br /&gt;
   &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Hour by hour (last month) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select 'Time of day (UTC)' as timeofday,&lt;br /&gt;
        sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
  from `channel`&lt;br /&gt;
  where month(`date`) = month(DATE_SUB(CURDATE(),INTERVAL 1 MONTH)) and&lt;br /&gt;
         year(`date`) = year(DATE_SUB(CURDATE(),INTERVAL 1 MONTH))  &lt;br /&gt;
  ;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Month by month (&amp;lt;= last month) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select date_format(`date`, '%Y-%m') as `date`,&lt;br /&gt;
	  date_format(date, '%b %y') as monthname,&lt;br /&gt;
	  month(date) as month,&lt;br /&gt;
	  year(date) as year,&lt;br /&gt;
       sum(`l_total`) as total,&lt;br /&gt;
       sum(`l_night`) as night,&lt;br /&gt;
       sum(`l_morning`) as morning,&lt;br /&gt;
       sum(`l_afternoon`) as afternoon,&lt;br /&gt;
       sum(`l_evening`) as evening&lt;br /&gt;
  from `channel`&lt;br /&gt;
  where date_format(`date`, '%Y%m') &amp;lt;= date_format(${last_month}, '%Y%m')&lt;br /&gt;
  group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Top participants last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select `q_lines`.`ruid`,&lt;br /&gt;
        `csnick` as nick,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as total,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as night,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as morning,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as afternoon,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as evening,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
  join `q_activity_by_month`&lt;br /&gt;
    on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
  join `user_status`&lt;br /&gt;
    on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
  join `user_details`&lt;br /&gt;
    on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
  where `status` != 3 and&lt;br /&gt;
         `date` = date_format(${last_month}, '%Y-%m')&lt;br /&gt;
  group by `q_lines`.`ruid`&lt;br /&gt;
  order by `q_activity_by_month`.`l_total` desc,&lt;br /&gt;
            `q_lines`.`ruid` asc&lt;br /&gt;
  limit 20&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Popular words ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select `total` as `v1`, &lt;br /&gt;
       `word` as `v2`&lt;br /&gt;
from `words`&lt;br /&gt;
where length(`word`) &amp;gt; 4&lt;br /&gt;
order by `v1` desc,&lt;br /&gt;
         `v2` asc&lt;br /&gt;
limit 20&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Day by day (last month) ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select `date`,&lt;br /&gt;
        `l_total` as total,&lt;br /&gt;
        `l_night` as night,&lt;br /&gt;
        `l_morning` as morning, &lt;br /&gt;
        `l_afternoon` as afternoon, &lt;br /&gt;
        `l_evening` as evening&lt;br /&gt;
  from `channel`&lt;br /&gt;
  where &lt;br /&gt;
		year(date) = year(${last_month}) and&lt;br /&gt;
  		month(date) = month(${last_month});&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SSS database schema ===&lt;br /&gt;
&lt;br /&gt;
The schema reflects the fact that there is already aggregation done by SSS when it reads logs. The database schema is [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql documented in the SSS source code].&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-10-13T14:19:24Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Architecture */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly: depending on the metric, that could be in real time or automatically at regular intervals.&lt;br /&gt;
&lt;br /&gt;
The dashboard will track the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports for usage and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]].&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using some sort of Business Intelligence platform including ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Candidate reporting solutions:&lt;br /&gt;
&lt;br /&gt;
* [http://jasperforge.org/index.php?q=project/jasperreports JasperReports]&lt;br /&gt;
* [http://www.pentaho.com/ Pentaho]&lt;br /&gt;
&lt;br /&gt;
The following are essentially ETL engines, and do not provide reporting or dashboard functionality:&lt;br /&gt;
&lt;br /&gt;
* [http://www.talend.com/index.php Talend]&lt;br /&gt;
* [http://petals.ow2.org/ Petals]&lt;br /&gt;
&lt;br /&gt;
[http://www.mulesoft.com/ MuleSoft] is an open source ESB, but does not seem adapted to our needs. The field is thus narrowed to Pentaho and JasperReports.&lt;br /&gt;
&lt;br /&gt;
For each community resource, we need to figure out how to get the data into a usable form, and come up with appropriate queries for metrics reports, and finally present the results on a webpage.&lt;br /&gt;
&lt;br /&gt;
=== Business intelligence engines ===&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from it&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well understood format, and loading it in your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
So, in short, the community dashboard project will likely use an ETL to plug data into an OLAP server, and then use a business reporting engine to query that data and present it in a dashboard.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, because MySQL is both standard &amp;amp; well understood, and to allow consolidation of databases under one DB server, I prefer to use MySQL. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security related or confidential information (mainly bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
* [[../Creating a report]]: Given data in a database, how do we generate a report in Pentaho, and deploy it to the dashboard?&lt;br /&gt;
&lt;br /&gt;
* [[../Mailing list queries]]: SQL queries against the MLStats database&lt;br /&gt;
* [[../MediaWiki queries]]: SQL queries against a &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; database&lt;br /&gt;
* [[../IRC queries]]: SQL queries against the superseriousstats database&lt;br /&gt;
* [[../Forum queries]]: SQL queries against the forum database&lt;br /&gt;
* [[../Bugzilla queries]]: SQL queries against the Bugzilla database&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, this implies that the server where the dashboard will run should have access to the database server for MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we integrate the CSV files currently being exported, which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Individual mailing lists are parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
Git repositories will be queried with &amp;quot;git log&amp;quot;, and parsed with the parser module from [http://lwn.net/Articles/290957/ gitdm], before being stored directly in a database. We will be able to run analytics on the results from there. gitdm can also do basic analytics of git logs, and we may decide to simply reuse gitdm's analytics. However, if we want to extend them, we will want to have the raw data.&lt;br /&gt;
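&lt;br /&gt;
The general shape of that pipeline can be sketched in Python. This is not gitdm's parser: it is a much simpler stand-in that assumes the log was produced with one tab-separated line per commit, and the commits table and its columns are hypothetical names chosen for illustration. SQLite is used here only to keep the sketch self-contained.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Assumed input: the output of
#   git log --date=short --pretty=format:%H%x09%an%x09%ae%x09%ad
# which prints one tab-separated line per commit (hash, author, email, author date).


def parse_git_log(text):
    """Turn tab-separated git log output into (hash, name, email, date) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) == 4:  # skip blank or malformed lines
            rows.append(tuple(fields))
    return rows


def store_commits(conn, rows):
    """Store parsed commits in a hypothetical commits table, ignoring duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits "
        "(hash TEXT PRIMARY KEY, author TEXT, email TEXT, author_date TEXT)"
    )
    conn.executemany("INSERT OR IGNORE INTO commits VALUES (?, ?, ?, ?)", rows)


sample = "abc123\tExample Hacker\thacker@example.com\t2011-02-01\n"
conn = sqlite3.connect(":memory:")
store_commits(conn, parse_git_log(sample))
print(conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0])
```
&lt;br /&gt;
Keeping one row per commit preserves the raw data, so the analytics can be extended later without re-reading the repositories.&lt;br /&gt;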
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data to report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject,year(first_date) as y, monthname(first_date),count(*) as c from messages group by subject, month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address,year(m.first_date) as y, monthname(m.first_date),count(*) as c from messages as m,messages_people as p where m.message_id=p.message_ID group by p.email_address, month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give month-by-month totals. Because they group by month(first_date) only, add year(first_date) to the group by clause to keep the same month of different years apart.&lt;br /&gt;
*: The easiest way to restrict a query to one month is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
==== Queries ====&lt;br /&gt;
&lt;br /&gt;
A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
&lt;br /&gt;
For the following group-by-month queries, I did a cross join of (2008,2009,2010,2011) and (01-12) to generate a &amp;quot;year and month&amp;quot; data table.&lt;br /&gt;
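&lt;br /&gt;
A minimal sketch of how such a table could be built, assuming the column names timestamp_year and timestamp_month used by the queries in this section. The helper tables years and months are hypothetical, and SQLite stands in for MySQL only so that the example is self-contained.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3


def build_years_months(conn, years):
    """Create years_months as the cross join of the given years and the 12 months."""
    conn.execute("CREATE TABLE years (timestamp_year TEXT)")
    conn.execute("CREATE TABLE months (timestamp_month TEXT)")
    conn.executemany("INSERT INTO years VALUES (?)", [(str(y),) for y in years])
    # Months are zero-padded so that year and month concatenate to 'YYYYMM'.
    conn.executemany("INSERT INTO months VALUES (?)", [("%02d" % m,) for m in range(1, 13)])
    conn.execute(
        "CREATE TABLE years_months AS "
        "SELECT timestamp_year, timestamp_month "
        "FROM years CROSS JOIN months "
        "ORDER BY timestamp_year, timestamp_month"
    )


conn = sqlite3.connect(":memory:")
build_years_months(conn, [2008, 2009, 2010, 2011])
print(conn.execute("SELECT COUNT(*) FROM years_months").fetchone()[0])
```
&lt;br /&gt;
The zero-padded month matters because the queries match rev_timestamp against the concatenated year and month as a prefix.&lt;br /&gt;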
&lt;br /&gt;
'''Top editors by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year AS yyyy,&lt;br /&gt;
        mon.timestamp_month AS mm,&lt;br /&gt;
        rev_user_text AS user,&lt;br /&gt;
        COUNT(*) AS c&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months AS mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm,user&lt;br /&gt;
 HAVING c&amp;gt;5&lt;br /&gt;
 ORDER BY yyyy,mm,c desc;&lt;br /&gt;
&lt;br /&gt;
'''Number of edits by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*) AS edits&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
'''New pages per month:'''&lt;br /&gt;
Getting the number of new pages per month is a bit trickier: first we need to query $revision to get the page_ids and their date of creation, then group by date. The query is O(n²) on the number of pages, although it should be possible to make it O(n) by grouping the result of the subquery without doing in() on the list of timestamps.&lt;br /&gt;
&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*)&lt;br /&gt;
 FROM mw_revision as rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE CONCAT(CONCAT(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
   AND rev.rev_timestamp in (&lt;br /&gt;
                SELECT MIN(rev_timestamp)&lt;br /&gt;
                FROM mw_revision&lt;br /&gt;
                GROUP BY rev_page)&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
To get just the list of pages &amp;amp; timestamps (this is used as the subquery for above):&lt;br /&gt;
 SELECT rev_page as p,&lt;br /&gt;
        MIN(rev_timestamp) as t&lt;br /&gt;
 FROM mw_revision&lt;br /&gt;
 GROUP BY rev_page;&lt;br /&gt;
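&lt;br /&gt;
One possible form of the grouped variant mentioned above is sketched here: it groups the result of the subquery directly by the year and month prefix of the first revision's timestamp, with no in() test. This is an untested assumption as far as the MeeGo database is concerned; it is exercised here against SQLite with made-up rows, and it only lists months in which at least one page was created.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Group each page's first revision by the YYYYMM prefix of its timestamp.
NEW_PAGES_PER_MONTH = """
SELECT substr(t, 1, 4) AS yyyy,
       substr(t, 5, 2) AS mm,
       COUNT(*) AS new_pages
FROM (SELECT rev_page AS p,
             MIN(rev_timestamp) AS t
      FROM mw_revision
      GROUP BY rev_page) AS first_revisions
GROUP BY yyyy, mm
ORDER BY yyyy, mm
"""


def new_pages_per_month(conn):
    """Return (year, month, new page count) rows, one per month with new pages."""
    return conn.execute(NEW_PAGES_PER_MONTH).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mw_revision (rev_page INTEGER, rev_timestamp TEXT)")
conn.executemany(
    "INSERT INTO mw_revision VALUES (?, ?)",
    [
        (1, "20100105120000"),
        (1, "20100210093000"),
        (2, "20100120080000"),
        (3, "20100203170000"),
    ],
)
print(new_pages_per_month(conn))
```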
&lt;br /&gt;
=== IRC ===&lt;br /&gt;
&lt;br /&gt;
superseriousstats does some preliminary analysis on data it stores in its database. Its author (tommyrot) has kindly added a parser for the format of the IRC logs we use (supybot) at my request. The [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v3.sql database schema] is a little hard to work out; several key tables have fields with undescriptive names like l_01. There are some queries in [https://github.com/tommyrot/superseriousstats/blob/master/html.class.php html.class.php] which we can use to generate some reports, though.&lt;br /&gt;
 &lt;br /&gt;
* Total IRC activity (by hour)&lt;br /&gt;
 select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
   from `channel`&lt;br /&gt;
* Total active participants (+ evolution) - we may be able to get &amp;quot;number of participants per hour/day/month&amp;quot; (so you can see if it's 2 guys talking amongst themselves or a larger group) - I'll ask tommyrot what the query should look like.&lt;br /&gt;
* Top contributors (per month)&lt;br /&gt;
 select `q_lines`.`ruid`, `csnick`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as `l_total`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as `l_night`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as `l_morning`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as `l_evening`,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
    join `q_activity_by_month` on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
    join `user_status` on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
    join `user_details` on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
    where `status` != 3&lt;br /&gt;
      and `date` = '2011-02'&lt;br /&gt;
    group by `q_lines`.`ruid`&lt;br /&gt;
    order by `q_activity_by_month`.`l_total` desc, `q_lines`.`ruid` asc limit 30&lt;br /&gt;
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
We are using a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object, each Patch has an Author, and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed-off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers he has worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
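&lt;br /&gt;
To make the mapping to tables more concrete, here is one way the three structures could be laid out relationally. Every table and column name below is hypothetical and is not taken from gitdm; SQLite is used only so that the sketch runs on its own, and the real target would be MySQL.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Hypothetical tables mirroring the Hacker, Employer and Patch structures described above.
SCHEMA = """
CREATE TABLE hacker (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE hacker_email (hacker_id INTEGER REFERENCES hacker(id), email TEXT NOT NULL);
CREATE TABLE employer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employment (hacker_id INTEGER REFERENCES hacker(id),
                         employer_id INTEGER REFERENCES employer(id));
CREATE TABLE patch (id INTEGER PRIMARY KEY, commit_hash TEXT, commit_date TEXT,
                    author_id INTEGER REFERENCES hacker(id),
                    employer_id INTEGER REFERENCES employer(id));
-- role is one of: reviewed, reported, signed-off, tested
CREATE TABLE patch_tag (patch_id INTEGER REFERENCES patch(id),
                        hacker_id INTEGER REFERENCES hacker(id), role TEXT NOT NULL);
"""


def patches_per_employer(conn):
    """Count patches per employer, busiest employer first."""
    return conn.execute(
        "SELECT e.name, COUNT(*) AS patches FROM patch AS p "
        "JOIN employer AS e ON e.id = p.employer_id "
        "GROUP BY e.name ORDER BY patches DESC, e.name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO hacker VALUES (1, 'Example Hacker')")
conn.execute("INSERT INTO employer VALUES (1, 'Example Corp')")
conn.execute("INSERT INTO patch VALUES (1, 'abc123', '2011-02-01', 1, 1)")
print(patches_per_employer(conn))
```
&lt;br /&gt;
Storing the employer on each patch row records who the Hacker was working for at the time of the Patch, which is what the "committers by company" report needs.&lt;br /&gt;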
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Not yet in scope ==&lt;br /&gt;
&lt;br /&gt;
I have not yet considered how I might get web analytics and download stats.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/MediaWiki_queries</id>
		<title>Metrics/MediaWiki queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/MediaWiki_queries"/>
				<updated>2011-10-13T14:19:07Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: MediaWiki queries &amp;amp; data structure.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
A word of warning with &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt;: when initialising the database, a table prefix is used for table names, which is configured per wiki (this allows different &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; instances to be stored in the same database instance, which is kind of a useless feature, but anyway...). On wiki.meego.org, this prefix is set to &amp;quot;&amp;quot;, while the default &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; prefix is &amp;quot;mw_&amp;quot;, and this is what I used. Ideally, we would do as Wikipedia does, and have a set of parameters containing the table names which would be used: &amp;lt;pre&amp;gt;&amp;lt;nowiki&amp;gt;${revision}&amp;lt;/nowiki&amp;gt;&amp;lt;/pre&amp;gt; instead of &amp;lt;pre&amp;gt;mw_revision&amp;lt;/pre&amp;gt;&lt;br /&gt;
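&lt;br /&gt;
The parameterised table names could be filled in before a query is run, roughly as in this Python sketch. The function name and the list of tables are assumptions made for illustration; only the ${revision} placeholder form comes from the text above.&lt;br /&gt;
&lt;br /&gt;
```python
from string import Template

# Hypothetical list of the MediaWiki tables the queries refer to.
MEDIAWIKI_TABLES = ("user", "revision", "page")


def with_table_names(query, prefix):
    """Fill ${user}, ${revision} and ${page} placeholders with prefixed table names."""
    names = {table: prefix + table for table in MEDIAWIKI_TABLES}
    # safe_substitute leaves other placeholders, such as ${last_month}, untouched.
    return Template(query).safe_substitute(names)


query = (
    "SELECT COUNT(*) AS edits FROM ${revision} "
    "WHERE date_format(rev_timestamp,'%Y%m') = date_format(${last_month},'%Y%m')"
)
print(with_table_names(query, "mw_"))
```
&lt;br /&gt;
With an empty prefix the same query would run against a wiki whose tables are unprefixed.&lt;br /&gt;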
&lt;br /&gt;
=== Wiki editors per month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
	user_edits.monthnum,&lt;br /&gt;
	user_edits.monthstr,&lt;br /&gt;
	count(*) as editor_count,&lt;br /&gt;
	user_active_edits.active_editor_count as active_editor_count&lt;br /&gt;
from&lt;br /&gt;
(&lt;br /&gt;
	select&lt;br /&gt;
		date_format(rev_timestamp,'%Y%m') as monthnum,&lt;br /&gt;
		date_format(rev_timestamp,'%b %y') as monthstr,&lt;br /&gt;
	    	CAST(rev_user_text as CHAR) AS username,&lt;br /&gt;
		count(*) as editcount&lt;br /&gt;
	from&lt;br /&gt;
		`mw_revision`&lt;br /&gt;
	group by &lt;br /&gt;
		monthnum,username&lt;br /&gt;
	HAVING monthnum &amp;gt; 201001&lt;br /&gt;
		and monthnum &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
	order by &lt;br /&gt;
		monthnum,editcount desc&lt;br /&gt;
)&lt;br /&gt;
as user_edits&lt;br /&gt;
inner join&lt;br /&gt;
(&lt;br /&gt;
	select&lt;br /&gt;
		monthnum,&lt;br /&gt;
		count(*) as active_editor_count&lt;br /&gt;
	from&lt;br /&gt;
	(&lt;br /&gt;
		select&lt;br /&gt;
			date_format(rev_timestamp,'%Y%m') as monthnum,&lt;br /&gt;
		    	CAST(rev_user_text as CHAR) AS username,&lt;br /&gt;
			count(*) as editcount&lt;br /&gt;
		from&lt;br /&gt;
			`mw_revision`&lt;br /&gt;
		group by &lt;br /&gt;
			monthnum,username&lt;br /&gt;
		having &lt;br /&gt;
			editcount &amp;gt;1&lt;br /&gt;
			and monthnum &amp;gt; 201001&lt;br /&gt;
			and monthnum &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
	)&lt;br /&gt;
	as&lt;br /&gt;
		active_editors&lt;br /&gt;
	group by &lt;br /&gt;
		monthnum&lt;br /&gt;
	order by &lt;br /&gt;
		monthnum&lt;br /&gt;
)&lt;br /&gt;
as user_active_edits&lt;br /&gt;
on user_active_edits.monthnum=user_edits.monthnum&lt;br /&gt;
group by&lt;br /&gt;
	user_edits.monthnum&lt;br /&gt;
order by&lt;br /&gt;
	user_edits.monthnum;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Wiki edits per month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT &lt;br /&gt;
		date_format(rev_timestamp,'%Y%m') as monthnum, &lt;br /&gt;
		count(*) as edit_count &lt;br /&gt;
	FROM `mw_revision` &lt;br /&gt;
	GROUP BY monthnum &lt;br /&gt;
	HAVING monthnum &amp;gt; 201001 and&lt;br /&gt;
		monthnum &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Wiki top editors last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     CAST(rev_user_text as CHAR) AS user,&lt;br /&gt;
     COUNT(*) AS edits&lt;br /&gt;
FROM `mw_revision`&lt;br /&gt;
WHERE date_format(rev_timestamp, '%Y%m') = date_format(${last_month}, '%Y%m') &lt;br /&gt;
GROUP BY user&lt;br /&gt;
ORDER BY edits desc&lt;br /&gt;
LIMIT 20;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Edits all time ===&lt;br /&gt;
&lt;br /&gt;
A test query, unused in the report.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     COUNT(*) AS edits&lt;br /&gt;
FROM&lt;br /&gt;
     `mw_revision` rev&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Wiki edits and new pages per month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	new_pages.monthnum,&lt;br /&gt;
	new_pages.monthstr,&lt;br /&gt;
	edit_count,&lt;br /&gt;
	count(*) as new_page_count &lt;br /&gt;
from &lt;br /&gt;
(&lt;br /&gt;
	select &lt;br /&gt;
		page_id,&lt;br /&gt;
		date_format(min(rev_timestamp),'%Y%m') as monthnum,&lt;br /&gt;
		date_format(min(rev_timestamp),'%b %y') as monthstr &lt;br /&gt;
	from &lt;br /&gt;
		mw_page, &lt;br /&gt;
		mw_revision &lt;br /&gt;
	where &lt;br /&gt;
		page_id=rev_page and &lt;br /&gt;
		date_format(rev_timestamp,'%Y%m')&amp;gt;'201001'&lt;br /&gt;
	group by page_id&lt;br /&gt;
) as new_pages &lt;br /&gt;
inner join &lt;br /&gt;
(&lt;br /&gt;
	SELECT &lt;br /&gt;
		date_format(rev_timestamp,'%Y%m') as monthnum, &lt;br /&gt;
		count(*) as edit_count &lt;br /&gt;
	FROM `mw_revision` &lt;br /&gt;
	GROUP BY monthnum &lt;br /&gt;
	HAVING monthnum &amp;gt; 201001&lt;br /&gt;
) as edits &lt;br /&gt;
on &lt;br /&gt;
	edits.monthnum=new_pages.monthnum and&lt;br /&gt;
	new_pages.monthnum &amp;lt;= date_format(${last_month},'%Y%m') &lt;br /&gt;
group by &lt;br /&gt;
	new_pages.monthnum &lt;br /&gt;
order by &lt;br /&gt;
	new_pages.monthnum;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Wiki new pages per month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     COUNT(*) AS pages,&lt;br /&gt;
     date_format(rev_timestamp,'%Y%m') as monthnum, &lt;br /&gt;
     date_format(rev_timestamp, '%M %Y') as monthstr,&lt;br /&gt;
     year(rev_timestamp) as year,&lt;br /&gt;
     monthname(rev_timestamp) as month&lt;br /&gt;
FROM&lt;br /&gt;
     `mw_revision` rev&lt;br /&gt;
WHERE&lt;br /&gt;
     date_format(rev_timestamp,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
GROUP BY&lt;br /&gt;
     year,&lt;br /&gt;
     month&lt;br /&gt;
HAVING&lt;br /&gt;
     monthnum &amp;gt; 201001&lt;br /&gt;
ORDER BY monthnum&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Wiki editors &amp;gt;1 edits per month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
	monthnum,&lt;br /&gt;
	count(*) as active_editor_count&lt;br /&gt;
from&lt;br /&gt;
(&lt;br /&gt;
	select&lt;br /&gt;
		date_format(rev_timestamp,'%Y%m') as monthnum,&lt;br /&gt;
	    	CAST(rev_user_text as CHAR) AS username,&lt;br /&gt;
		count(*) as editcount&lt;br /&gt;
	from&lt;br /&gt;
		`mw_revision`&lt;br /&gt;
	group by &lt;br /&gt;
		monthnum,username&lt;br /&gt;
	having &lt;br /&gt;
		editcount &amp;gt;1&lt;br /&gt;
	order by &lt;br /&gt;
		monthnum,editcount desc&lt;br /&gt;
)&lt;br /&gt;
as user_edits&lt;br /&gt;
group by&lt;br /&gt;
	monthnum&lt;br /&gt;
order by&lt;br /&gt;
	monthnum&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Top editors all time ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT CAST(rev_user_text as CHAR) AS user,&lt;br /&gt;
       COUNT(*) AS c&lt;br /&gt;
FROM `mw_revision` AS rev&lt;br /&gt;
GROUP BY user&lt;br /&gt;
ORDER BY c desc&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== &amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; database schema ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;nowiki&amp;gt;MediaWiki&amp;lt;/nowiki&amp;gt; uses 41 different tables in version 1.15. Not all of them are useful for community metrics. Those with potentially useful information are:&lt;br /&gt;
&lt;br /&gt;
* mw_page: Each page in the wiki has one entry in the &amp;quot;page&amp;quot; table, containing its namespace and title, a unique ID, and a pointer to the latest revision.&lt;br /&gt;
* mw_revision: Every edit made in the wiki is recorded in the &amp;quot;revision&amp;quot; table. Information recorded includes the author, timestamp, and a pointer to the text of the page after the revision.&lt;br /&gt;
* mw_text: Contains the page text for a given revision of a given page. Content analysis is potentially possible.&lt;br /&gt;
* mw_user: Allows cross-referencing of usernames and user IDs, among other things.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
What follows is the schema for the &amp;quot;important&amp;quot; tables (user, page, revision, text):&lt;br /&gt;
&lt;br /&gt;
=== Table 'user' ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mw_user` (&lt;br /&gt;
  `user_id` int(10) unsigned NOT NULL,&lt;br /&gt;
  `user_name` varbinary(255) NOT NULL DEFAULT '',&lt;br /&gt;
  `user_real_name` varbinary(255) NOT NULL DEFAULT '',&lt;br /&gt;
  `user_password` tinyblob NOT NULL,&lt;br /&gt;
  `user_newpassword` tinyblob NOT NULL,&lt;br /&gt;
  `user_newpass_time` binary(14) DEFAULT NULL,&lt;br /&gt;
  `user_email` tinyblob NOT NULL,&lt;br /&gt;
  `user_options` blob NOT NULL,&lt;br /&gt;
  `user_touched` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0',&lt;br /&gt;
  `user_token` binary(32) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0',&lt;br /&gt;
  `user_email_authenticated` binary(14) DEFAULT NULL,&lt;br /&gt;
  `user_email_token` binary(32) DEFAULT NULL,&lt;br /&gt;
  `user_email_token_expires` binary(14) DEFAULT NULL,&lt;br /&gt;
  `user_registration` binary(14) DEFAULT NULL,&lt;br /&gt;
  `user_editcount` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`user_id`),&lt;br /&gt;
  UNIQUE KEY `user_name` (`user_name`),&lt;br /&gt;
  KEY `user_email_token` (`user_email_token`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table 'page' ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mw_page` (&lt;br /&gt;
  `page_id` int(10) unsigned NOT NULL,&lt;br /&gt;
  `page_namespace` int(11) NOT NULL,&lt;br /&gt;
  `page_title` varbinary(255) NOT NULL,&lt;br /&gt;
  `page_restrictions` tinyblob NOT NULL,&lt;br /&gt;
  `page_counter` bigint(20) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `page_is_redirect` tinyint(3) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `page_is_new` tinyint(3) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `page_random` double unsigned NOT NULL,&lt;br /&gt;
  `page_touched` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0',&lt;br /&gt;
  `page_latest` int(10) unsigned NOT NULL,&lt;br /&gt;
  `page_len` int(10) unsigned NOT NULL,&lt;br /&gt;
  PRIMARY KEY (`page_id`),&lt;br /&gt;
  UNIQUE KEY `name_title` (`page_namespace`,`page_title`),&lt;br /&gt;
  KEY `page_random` (`page_random`),&lt;br /&gt;
  KEY `page_len` (`page_len`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table 'revision' ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mw_revision` (&lt;br /&gt;
  `rev_id` int(10) unsigned NOT NULL,&lt;br /&gt;
  `rev_page` int(10) unsigned NOT NULL,&lt;br /&gt;
  `rev_text_id` int(10) unsigned NOT NULL,&lt;br /&gt;
  `rev_comment` tinyblob NOT NULL,&lt;br /&gt;
  `rev_user` int(10) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `rev_user_text` varbinary(255) NOT NULL DEFAULT '',&lt;br /&gt;
  `rev_timestamp` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0',&lt;br /&gt;
  `rev_minor_edit` tinyint(3) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `rev_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',&lt;br /&gt;
  `rev_len` int(10) unsigned DEFAULT NULL,&lt;br /&gt;
  `rev_parent_id` int(10) unsigned DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`rev_id`),&lt;br /&gt;
  UNIQUE KEY `rev_page_id` (`rev_page`,`rev_id`),&lt;br /&gt;
  KEY `rev_timestamp` (`rev_timestamp`),&lt;br /&gt;
  KEY `page_timestamp` (`rev_page`,`rev_timestamp`),&lt;br /&gt;
  KEY `user_timestamp` (`rev_user`,`rev_timestamp`),&lt;br /&gt;
  KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table 'text' ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mw_text` (&lt;br /&gt;
  `old_id` int(10) unsigned NOT NULL,&lt;br /&gt;
  `old_text` mediumblob NOT NULL,&lt;br /&gt;
  `old_flags` tinyblob NOT NULL,&lt;br /&gt;
  PRIMARY KEY (`old_id`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Mailing_list_queries</id>
		<title>Metrics/Mailing list queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Mailing_list_queries"/>
				<updated>2011-10-12T16:22:55Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Add schema for mlstats&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;These are all of the queries which we have used for mailing list statistics in the monthly report:&lt;br /&gt;
&lt;br /&gt;
Note: last_month is a parameter, set once in the master report, representing the month and year of the previous month in the format 'YYYYMM'. For example, in September 2011, last_month is '201108'.&lt;br /&gt;
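&lt;br /&gt;
As a rough illustration of how such a value could be derived, here is a minimal Python sketch. The function name is hypothetical, and the report tool may well compute the parameter differently.&lt;br /&gt;

```python
from datetime import date


def last_month(today):
    """Return the previous calendar month as a 'YYYYMM' string."""
    year, month = today.year, today.month - 1
    if month == 0:
        # January rolls back to December of the previous year.
        year, month = year - 1, 12
    return f"{year:04d}{month:02d}"


print(last_month(date(2011, 9, 15)))  # 201108
```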
&lt;br /&gt;
=== ML posts per list last 4 months ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;nowiki&amp;gt;SELECT&lt;br /&gt;
     if(messages.mailing_list_url like '%-community%' or &lt;br /&gt;
        messages.mailing_list_url like '%meego-dev%' or&lt;br /&gt;
        messages.mailing_list_url like '%-sdk%' or&lt;br /&gt;
        messages.mailing_list_url like '%-packaging%' or&lt;br /&gt;
        messages.mailing_list_url like '%-qa%',&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),&lt;br /&gt;
     'Other') as list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS posts&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
	date_format(first_date,'%Y%m') &amp;gt;= date_format(DATE_SUB(NOW(), INTERVAL 4 MONTH),'%Y%m') and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
GROUP BY&lt;br /&gt;
     list,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     list ASC,&lt;br /&gt;
     monthnum ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count per list name first ===&lt;br /&gt;
&lt;br /&gt;
This query is unused in the report, I think. Left in for posterity.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     list ASC,&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML repeat posters last month ===&lt;br /&gt;
&lt;br /&gt;
Provides the number of active mailing list participants, that is, those who sent more than one email during the month.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
    monthname(${last_month}) as monthname,&lt;br /&gt;
    year(${last_month}) as y,&lt;br /&gt;
    count(*) as c&lt;br /&gt;
from&lt;br /&gt;
    ( select &lt;br /&gt;
          p.email_address as member,&lt;br /&gt;
          count(*) AS c&lt;br /&gt;
      from messages as m,&lt;br /&gt;
           messages_people as p&lt;br /&gt;
      where&lt;br /&gt;
          m.message_id=p.message_ID and&lt;br /&gt;
          month(m.first_date)=month(${last_month}) and&lt;br /&gt;
          year(m.first_date)=year(${last_month}) and&lt;br /&gt;
          p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
      group by p.email_address&lt;br /&gt;
    )&lt;br /&gt;
as repeat_posters&lt;br /&gt;
where c&amp;gt;=2;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML total mails last month ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     monthname(${last_month}) as monthname,&lt;br /&gt;
     count(*) AS message_count&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     month(first_date) = month(${last_month}) and &lt;br /&gt;
     year(first_date) = year(${last_month}) and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%'&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML popular threads last month ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	subject,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from messages&lt;br /&gt;
where year(first_date)=year(${last_month})&lt;br /&gt;
  and month(first_date)=month(${last_month})&lt;br /&gt;
group by subject&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 10;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML top 10 posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	p.email_address,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where &lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
group by p.email_address&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 10;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count top 5 lists time first ===&lt;br /&gt;
&lt;br /&gt;
This query is used to graph the evolution of each mailing list over time. The top 5 lists are shown individually, and the others are grouped together as &amp;quot;Other&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
    if(messages.mailing_list_url like '%-community%' or &lt;br /&gt;
       messages.mailing_list_url like '%meego-dev%' or&lt;br /&gt;
       messages.mailing_list_url like '%-sdk%' or&lt;br /&gt;
       messages.mailing_list_url like '%-packaging%' or&lt;br /&gt;
       messages.mailing_list_url like '%-qa%',&lt;br /&gt;
    substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),&lt;br /&gt;
    'Other') as list,&lt;br /&gt;
    year(first_date) AS y,&lt;br /&gt;
    monthname(first_date) AS mon,&lt;br /&gt;
    month(first_date) AS m,&lt;br /&gt;
    date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
    date_format(first_date,'%Y%m') as monthnum,&lt;br /&gt;
    count(*) AS c&lt;br /&gt;
FROM      &lt;br /&gt;
    `messages` &lt;br /&gt;
WHERE      &lt;br /&gt;
    year(first_date) &amp;gt; 1979 and       &lt;br /&gt;
    date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m') and &lt;br /&gt;
    messages.mailing_list_url not like '%meego-commit%'&lt;br /&gt;
GROUP BY&lt;br /&gt;
    list,y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
    monthnum ASC,&lt;br /&gt;
    list asc,&lt;br /&gt;
    c asc;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
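&lt;br /&gt;
The list bucketing done by the if(...) expression in this query can be mirrored outside SQL. The Python sketch below is only an illustration: the TRACKED tuple and the list_label function are hypothetical names, and the example URL is made up.&lt;br /&gt;

```python
# Substrings that identify the lists shown individually in the graph.
TRACKED = ("-community", "meego-dev", "-sdk", "-packaging", "-qa")


def list_label(url):
    """Return the list name for tracked lists, or 'Other' for the rest."""
    if any(marker in url for marker in TRACKED):
        # Same idea as substring_index(TRIM(TRAILING '/' FROM url), '/', -1).
        return url.rstrip("/").rsplit("/", 1)[-1]
    return "Other"


print(list_label("http://lists.example.org/listinfo/meego-dev/"))
```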
&lt;br /&gt;
=== ML top 20 posters last month ===&lt;br /&gt;
&lt;br /&gt;
I should really have done a &amp;quot;top N posters last month&amp;quot; and made N a parameter. This was the quick copy &amp;amp; paste way to go.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	p.email_address,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where &lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
group by p.email_address&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 20;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
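&lt;br /&gt;
A single parameterised version of the top 10 and top 20 queries might look like the Python sketch below. The ${top_n} placeholder and the top_posters_query helper are hypothetical; the report itself only defines ${last_month}.&lt;br /&gt;

```python
from string import Template

# Same query as above, with the row limit turned into a placeholder.
TOP_POSTERS = Template("""\
select p.email_address, count(*) as c
from messages as m, messages_people as p
where m.message_id = p.message_ID
  and month(m.first_date) = month(${last_month})
  and year(m.first_date) = year(${last_month})
  and p.email_address NOT LIKE '%no_reply@%'
group by p.email_address
order by c desc
limit ${top_n};
""")


def top_posters_query(n):
    """Build the 'top N posters last month' query for a positive integer N."""
    limit = int(n)
    if 1 > limit:
        raise ValueError("n must be a positive integer")
    # safe_substitute leaves ${last_month} for the report engine to fill in.
    return TOP_POSTERS.safe_substitute(top_n=limit)


print(top_posters_query(20))
```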
&lt;br /&gt;
=== ML posters count last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	count(distinct p.email_address) as member_count,&lt;br /&gt;
	monthname(${last_month}) AS monthname&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where&lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%';&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count per list time first ===&lt;br /&gt;
&lt;br /&gt;
Covers all mailing lists individually. The resulting graph was far too cluttered.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
GROUP BY&lt;br /&gt;
     list,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== MLStats database schema ==&lt;br /&gt;
&lt;br /&gt;
=== Table compressed_files ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `compressed_files` (&lt;br /&gt;
  `url` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `mailing_list_url` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `status` enum('new','visited','failed') DEFAULT NULL,&lt;br /&gt;
  `last_analysis` datetime DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`url`),&lt;br /&gt;
  KEY `mailing_list_url` (`mailing_list_url`),&lt;br /&gt;
  CONSTRAINT `compressed_files_ibfk_1` FOREIGN KEY (`mailing_list_url`) REFERENCES `mailing_lists` (`mailing_list_url`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table mailing_lists ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mailing_lists` (&lt;br /&gt;
  `mailing_list_url` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `mailing_list_name` varchar(255) CHARACTER SET utf8 DEFAULT 'NULL',&lt;br /&gt;
  `project_name` varchar(255) CHARACTER SET utf8 DEFAULT 'NULL',&lt;br /&gt;
  `last_analysis` datetime DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`mailing_list_url`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table mailing_lists_people ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `mailing_lists_people` (&lt;br /&gt;
  `email_address` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `mailing_list_url` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  PRIMARY KEY (`email_address`,`mailing_list_url`),&lt;br /&gt;
  KEY `mailing_list_url` (`mailing_list_url`),&lt;br /&gt;
  CONSTRAINT `mailing_lists_people_ibfk_1` FOREIGN KEY (`mailing_list_url`) REFERENCES `mailing_lists` (`mailing_list_url`) ON DELETE CASCADE ON UPDATE CASCADE,&lt;br /&gt;
  CONSTRAINT `mailing_lists_people_ibfk_2` FOREIGN KEY (`email_address`) REFERENCES `people` (`email_address`) ON DELETE CASCADE ON UPDATE CASCADE&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table messages ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `messages` (&lt;br /&gt;
  `message_ID` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `mailing_list_url` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `mailing_list` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `first_date` datetime DEFAULT NULL,&lt;br /&gt;
  `first_date_tz` int(11) DEFAULT NULL,&lt;br /&gt;
  `arrival_date` datetime DEFAULT NULL,&lt;br /&gt;
  `arrival_date_tz` int(11) DEFAULT NULL,&lt;br /&gt;
  `subject` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `message_body` text CHARACTER SET utf8,&lt;br /&gt;
  `is_response_of` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `mail_path` text CHARACTER SET utf8,&lt;br /&gt;
  PRIMARY KEY (`message_ID`),&lt;br /&gt;
  KEY `response` (`is_response_of`),&lt;br /&gt;
  KEY `mailing_list_url` (`mailing_list_url`),&lt;br /&gt;
  CONSTRAINT `messages_ibfk_1` FOREIGN KEY (`mailing_list_url`) REFERENCES `mailing_lists` (`mailing_list_url`) ON DELETE CASCADE ON UPDATE CASCADE&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table messages_people ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `messages_people` (&lt;br /&gt;
  `type_of_recipient` enum('From','To','Cc') NOT NULL DEFAULT 'From',&lt;br /&gt;
  `message_id` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `email_address` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  PRIMARY KEY (`type_of_recipient`,`message_id`,`email_address`),&lt;br /&gt;
  KEY `m_id` (`message_id`),&lt;br /&gt;
  KEY `message_id` (`message_id`),&lt;br /&gt;
  KEY `email_address` (`email_address`),&lt;br /&gt;
  CONSTRAINT `messages_people_ibfk_1` FOREIGN KEY (`message_id`) REFERENCES `messages` (`message_ID`) ON DELETE CASCADE ON UPDATE CASCADE,&lt;br /&gt;
  CONSTRAINT `messages_people_ibfk_2` FOREIGN KEY (`email_address`) REFERENCES `people` (`email_address`) ON DELETE CASCADE ON UPDATE CASCADE&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Table people ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
CREATE TABLE `people` (&lt;br /&gt;
  `email_address` varchar(255) CHARACTER SET utf8 NOT NULL,&lt;br /&gt;
  `name` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `username` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `domain_name` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  `top_level_domain` varchar(255) CHARACTER SET utf8 DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`email_address`)&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Mailing_list_queries</id>
		<title>Metrics/Mailing list queries</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Mailing_list_queries"/>
				<updated>2011-10-11T16:33:41Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: MLStats queries&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;These are all of the queries which we have used for mailing list statistics in the monthly report:&lt;br /&gt;
&lt;br /&gt;
Note: last_month is a parameter, set once in the master report, representing the month and year of the previous month in the format 'YYYYMM'. For example, in September 2011, last_month is '201108'.&lt;br /&gt;
&lt;br /&gt;
=== ML posts per list last 4 months ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;nowiki&amp;gt;SELECT&lt;br /&gt;
     if(messages.mailing_list_url like '%-community%' or &lt;br /&gt;
        messages.mailing_list_url like '%meego-dev%' or&lt;br /&gt;
        messages.mailing_list_url like '%-sdk%' or&lt;br /&gt;
        messages.mailing_list_url like '%-packaging%' or&lt;br /&gt;
        messages.mailing_list_url like '%-qa%',&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),&lt;br /&gt;
     'Other') as list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS posts&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
	date_format(first_date,'%Y%m') &amp;gt;= date_format(DATE_SUB(NOW(), INTERVAL 4 MONTH),'%Y%m') and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
GROUP BY&lt;br /&gt;
     list,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     list ASC,&lt;br /&gt;
     monthnum ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count per list name first ===&lt;br /&gt;
&lt;br /&gt;
This query is unused in the report, I think. Left in for posterity.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     list ASC,&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML repeat posters last month ===&lt;br /&gt;
&lt;br /&gt;
Provides the number of active mailing list participants, that is, those who sent more than one email during the month.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select&lt;br /&gt;
    monthname(${last_month}) as monthname,&lt;br /&gt;
    year(${last_month}) as y,&lt;br /&gt;
    count(*) as c&lt;br /&gt;
from&lt;br /&gt;
    ( select &lt;br /&gt;
          p.email_address as member,&lt;br /&gt;
          count(*) AS c&lt;br /&gt;
      from messages as m,&lt;br /&gt;
           messages_people as p&lt;br /&gt;
      where&lt;br /&gt;
          m.message_id=p.message_ID and&lt;br /&gt;
          month(m.first_date)=month(${last_month}) and&lt;br /&gt;
          year(m.first_date)=year(${last_month}) and&lt;br /&gt;
          p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
      group by p.email_address&lt;br /&gt;
    )&lt;br /&gt;
as repeat_posters&lt;br /&gt;
where c&amp;gt;=2;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML total mails last month ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     monthname(${last_month}) as monthname,&lt;br /&gt;
     count(*) AS message_count&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     month(first_date) = month(${last_month}) and &lt;br /&gt;
     year(first_date) = year(${last_month}) and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%'&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML popular threads last month ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	subject,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from messages&lt;br /&gt;
where year(first_date)=year(${last_month})&lt;br /&gt;
  and month(first_date)=month(${last_month})&lt;br /&gt;
group by subject&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 10;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML top 10 posters last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	p.email_address,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where &lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
group by p.email_address&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 10;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count top 5 lists time first ===&lt;br /&gt;
&lt;br /&gt;
This query is used to graph the evolution of each mailing list over time. The top 5 lists are reported individually, and all the others are grouped together as &amp;quot;Other&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
    if(messages.mailing_list_url like '%-community%' or &lt;br /&gt;
       messages.mailing_list_url like '%meego-dev%' or&lt;br /&gt;
       messages.mailing_list_url like '%-sdk%' or&lt;br /&gt;
       messages.mailing_list_url like '%-packaging%' or&lt;br /&gt;
       messages.mailing_list_url like '%-qa%',&lt;br /&gt;
    substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1),&lt;br /&gt;
    'Other') as list,&lt;br /&gt;
    year(first_date) AS y,&lt;br /&gt;
    monthname(first_date) AS mon,&lt;br /&gt;
    month(first_date) AS m,&lt;br /&gt;
    date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
    date_format(first_date,'%Y%m') as monthnum,&lt;br /&gt;
    count(*) AS c&lt;br /&gt;
FROM      &lt;br /&gt;
    `messages` &lt;br /&gt;
WHERE      &lt;br /&gt;
    year(first_date) &amp;gt; 1979 and       &lt;br /&gt;
    date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m') and &lt;br /&gt;
    messages.mailing_list_url not like '%meego-commit%'&lt;br /&gt;
GROUP BY&lt;br /&gt;
    list,y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
    monthnum ASC,&lt;br /&gt;
    list asc,&lt;br /&gt;
    c asc;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML top 20 posters last month ===&lt;br /&gt;
&lt;br /&gt;
I should really have done a &amp;quot;top N posters last month&amp;quot; and made N a parameter. This was the quick copy &amp;amp; paste way to go.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	p.email_address,&lt;br /&gt;
	count(*) as c&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where &lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%'&lt;br /&gt;
group by p.email_address&lt;br /&gt;
order by c desc&lt;br /&gt;
limit 20;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
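&lt;br /&gt;
A possible shape for the parameterised &amp;quot;top N&amp;quot; version, as a sketch only. It assumes plain MySQL session variables (@last_month and @top_n are hypothetical names standing in for report parameters) and a prepared statement, so that the row limit can be passed as a placeholder.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
-- Hypothetical parameters: any date in the month of interest, and N&lt;br /&gt;
SET @last_month = DATE_SUB(CURDATE(), INTERVAL 1 MONTH);&lt;br /&gt;
SET @top_n = 10;&lt;br /&gt;
&lt;br /&gt;
-- Same query as above, with the date and the limit as placeholders&lt;br /&gt;
PREPARE top_posters FROM&lt;br /&gt;
    'select p.email_address, count(*) as c&lt;br /&gt;
     from messages as m, messages_people as p&lt;br /&gt;
     where m.message_id=p.message_ID&lt;br /&gt;
       and month(m.first_date)=month(?)&lt;br /&gt;
       and year(m.first_date)=year(?)&lt;br /&gt;
       and p.email_address NOT LIKE ''%no_reply@%''&lt;br /&gt;
     group by p.email_address&lt;br /&gt;
     order by c desc&lt;br /&gt;
     limit ?';&lt;br /&gt;
&lt;br /&gt;
EXECUTE top_posters USING @last_month, @last_month, @top_n;&lt;br /&gt;
DEALLOCATE PREPARE top_posters;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;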
&lt;br /&gt;
=== ML posters count last month ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
select &lt;br /&gt;
	count(distinct p.email_address) as member_count,&lt;br /&gt;
	monthname(${last_month}) AS monthname&lt;br /&gt;
from&lt;br /&gt;
	messages as m,&lt;br /&gt;
	messages_people as p &lt;br /&gt;
where&lt;br /&gt;
	m.message_id=p.message_ID&lt;br /&gt;
 and month(m.first_date)=month(${last_month})&lt;br /&gt;
 and year(m.first_date)=year(${last_month})&lt;br /&gt;
 and p.email_address NOT LIKE '%no_reply@%';&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== ML message count per list time first ===&lt;br /&gt;
&lt;br /&gt;
All mailing lists, reported individually. The resulting graph was far too cluttered.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;nowiki&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     substring_index(TRIM(TRAILING '/' FROM `messages`.`mailing_list_url`), '/', -1) AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%b %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     date_format(first_date,'%Y%m') &amp;lt;= date_format(${last_month},'%Y%m')&lt;br /&gt;
GROUP BY&lt;br /&gt;
     list,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC;&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-10-11T16:21:44Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Architecture */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly - depending on the metric, that could be in real time or automatically at regular intervals.&lt;br /&gt;
&lt;br /&gt;
The dashboard will track the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports, for use and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]].&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using some sort of Business Intelligence platform including ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Candidate reporting solutions:&lt;br /&gt;
&lt;br /&gt;
* [http://jasperforge.org/index.php?q=project/jasperreports JasperReports]&lt;br /&gt;
* [http://www.pentaho.com/ Pentaho]&lt;br /&gt;
&lt;br /&gt;
The following are essentially ETL engines, and do not provide reporting or dashboard functionality:&lt;br /&gt;
&lt;br /&gt;
* [http://www.talend.com/index.php Talend]&lt;br /&gt;
* [http://petals.ow2.org/ Petals]&lt;br /&gt;
&lt;br /&gt;
[http://www.mulesoft.com/ MuleSoft] is an open source ESB, but does not seem adapted to our needs. The field is thus narrowed to Pentaho and JasperReports.&lt;br /&gt;
&lt;br /&gt;
For each community resource, we need to figure out how to get the data into a usable form, and come up with appropriate queries for metrics reports, and finally present the results on a webpage.&lt;br /&gt;
&lt;br /&gt;
=== Business intelligence engines ===&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from it&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well-understood format, and loading it into your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
So, in short, the community dashboard project will likely use an ETL to plug data into an OLAP server, and then use a business reporting engine to query that data and present it in a dashboard.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, I prefer to use MySQL, both because it is standard &amp;amp; well understood and because it allows consolidation of databases under one DB server. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security related or confidential information (mainly bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
* [[../Creating a report]]: Given data in a database, how do we generate a report in Pentaho, and deploy it to the dashboard?&lt;br /&gt;
&lt;br /&gt;
* [[../Mailing list queries]]: SQL queries against the MLStats database&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, this implies that the server where the dashboard will run should have access to the database server for MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we will integrate the CSV files currently being exported, which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Individual mailing lists will be parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We will use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
Git repositories will be queried with &amp;quot;git log&amp;quot;, and the output parsed with the parser module from [http://lwn.net/Articles/290957/ gitdm], before being stored directly in a database. We will be able to run analytics on the results from there. gitdm can also do basic analytics of git logs, and we may decide to simply reuse gitdm's analytics. However, if we want to extend them, we will want to have the raw data.&lt;br /&gt;
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data to report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject, year(first_date) as y, monthname(first_date) as mon, count(*) as c from messages group by subject, year(first_date), month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address, year(m.first_date) as y, monthname(m.first_date) as mon, count(*) as c from messages as m, messages_people as p where m.message_id=p.message_ID group by p.email_address, year(m.first_date), month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give totals for each month of each year, over the whole archive.&lt;br /&gt;
*: The easiest way to restrict them is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
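&lt;br /&gt;
The month restriction above can also be written as a half-open date range, which avoids wrapping the column in functions. A minimal sketch, assuming first_date is a DATE or DATETIME column (the bounds shown are for March 2010):&lt;br /&gt;
&lt;br /&gt;
 -- Posts per thread for March 2010, most active threads first&lt;br /&gt;
 select subject, count(*) as c&lt;br /&gt;
 from messages&lt;br /&gt;
 where first_date &amp;gt;= '2010-03-01'&lt;br /&gt;
   and first_date &amp;lt; '2010-04-01'&lt;br /&gt;
 group by subject&lt;br /&gt;
 order by c desc;&lt;br /&gt;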
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
==== Queries ====&lt;br /&gt;
&lt;br /&gt;
A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
&lt;br /&gt;
For the following group-by-month queries, I did a cross join of (2008,2009,2010,2011) and (01-12) to generate a &amp;quot;year and month&amp;quot; data table.&lt;br /&gt;
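&lt;br /&gt;
A sketch of how such a table could be built. The table and column names are the ones used by the queries below; the column types and the use of zero-padded strings (so that the LIKE prefix lines up with MediaWiki's YYYYMMDDHHMMSS timestamps) are assumptions:&lt;br /&gt;
&lt;br /&gt;
 -- Cross join of four years and twelve months: 48 rows&lt;br /&gt;
 CREATE TABLE years_months AS&lt;br /&gt;
 SELECT y.timestamp_year, m.timestamp_month&lt;br /&gt;
 FROM (SELECT '2008' AS timestamp_year&lt;br /&gt;
       UNION ALL SELECT '2009'&lt;br /&gt;
       UNION ALL SELECT '2010'&lt;br /&gt;
       UNION ALL SELECT '2011') AS y&lt;br /&gt;
 CROSS JOIN&lt;br /&gt;
      (SELECT '01' AS timestamp_month&lt;br /&gt;
       UNION ALL SELECT '02' UNION ALL SELECT '03' UNION ALL SELECT '04'&lt;br /&gt;
       UNION ALL SELECT '05' UNION ALL SELECT '06' UNION ALL SELECT '07'&lt;br /&gt;
       UNION ALL SELECT '08' UNION ALL SELECT '09' UNION ALL SELECT '10'&lt;br /&gt;
       UNION ALL SELECT '11' UNION ALL SELECT '12') AS m;&lt;br /&gt;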
&lt;br /&gt;
'''Top editors by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year AS yyyy,&lt;br /&gt;
        mon.timestamp_month AS mm,&lt;br /&gt;
        rev_user_text AS user,&lt;br /&gt;
        COUNT(*) AS c&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months AS mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm,user&lt;br /&gt;
 HAVING c&amp;gt;5&lt;br /&gt;
 ORDER BY yyyy,mm,c desc;&lt;br /&gt;
&lt;br /&gt;
'''Number of edits by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*) AS edits&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
'''New pages per month:'''&lt;br /&gt;
Getting the number of new pages per month is a bit trickier: first we need to query $revision to get the page_ids and their dates of creation, then group by date. The query is O(n²) in the number of pages, although it should be possible to make it O(n) by grouping the result of the subquery without doing in() on the list of timestamps.&lt;br /&gt;
&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*)&lt;br /&gt;
 FROM mw_revision as rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE CONCAT(CONCAT(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
   AND rev.rev_timestamp in (&lt;br /&gt;
                SELECT MIN(rev_timestamp)&lt;br /&gt;
                FROM mw_revision&lt;br /&gt;
                GROUP BY rev_page)&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
To get just the list of pages &amp;amp; timestamps (this is used as the subquery for above):&lt;br /&gt;
 SELECT rev_page as p,&lt;br /&gt;
        MIN(rev_timestamp) as t&lt;br /&gt;
 FROM mw_revision&lt;br /&gt;
 GROUP BY rev_page;&lt;br /&gt;
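&lt;br /&gt;
The O(n) variant mentioned above could look like the following sketch, which groups the per-page first revisions directly instead of matching them with in(). It assumes rev_timestamp holds YYYYMMDDHHMMSS strings; months without any new page produce no row:&lt;br /&gt;
&lt;br /&gt;
 -- One row per page (its first revision), then grouped by year and month&lt;br /&gt;
 SELECT LEFT(first_rev.t, 4) AS yyyy,&lt;br /&gt;
        SUBSTRING(first_rev.t, 5, 2) AS mm,&lt;br /&gt;
        COUNT(*) AS new_pages&lt;br /&gt;
 FROM (SELECT rev_page AS p,&lt;br /&gt;
              MIN(rev_timestamp) AS t&lt;br /&gt;
       FROM mw_revision&lt;br /&gt;
       GROUP BY rev_page) AS first_rev&lt;br /&gt;
 GROUP BY yyyy, mm&lt;br /&gt;
 ORDER BY yyyy, mm;&lt;br /&gt;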
&lt;br /&gt;
=== IRC ===&lt;br /&gt;
&lt;br /&gt;
superseriousstats does some preliminary analysis on data it stores in its database. Its author (tommyrot) has kindly added a parser for the format of the IRC logs we use (supybot) at my request. The [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v3.sql database schema] is a little hard to work out; several key tables have fields with undescriptive names like l_01. There are some queries in [https://github.com/tommyrot/superseriousstats/blob/master/html.class.php html.class.php] which we can use to generate some reports, though.&lt;br /&gt;
 &lt;br /&gt;
* Total IRC activity (by hour)&lt;br /&gt;
 select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
   from `channel`&lt;br /&gt;
* Total active participants (+ evolution) - we may be able to get &amp;quot;number of participants per hour/day/month&amp;quot; (so you can see if it's two people talking amongst themselves or a larger group) - I'll ask tommyrot what the query should look like.&lt;br /&gt;
* Top contributors (per month)&lt;br /&gt;
 select `q_lines`.`ruid`, `csnick`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as `l_total`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as `l_night`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as `l_morning`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as `l_evening`,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
    join `q_activity_by_month` on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
    join `user_status` on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
    join `user_details` on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
    where `status` != 3&lt;br /&gt;
      and `date` = '2011-02'&lt;br /&gt;
    group by `q_lines`.`ruid`&lt;br /&gt;
    order by `q_activity_by_month`.`l_total` desc, `q_lines`.`ruid` asc limit 30&lt;br /&gt;
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
We are using a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object, each Patch has an Author, and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed-off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers he has worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
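&lt;br /&gt;
One way these structures could map onto MySQL tables, as a sketch only. All table and column names below are hypothetical, not taken from gitdm, and the employment history (a Hacker's list of Employers) is reduced here to the employer recorded on each patch:&lt;br /&gt;
&lt;br /&gt;
 -- Hypothetical schema; names are illustrative, not gitdm's own&lt;br /&gt;
 CREATE TABLE employer (&lt;br /&gt;
     employer_id INT AUTO_INCREMENT PRIMARY KEY,&lt;br /&gt;
     name        VARCHAR(255) NOT NULL UNIQUE&lt;br /&gt;
 );&lt;br /&gt;
 CREATE TABLE hacker (&lt;br /&gt;
     hacker_id INT AUTO_INCREMENT PRIMARY KEY,&lt;br /&gt;
     name      VARCHAR(255) NOT NULL&lt;br /&gt;
 );&lt;br /&gt;
 -- Every email address a hacker has used to commit&lt;br /&gt;
 CREATE TABLE hacker_email (&lt;br /&gt;
     email     VARCHAR(255) PRIMARY KEY,&lt;br /&gt;
     hacker_id INT NOT NULL,&lt;br /&gt;
     FOREIGN KEY (hacker_id) REFERENCES hacker (hacker_id)&lt;br /&gt;
 );&lt;br /&gt;
 -- One row per changeset, with its author and the author's employer at the time&lt;br /&gt;
 CREATE TABLE patch (&lt;br /&gt;
     commit_id   CHAR(40) PRIMARY KEY,&lt;br /&gt;
     author_id   INT NOT NULL,&lt;br /&gt;
     employer_id INT NOT NULL,&lt;br /&gt;
     commit_date DATETIME NOT NULL,&lt;br /&gt;
     FOREIGN KEY (author_id) REFERENCES hacker (hacker_id),&lt;br /&gt;
     FOREIGN KEY (employer_id) REFERENCES employer (employer_id)&lt;br /&gt;
 );&lt;br /&gt;
 -- Reviewed, reported, signed-off and tested credits for a patch&lt;br /&gt;
 CREATE TABLE patch_credit (&lt;br /&gt;
     commit_id CHAR(40) NOT NULL,&lt;br /&gt;
     hacker_id INT NOT NULL,&lt;br /&gt;
     credit    ENUM('reviewed', 'reported', 'signed-off', 'tested') NOT NULL,&lt;br /&gt;
     PRIMARY KEY (commit_id, hacker_id, credit),&lt;br /&gt;
     FOREIGN KEY (commit_id) REFERENCES patch (commit_id),&lt;br /&gt;
     FOREIGN KEY (hacker_id) REFERENCES hacker (hacker_id)&lt;br /&gt;
 );&lt;br /&gt;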
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Not yet in scope ==&lt;br /&gt;
&lt;br /&gt;
I have not yet considered how I might get web analytics and download stats.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-09-30T13:07:25Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Data to report */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly - depending on the metric, that could be in real time or automatically at regular intervals.&lt;br /&gt;
&lt;br /&gt;
The dashboard will track the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports, for use and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]].&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using some sort of Business Intelligence platform including ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Candidate reporting solutions:&lt;br /&gt;
&lt;br /&gt;
* [http://jasperforge.org/index.php?q=project/jasperreports JasperReports]&lt;br /&gt;
* [http://www.pentaho.com/ Pentaho]&lt;br /&gt;
&lt;br /&gt;
The following are essentially ETL engines, and do not provide reporting or dashboard functionality:&lt;br /&gt;
&lt;br /&gt;
* [http://www.talend.com/index.php Talend]&lt;br /&gt;
* [http://petals.ow2.org/ Petals]&lt;br /&gt;
&lt;br /&gt;
[http://www.mulesoft.com/ MuleSoft] is an open source ESB, but does not seem adapted to our needs. The field is thus narrowed to Pentaho and JasperReports.&lt;br /&gt;
&lt;br /&gt;
For each community resource, we need to figure out how to get the data into a usable form, and come up with appropriate queries for metrics reports, and finally present the results on a webpage.&lt;br /&gt;
&lt;br /&gt;
=== Business intelligence engines ===&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from it&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well-understood format, and loading it into your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
So, in short, the community dashboard project will likely use an ETL to plug data into an OLAP server, and then use a business reporting engine to query that data and present it in a dashboard.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla's metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, I prefer to use MySQL, both because it is standard &amp;amp; well understood and because it allows consolidation of databases under one DB server. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security-related or confidential information (mainly Bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
* [[../Creating a report]]: Given data in a database, how do we generate a report in Pentaho, and deploy it to the dashboard?&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, this implies that the server where the dashboard will run should have access to the database server for MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we will integrate the CSV files currently being exported, which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Individual mailing lists will be parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We will use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
Git repositories will be queried with &amp;quot;git log&amp;quot;, and parsed with the parser module from [http://lwn.net/Articles/290957/ gitdm], before being stored directly in a database. We will be able to run analytics on the results from there. gitdm can also do basic analytics of git logs, and we may decide to simply reuse gitdm's analytics. However, if we want to extend them, we will want to have the raw data.&lt;br /&gt;
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data to report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject, year(first_date) as y, monthname(first_date), count(*) as c from messages group by subject, y, month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address, year(m.first_date) as y, monthname(m.first_date), count(*) as c from messages as m, messages_people as p where m.message_id=p.message_id group by p.email_address, y, month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give month-by-month totals, grouped by year as well as month so that the same month in different years is counted separately.&lt;br /&gt;
*: The easiest way to restrict the range is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
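&lt;br /&gt;
A minimal sketch of the per-thread monthly count, run from Python against an in-memory SQLite database so that it can be tried without an mlstats installation. The two-column table is a cut-down, hypothetical stand-in for the mlstats messages table, the sample rows are invented, and SQLite's strftime() stands in for MySQL's year() and month(). It groups on the year as well as the month, so that March 2010 and March 2011 are counted separately.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Invented sample rows standing in for the mlstats "messages" table.
SAMPLE = [
    ("Release plan", "2010-03-02 10:00:00"),
    ("Release plan", "2010-03-05 11:30:00"),
    ("Release plan", "2011-03-01 09:15:00"),
    ("SDK question", "2010-04-10 14:00:00"),
]


def posts_per_thread_and_month(rows):
    """Count posts per (subject, year, month), keeping years separate."""
    db = sqlite3.connect(":memory:")
    # Assumption: only the two columns the query needs are modelled here.
    db.execute("CREATE TABLE messages (subject TEXT, first_date TEXT)")
    db.executemany("INSERT INTO messages VALUES (?, ?)", rows)
    result = db.execute(
        "SELECT subject, strftime('%Y', first_date) AS y, "
        "strftime('%m', first_date) AS m, COUNT(*) AS c "
        "FROM messages GROUP BY subject, y, m ORDER BY y, m, c"
    ).fetchall()
    db.close()
    return result
```
Calling posts_per_thread_and_month(SAMPLE) gives one row per thread and month, ordered by year, month and post count.&lt;br /&gt;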
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
==== Queries ====&lt;br /&gt;
&lt;br /&gt;
A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
&lt;br /&gt;
For the following group-by-month queries, I did a cross join of (2008,2009,2010,2011) and (01-12) to generate a &amp;quot;year and month&amp;quot; data table.&lt;br /&gt;
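&lt;br /&gt;
As a rough illustration, the years_months table and the prefix join it enables can be sketched in Python with an in-memory SQLite database. The column names timestamp_year and timestamp_month are taken from the queries on this page; the cut-down revision table and its rows are invented for the example, and SQLite's || operator replaces MySQL's concat().&lt;br /&gt;
&lt;br /&gt;
```python
import itertools
import sqlite3


def build_years_months(db, years):
    """Create years_months as the cross join of the years and months 01-12."""
    db.execute(
        "CREATE TABLE years_months (timestamp_year TEXT, timestamp_month TEXT)"
    )
    months = [format(m, "02d") for m in range(1, 13)]
    pairs = itertools.product([str(y) for y in years], months)
    db.executemany("INSERT INTO years_months VALUES (?, ?)", pairs)


def edits_per_month(db):
    """Join revisions to months on the YYYYMM prefix of rev_timestamp."""
    return db.execute(
        "SELECT mon.timestamp_year, mon.timestamp_month, COUNT(*) "
        "FROM revision AS rev, years_months AS mon "
        "WHERE rev.rev_timestamp LIKE "
        "mon.timestamp_year || mon.timestamp_month || '%' "
        "GROUP BY mon.timestamp_year, mon.timestamp_month "
        "ORDER BY mon.timestamp_year, mon.timestamp_month"
    ).fetchall()


def demo():
    """Build both tables with invented data and return edits per month."""
    db = sqlite3.connect(":memory:")
    build_years_months(db, range(2008, 2012))
    db.execute("CREATE TABLE revision (rev_id INTEGER, rev_timestamp TEXT)")
    # MediaWiki stores rev_timestamp as YYYYMMDDHHMMSS text.
    db.executemany(
        "INSERT INTO revision VALUES (?, ?)",
        [(1, "20100301120000"), (2, "20100315080000"), (3, "20110726150109")],
    )
    rows = edits_per_month(db)
    db.close()
    return rows
```
Because the join is an inner join, months without any revisions do not appear in the result.&lt;br /&gt;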
&lt;br /&gt;
'''Top editors by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year AS yyyy,&lt;br /&gt;
        mon.timestamp_month AS mm,&lt;br /&gt;
        rev_user_text AS user,&lt;br /&gt;
        COUNT(*) AS c&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months AS mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm,user&lt;br /&gt;
 HAVING c&amp;gt;5&lt;br /&gt;
 ORDER BY yyyy,mm,c desc;&lt;br /&gt;
&lt;br /&gt;
'''Number of edits by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*) AS edits&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
'''New pages per month:'''&lt;br /&gt;
Getting the number of new pages per month is a bit trickier - first we need to query $revision to get the page_ids and their date of creation, then group by date. The query is O(n²) on the number of pages, although it should be possible to make it O(n) by grouping the result of the subquery without doing in() on the list of timestamps.&lt;br /&gt;
&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*)&lt;br /&gt;
 FROM mw_revision as rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE CONCAT(CONCAT(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
   AND rev.rev_timestamp in (&lt;br /&gt;
                SELECT MIN(rev_timestamp)&lt;br /&gt;
                FROM mw_revision&lt;br /&gt;
                GROUP BY rev_page)&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
To get just the list of pages &amp;amp; timestamps (this is used as the subquery for above):&lt;br /&gt;
 SELECT rev_page as p,&lt;br /&gt;
        MIN(rev_timestamp) as t&lt;br /&gt;
 FROM mw_revision&lt;br /&gt;
 GROUP BY rev_page;&lt;br /&gt;
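&lt;br /&gt;
One possible shape for the cheaper variant is to use the pages &amp;amp; timestamps query as a derived table and group its rows by the YYYYMM prefix, with no in() test at all. The sketch below tries this from Python on an in-memory SQLite database; the cut-down revision table and the alias created are assumptions made for the example, and it has not been run against a real MediaWiki database. Unlike the years_months join, it only counts by the month of each page's first revision and needs no helper table.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3


def new_pages_per_month(revisions):
    """Count pages by the month of their first revision.

    revisions: iterable of (rev_page, rev_timestamp) pairs, where the
    timestamps are YYYYMMDDHHMMSS strings as stored by MediaWiki.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE revision (rev_page INTEGER, rev_timestamp TEXT)")
    db.executemany("INSERT INTO revision VALUES (?, ?)", revisions)
    rows = db.execute(
        "SELECT substr(created.t, 1, 4) AS yyyy, "
        "substr(created.t, 5, 2) AS mm, COUNT(*) AS new_pages "
        # Derived table: one row per page with its earliest timestamp.
        "FROM (SELECT rev_page AS p, MIN(rev_timestamp) AS t "
        "      FROM revision GROUP BY rev_page) AS created "
        "GROUP BY yyyy, mm ORDER BY yyyy, mm"
    ).fetchall()
    db.close()
    return rows
```
Each page contributes exactly one row to the derived table, so a page whose later edits fall in other months is still counted once.&lt;br /&gt;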
&lt;br /&gt;
=== IRC ===&lt;br /&gt;
&lt;br /&gt;
superseriousstats does some preliminary analysis on data it stores in its database. Its author (tommyrot) has kindly added a parser for the format of the IRC logs we use (supybot) at my request. The [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v3.sql database schema] is a little hard to work out; several key tables have fields with undescriptive names like l_01. There are some queries in [https://github.com/tommyrot/superseriousstats/blob/master/html.class.php html.class.php] which we can use to generate some reports, though.&lt;br /&gt;
 &lt;br /&gt;
* Total IRC activity (by hour)&lt;br /&gt;
 select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
   from `channel`&lt;br /&gt;
* Total active participants (+ evolution) - we may be able to get &amp;quot;number of participants per hour/day/month&amp;quot; (so you can see if it's 2 guys talking amongst themselves or a larger group) - I'll ask tommyrot what the query should look like.&lt;br /&gt;
* Top contributors (per month)&lt;br /&gt;
 select `q_lines`.`ruid`, `csnick`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as `l_total`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as `l_night`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as `l_morning`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as `l_evening`,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
    join `q_activity_by_month` on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
    join `user_status` on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
    join `user_details` on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
    where `status` != 3&lt;br /&gt;
      and `date` = '2011-02'&lt;br /&gt;
    group by `q_lines`.`ruid`&lt;br /&gt;
    order by `q_activity_by_month`.`l_total` desc, `q_lines`.`ruid` asc limit 30&lt;br /&gt;
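&lt;br /&gt;
The by-hour totals query earlier in this section repeats the same expression for each of the 24 hourly columns. If the query is built by a script, it may be easier to generate the column list. A small Python sketch, assuming the columns really are named l_00 to l_23 as in the query shown; the function name is made up for the example:&lt;br /&gt;
&lt;br /&gt;
```python
def hourly_activity_query(table="channel"):
    """Build the by-hour totals query for the l_00 .. l_23 columns."""
    columns = ", ".join(
        "sum(`l_{0:02d}`) as `l_{0:02d}`".format(hour) for hour in range(24)
    )
    return "select {0} from `{1}`".format(columns, table)
```
The resulting string can be passed to whichever MySQL client library the reporting script uses.&lt;br /&gt;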
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
We are using a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object, each Patch has an Author, and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed-off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers he has worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
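&lt;br /&gt;
To make those relationships concrete before mapping them to tables, here is a sketch of the three structures as Python dataclasses. The class names follow the description above, but every attribute name and the record_patch helper are assumptions made for illustration, not gitdm's actual code.&lt;br /&gt;
&lt;br /&gt;
```python
from dataclasses import dataclass, field


# eq=False keeps identity comparison, which avoids recursing through the
# circular Hacker / Employer / Patch references.
@dataclass(eq=False)
class Employer:
    name: str
    hackers: list = field(default_factory=list)  # Hackers who worked here


@dataclass(eq=False)
class Hacker:
    name: str
    emails: list = field(default_factory=list)      # addresses used to commit
    employers: list = field(default_factory=list)   # Employers worked for
    patches: list = field(default_factory=list)     # Patches authored
    reviewed: list = field(default_factory=list)
    reported: list = field(default_factory=list)
    signed_off: list = field(default_factory=list)
    tested: list = field(default_factory=list)


@dataclass(eq=False)
class Patch:
    commit_id: str
    author: Hacker
    employer: Employer  # who the author worked for at the time
    reviewers: list = field(default_factory=list)
    reporters: list = field(default_factory=list)
    sign_offs: list = field(default_factory=list)
    testers: list = field(default_factory=list)


def record_patch(commit_id, author, employer):
    """Create a Patch and update the back-references on both sides."""
    patch = Patch(commit_id, author, employer)
    author.patches.append(patch)
    if employer not in author.employers:
        author.employers.append(employer)
    if author not in employer.hackers:
        employer.hackers.append(author)
    return patch
```
In a database, each list would most likely become a link table (for example hacker to employer, or patch to reviewer), which is where the redundancy mentioned above can be avoided.&lt;br /&gt;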
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Not yet in scope ==&lt;br /&gt;
&lt;br /&gt;
I have not yet considered how I might get web analytics and download stats.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-07-26T15:01:09Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
=== Data sources ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, the metrics server has access to the database server for MediaWiki, Bugzilla, and Drupal directly.&lt;br /&gt;
&lt;br /&gt;
For the forum, [[#Forums | we integrate the CSV files currently being exported]], which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Mailing lists are parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We will use the resulting database directly in the dashboard. See [[#Mailing lists - MLStats]].&lt;br /&gt;
&lt;br /&gt;
We may eventually parse git activity in repositories with [http://lwn.net/Articles/290957/ gitdm], before storing the results directly in a database. For the moment, though, we extract developer information from the meego-commits mailing list.&lt;br /&gt;
&lt;br /&gt;
[[#IRC - SuperSeriousStats | IRC logs]] will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, patched it with the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers and filter meego-commits out of our analysis, and group the messages by mailing list, to give the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Thereafter, you can add the following lines to your cron job to get the previous day's logs (with $logdir set to the directory where you keep the logs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
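&lt;br /&gt;
The initial download loop asks for days 01 to 31 of every month, so it also requests dates that do not exist (2011-02-31, for example), which should simply fail. If that is a concern, the list of URLs can be generated from real calendar dates instead. A minimal Python sketch; the URL pattern is the one used above, while the function names are invented for the example:&lt;br /&gt;
&lt;br /&gt;
```python
import datetime

# URL pattern of the #meego logs; {0} is replaced by a YYYY-MM-DD date.
BASE_URL = "http://mg.pov.lt/meego-irclog/%23meego.{0}.log"


def log_urls(first_day, last_day):
    """Return one log URL per calendar day from first_day to last_day."""
    days = (last_day - first_day).days + 1
    return [
        BASE_URL.format((first_day + datetime.timedelta(days=n)).isoformat())
        for n in range(days)
    ]


def yesterday_url(today=None):
    """Return the URL of the previous day's log (today defaults to now)."""
    if today is None:
        today = datetime.date.today()
    yesterday = today - datetime.timedelta(days=1)
    return BASE_URL.format(yesterday.isoformat())
```
The URLs returned by log_urls() can be written to a file, one per line, and fetched with wget -i.&lt;br /&gt;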
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to &amp;quot;*.Y-m-d.\lo\g&amp;quot; to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse. Here is the appropriate sss.conf for #meego logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
#    This file contains all of sss' settings along with their defaults.&lt;br /&gt;
#    All values must be placed between double quotes, even empty ones.&lt;br /&gt;
#&lt;br /&gt;
#################################  Required  ###################################&lt;br /&gt;
&lt;br /&gt;
channel = &amp;quot;#meego&amp;quot;		# Name of the IRC channel.&lt;br /&gt;
timezone = &amp;quot;UTC&amp;quot;	# Timezone the logs are in. Used for time offset&lt;br /&gt;
			# calculations and conversions.&lt;br /&gt;
			# See http://php.net/manual/en/timezones.php&lt;br /&gt;
db_host = &amp;quot;host&amp;quot;		# IP address or FQDN of the MySQL server.&lt;br /&gt;
db_port = &amp;quot;3306&amp;quot;	# Port the MySQL server is listening on.&lt;br /&gt;
db_user = &amp;quot;db_user&amp;quot;	# MySQL user.&lt;br /&gt;
db_pass = &amp;quot;db_password&amp;quot;		# MySQL password.&lt;br /&gt;
db_name = &amp;quot;db_name&amp;quot;		# Name of the MySQL database used for sss.&lt;br /&gt;
parser = &amp;quot;parser_supybot&amp;quot;	# The parser to use depending on logfile format.&lt;br /&gt;
			# e.g. &amp;quot;parser_irssi&amp;quot; or &amp;quot;parser_eggdrop&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# This string contains the format of the date within a logfile filename.&lt;br /&gt;
# Examples:&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: *.Ymd&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: \#c\h\atroo\m.Ymd&lt;br /&gt;
#   filename: chatroom.log-31012003	dateformat: *.*-dmY&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.\g\z&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.*&lt;br /&gt;
# See http://php.net/date_create_from_format for more specific syntax options.&lt;br /&gt;
logfile_dateformat = &amp;quot;*.Y-m-d.\lo\g&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, by storing dates for which it has already parsed files in the table &amp;quot;parse_history&amp;quot;, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -c meego.conf -i ${logdir} -o report.html&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, put the following in a shell script, and call it from cron:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   logdir=/path/to/logfiles&lt;br /&gt;
   sssdir=/path/to/sss-4.0&lt;br /&gt;
&lt;br /&gt;
   /usr/bin/wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
   /usr/bin/php ${sssdir}/sss.php -c meego.conf -i ${logdir}/\#meego.${logdate}.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
The sss report is built from a number of queries. A number of other useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Activity by month'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select date_format(`date`, '%Y-%m') as `date`,&lt;br /&gt;
       sum(`l_total`) as `l_total`,&lt;br /&gt;
       sum(`l_night`) as `l_night`,&lt;br /&gt;
       sum(`l_morning`) as `l_morning`,&lt;br /&gt;
       sum(`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
       sum(`l_evening`) as `l_evening`&lt;br /&gt;
  from `channel`&lt;br /&gt;
  group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Activity by day over last 30 days'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select `date`, `l_total`, `l_night`, `l_morning`, `l_afternoon`, `l_evening` from `channel`&lt;br /&gt;
  where `date` &amp;gt; DATE_SUB(CURDATE(),INTERVAL 30 DAY);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
&lt;br /&gt;
Forum stats are available as a series of CSV files on [http://forum.meego.com/stats forum.meego.com], supplied monthly. We need to download the .csv files every month (just the latest ones), parse the CSV files into a database and generate a report from that.&lt;br /&gt;
&lt;br /&gt;
=== Downloading CSV files ===&lt;br /&gt;
&lt;br /&gt;
To get started and load up all of the old stats, run the following:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -nd -P &amp;lt;local dir for data&amp;gt; -r -l1 --no-parent -A.csv http://forum.meego.com/stats&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will download all CSV files to the local directory specified.&lt;br /&gt;
&lt;br /&gt;
For the monthly refresh, we use wget in a cron script as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
y=`/bin/date -d &amp;quot;1 month ago&amp;quot; +%Y`&lt;br /&gt;
m=`/bin/date -d &amp;quot;1 month ago&amp;quot; +%m`&lt;br /&gt;
&lt;br /&gt;
wget -nd -P &amp;lt;local dir for data&amp;gt; -r -l1 --no-parent -A &amp;quot;${y}${m}*.csv&amp;quot; http://forum.meego.com/stats&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That wget command line is worth explaining:&lt;br /&gt;
* -nd: Don't create the remote directory structure when downloading the files locally&lt;br /&gt;
* -P &amp;lt;directory&amp;gt;: Save the downloaded files under the directory specified&lt;br /&gt;
* -r: Recursively download&lt;br /&gt;
* -l1: Limit to 1 level of directories (combining -r and -l1 allows us to download several files at the same time)&lt;br /&gt;
* --no-parent: Never ascend to the parent directory (ignore the .. link)&lt;br /&gt;
* -A &amp;quot;${y}${m}*.csv&amp;quot;: Match filenames of the form &amp;quot;YYYYMM*.csv&amp;quot; - gives us the latest stats files only&lt;br /&gt;
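&lt;br /&gt;
The YYYYMM prefix for the previous month can also be derived from the first day of the current month, which does not depend on the day the cron job happens to run (the GNU date manual notes that relative month arithmetic can land in the wrong month near the end of a month). A minimal Python sketch, with a hypothetical function name:&lt;br /&gt;
&lt;br /&gt;
```python
import datetime


def previous_month_pattern(today):
    """Return the wget -A pattern for the month before today's month."""
    first_of_this_month = today.replace(day=1)
    # The day before the 1st is always the last day of the previous month.
    last_of_previous = first_of_this_month - datetime.timedelta(days=1)
    return last_of_previous.strftime("%Y%m") + "*.csv"
```
For a job running on 31 March 2011 this yields the pattern for February 2011, and in January it rolls back to December of the previous year.&lt;br /&gt;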
&lt;br /&gt;
=== Database schema ===&lt;br /&gt;
&lt;br /&gt;
We created 7 tables, one for each of the statistics provided by the forum.&lt;br /&gt;
&lt;br /&gt;
Here is the database schema:&lt;br /&gt;
&amp;lt;pre&amp;gt;--&lt;br /&gt;
-- Table structure for table `forum_cumulative_posts`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_cumulative_posts`;&lt;br /&gt;
CREATE TABLE `forum_cumulative_posts` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_cumulative_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_cumulative_threads`;&lt;br /&gt;
CREATE TABLE `forum_cumulative_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `threads` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_hottest_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_hottest_threads`;&lt;br /&gt;
CREATE TABLE `forum_hottest_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `title` varchar(255) DEFAULT NULL,&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_most_viewed_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_most_viewed_threads`;&lt;br /&gt;
CREATE TABLE `forum_most_viewed_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `title` varchar(255) DEFAULT NULL,&lt;br /&gt;
  `views` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_posts`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_posts`;&lt;br /&gt;
CREATE TABLE `forum_posts` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_top_posters`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_top_posters`;&lt;br /&gt;
CREATE TABLE `forum_top_posters` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `member` varchar(50) DEFAULT NULL,&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_top_thanked`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_top_thanked`;&lt;br /&gt;
CREATE TABLE `forum_top_thanked` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `member` varchar(50) DEFAULT NULL,&lt;br /&gt;
  `thanks` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We import the files into the database via &amp;lt;pre&amp;gt;LOAD DATA LOCAL INFILE&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd &amp;lt;local data directory&amp;gt;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_cumulative_posts.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_cumulative_posts.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_cumulative_posts.csv'&lt;br /&gt;
         INTO TABLE forum_cumulative_posts&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_cumulative_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_cumulative_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,threads&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_cumulative_threads.csv'&lt;br /&gt;
         INTO TABLE forum_cumulative_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, threads)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_posts.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_posts.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_posts.csv'&lt;br /&gt;
         INTO TABLE forum_posts&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_hottest_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_hottest_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,title,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_hottest_threads.csv'&lt;br /&gt;
         INTO TABLE forum_hottest_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, title, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_most_viewed_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_most_viewed_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,title,views&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_most_viewed_threads.csv'&lt;br /&gt;
         INTO TABLE forum_most_viewed_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, title, views)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_top_posters.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_top_posters.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,member,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_top_posters.csv'&lt;br /&gt;
         INTO TABLE forum_top_posters&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, member, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_top_thanked.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_top_thanked.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,member,thanks&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_top_thanked.csv'&lt;br /&gt;
         INTO TABLE forum_top_thanked&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, member, thanks)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
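&lt;br /&gt;
The seven import stanzas above differ only in the CSV name, the target table and the column list, so they could be driven from one list. This is a sketch under assumptions, not the script we actually run: the function names and the DB_USER, DB_HOST and DB_NAME variables are hypothetical, and the password is assumed to come from ~/.my.cnf rather than the command line. On MySQL 8.0 and later, where RANK is a reserved word, the rank column would also need quoting.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Sketch only: DB_USER, DB_HOST and DB_NAME are hypothetical variables,
# and the password is assumed to be read from ~/.my.cnf.

# Print the LOAD DATA statement for one CSV file.
# Arguments: year, month, CSV base name, target table, column list.
load_stmt() {
  printf "LOAD DATA LOCAL INFILE '%s%s_%s.csv' INTO TABLE %s " "$1" "$2" "$3" "$4"
  printf "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
  printf "IGNORE 1 LINES (%s) SET year=%s, month=%s\n" "$5" "$1" "$2"
}

# Import every stats file that exists for the given year and month.
# Each spec has the form "csv base name:table:columns".
import_month() {
  y="$1"; m="$2"
  for spec in \
    "forum_cumulative_posts:forum_cumulative_posts:forum, posts" \
    "forum_cumulative_threads:forum_cumulative_threads:forum, threads" \
    "forum_posts:forum_posts:forum, posts" \
    "hottest_threads:forum_hottest_threads:rank, title, posts" \
    "most_viewed_threads:forum_most_viewed_threads:rank, title, views" \
    "top_posters:forum_top_posters:rank, member, posts" \
    "top_thanked:forum_top_thanked:rank, member, thanks"
  do
    base=${spec%%:*}     # text before the first colon
    rest=${spec#*:}      # text after the first colon
    table=${rest%%:*}
    cols=${rest#*:}
    if [ -f "${y}${m}_${base}.csv" ]; then
      echo "Importing ${y}${m}_${base}.csv"
      mysql -u "$DB_USER" -h "$DB_HOST" "$DB_NAME" \
            -e "$(load_stmt "$y" "$m" "$base" "$table" "$cols")"
    fi
  done
}
```
&lt;br /&gt;
With this in place, the body of the monthly script would reduce to a single call such as import_month "$y" "$m" after changing to the data directory.&lt;br /&gt;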
&lt;br /&gt;
We run this script (which downloads and imports the .csv files from the server) monthly through cron. I am not sure at what time the files are published on the 1st, so I fetch them on the 2nd:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Retrieve forum stats monthly on 2nd of month&lt;br /&gt;
15 2 2 * * /home/dneary/bin/forum_stats.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Community_Office/Meetings</id>
		<title>Community Office/Meetings</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Community_Office/Meetings"/>
				<updated>2011-07-13T16:23:32Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Next CO meeting */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Purpose and Intent ==&lt;br /&gt;
&lt;br /&gt;
The purpose of these meetings is to provide status updates on community topics and help get people involved in the MeeGo community. This is the regular meeting of the [[Community Office]] and anyone with an interest in the MeeGo community is welcome to attend.&lt;br /&gt;
&lt;br /&gt;
A few notes:&lt;br /&gt;
* Anyone can attend these meetings or participate in the process!&lt;br /&gt;
* Anyone can add any proposed topic below.&lt;br /&gt;
* Learn more about the [[Community_Office]].&lt;br /&gt;
&lt;br /&gt;
== Meeting Logistics == &lt;br /&gt;
&lt;br /&gt;
The Community Office meetings are held every other Tuesday at 14:00 UTC. Meetings will end promptly after 1 hour.&lt;br /&gt;
&lt;br /&gt;
All CO meetings take place in the MeeGo [http://meego.com/community/irc-channel IRC channels]:&lt;br /&gt;
* Main meeting: #meego-meeting&lt;br /&gt;
* Back channel &amp;amp; other discussions (optional): #meego&lt;br /&gt;
&lt;br /&gt;
== Next CO meeting ==&lt;br /&gt;
&lt;br /&gt;
[http://www.timeanddate.com/worldclock/fixedtime.html?year=2011&amp;amp;month=7&amp;amp;day=26&amp;amp;hour=14&amp;amp;min=0&amp;amp;sec=0 July 26 at 14:00 UTC (7am Pacific)]&lt;br /&gt;
&lt;br /&gt;
'''Agenda'''&lt;br /&gt;
* Agenda and Process Review (Dawn)&lt;br /&gt;
* [[MeeGo Apps|Community Apps Update]] (Niels Breet)&lt;br /&gt;
* [[Meego IT]] (Stefano)&lt;br /&gt;
* [[Community Office/Community device program|Community Device Program]] (Randall)&lt;br /&gt;
* [[Events]] Update (Dawn)&lt;br /&gt;
* [http://developer.meego.com/ developer.meego.com updates] (Mike)&lt;br /&gt;
* [[Metrics/Dashboard | Community metrics dashboard]] (Dave Neary)&lt;br /&gt;
* Next Community Office Meeting (Dawn)&lt;br /&gt;
* All Other Business / General Questions&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* You might want to read our [[IRC Presentation Tips]] and [[IRC guidelines]].&lt;br /&gt;
&lt;br /&gt;
== Proposed Brainstorming Topics ==&lt;br /&gt;
&lt;br /&gt;
* Add your topics here.&lt;br /&gt;
&lt;br /&gt;
'''Standing Agenda Items (we'll have these updates on most weeks)''':&lt;br /&gt;
* Community Apps Update (Niels)&lt;br /&gt;
* Events Update (Amy)&lt;br /&gt;
* Community Device Program (Randall)&lt;br /&gt;
* IT (Stefano)&lt;br /&gt;
&lt;br /&gt;
== Past meetings ==&lt;br /&gt;
Click the links to get the meeting minutes. From there you can access the full logs.&lt;br /&gt;
* [http://irclogs.meego.com/meetbot/meego-meeting/2011/meego-meeting.2011-07-12-13.55.html 2011-7-12]&lt;br /&gt;
* [http://irclogs.meego.com/meetbot/meego-meeting/2011/meego-meeting.2011-06-28-13.56.html 2011-6-28]&lt;br /&gt;
* [http://irclogs.meego.com/meetbot/meego-meeting/2011/meego-meeting.2011-06-14-13.56.html 2011-6-14]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2011/meego-meeting.2011-03-01-19.58.html 2011-3-1]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2011/meego-meeting.2011-02-01-19.59.html 2011-2-1]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2011/meego-meeting.2011-01-18-14.59.html 2011-1-18]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-12-07-19.59.html 2010-12-7]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-11-02-18.59.html 2010-11-2]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-10-05-04.57.html 2010-10-5]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-09-21-13.57.html 2010-9-21]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-08-31-04.57.html 2010-8-31]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-08-17-13.58.html 2010-8-17]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-08-03-04.58.html 2010-8-3]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-07-20-13.59.html 2010-7-20]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-07-06-18.58.html 2010-7-6]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-06-15-13.59.html 2010-6-15]&lt;br /&gt;
* [http://trac.tspre.org/meetbot/meego-meeting/2010/meego-meeting.2010-05-04-18.57.html 2010-05-04] - First Community team meeting under the new regular process.&lt;br /&gt;
* [http://meego.mkdir.name/logs/meego-meeting/2010/meego-meeting.2010-02-24-20.04.html 2010-02-24] - First MeeGo community meeting, prior to the formalization of the Community WG (see [[Community website meeting 2010 2 24|wiki page]]).&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/MeeGoBugzilla_Customization</id>
		<title>MeeGoBugzilla Customization</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/MeeGoBugzilla_Customization"/>
				<updated>2011-07-11T10:11:04Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: moved MeeGoBugzilla Customization to MeeGo Bugzilla customization: Naming conventions&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[MeeGo Bugzilla customization]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/MeeGo_Bugzilla_customization</id>
		<title>MeeGo Bugzilla customization</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/MeeGo_Bugzilla_customization"/>
				<updated>2011-07-11T10:11:04Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: moved MeeGoBugzilla Customization to MeeGo Bugzilla customization: Naming conventions&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== MeeGo Bugzilla Features ==&lt;br /&gt;
&lt;br /&gt;
Configurations and customizations have been applied to MeeGo Bugzilla to support additional features for MeeGo bug tracking.&lt;br /&gt;
&lt;br /&gt;
* Support Drupal user authentication system&lt;br /&gt;
** MeeGo Bugzilla shares its user database with meego.com, so users do not need to register multiple accounts&lt;br /&gt;
* Copyright waiver flag for attachment&lt;br /&gt;
** Users are offered a copyright waiver option when submitting a patch&lt;br /&gt;
* Privilege Control for bugs' priority setting&lt;br /&gt;
* Bug vote option&lt;br /&gt;
** Users can vote for a bug; the vote count indicates how many people care about a specific bug&lt;br /&gt;
* Optimized the new bug navigation web page&lt;br /&gt;
** The classification &amp;quot;MeeGo Platform&amp;quot; is expanded to list the products under it, so that users can choose a product under this classification directly. This removes a redundant step when filing a new bug against the most commonly used classification, &amp;quot;MeeGo Platform&amp;quot;.&lt;br /&gt;
* Use customized flag to manage release blocker bugs&lt;br /&gt;
** Normal users can propose blocker bugs for a specific release&lt;br /&gt;
** Users in a privileged group can approve or reject blocker bugs&lt;br /&gt;
* Customized Bugzilla new bug template&lt;br /&gt;
* Set the default value for cloned bugs&lt;br /&gt;
* Security flag&lt;br /&gt;
** A check box lets users mark a bug as a security bug&lt;br /&gt;
* Security product&lt;br /&gt;
** Bugs in the security product are visible only to the security group, the bug reporter and users on the bug's CC list.&lt;br /&gt;
* Field name display renaming&lt;br /&gt;
** OS --&amp;gt; Architecture&lt;br /&gt;
** Hardware --&amp;gt; Profile&lt;br /&gt;
* New custom fields&lt;br /&gt;
** &amp;quot;UX Status&amp;quot;: tracks the UX design status of UX features. This field is expected to be shown only in feature products&lt;br /&gt;
** &amp;quot;Platform&amp;quot;: tracks the platform on which the bug was found. It has a 1:M mapping relationship with the Profile field&lt;br /&gt;
** &amp;quot;Update Release&amp;quot;: tracks the update release in which the bug is expected to be fixed&lt;br /&gt;
** &amp;quot;Triaged By&amp;quot;: tracks who triaged the bug&lt;br /&gt;
* Show custom fields in advanced page&lt;br /&gt;
** Show dropdown custom fields as well as the &amp;quot;security&amp;quot; checkbox on the search page&lt;br /&gt;
* Per product configurable priority change control group&lt;br /&gt;
** This is an enhanced version of the &amp;quot;Privilege Control for bugs' priority setting&amp;quot; customization&lt;br /&gt;
** Admins can configure, from the product administration page, which group may change the priority of bugs in a given product&lt;br /&gt;
* Admin configurable release blocker flag rendering&lt;br /&gt;
** This is an enhanced version of the &amp;quot;Use customized flag to manage release blocker bugs&amp;quot; customization&lt;br /&gt;
** Admins can configure which flags are displayed in the Release Blocker style. Most of the hard-coded values have been removed&lt;br /&gt;
* Known bug fixes for the MeeGo Bugzilla 3.6 upgrade&lt;br /&gt;
** A meaningless text box was shown next to the resolution field even when the resolution was not DUPLICATE&lt;br /&gt;
** The RESOLVED REJECTED feature was not shown correctly for users granted only the Feature Owners privilege&lt;br /&gt;
** Bugzilla should disallow setting a bug-specific status on a feature and, vice versa, a feature-specific status on a bug&lt;br /&gt;
* Toolkits for feature management&lt;br /&gt;
** Feature import tool to bulk load features from xls&lt;br /&gt;
** Export features' description to csv file&lt;br /&gt;
&lt;br /&gt;
[[Category:QA]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Quality/getting-started</id>
		<title>Quality/getting-started</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Quality/getting-started"/>
				<updated>2011-07-11T10:08:43Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: moved Quality/getting-started to Quality/Getting started: Naming conventions&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Quality/Getting started]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Quality/Getting_started</id>
		<title>Quality/Getting started</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Quality/Getting_started"/>
				<updated>2011-07-11T10:08:43Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: moved Quality/getting-started to Quality/Getting started: Naming conventions&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways for you to become part of the MeeGo quality team, whether or not you are a technical person.&lt;br /&gt;
&lt;br /&gt;
* '''Try our daily build'''&lt;br /&gt;
** MeeGo release engineers produce in-development daily builds, which are available [http://repo.meego.com/MeeGo/snapshots/stable/ here]. By trying these builds you get to enjoy the latest features and, as a side effect, the chance to spot new issues. Daily builds are not for everyone, as many features may not be fully functional and regressions may happen, but your feedback on them is very helpful in improving the quality of our releases. If daily builds sound too scary, you can try our [http://repo.meego.com/MeeGo/releases/ stable releases] and provide the same kind of feedback.&lt;br /&gt;
** You can give us feedback by reporting bugs at https://bugs.meego.com, or by sending email to the [http://lists.meego.com/listinfo/meego-qa meego-qa@lists.meego.com mailing list].&lt;br /&gt;
&lt;br /&gt;
* '''Help out one of our [http://wiki.meego.com/Quality#Projects quality teams for projects]'''&lt;br /&gt;
** If you have a special interest or expertise in one of our projects, you are welcome to help the corresponding quality team by executing existing test cases, developing new test cases, reporting and following up on bugs, and so on.&lt;br /&gt;
&lt;br /&gt;
* '''Participate in QA tools development activities'''&lt;br /&gt;
** The MeeGo QA tools team develops and maintains a number of [http://wiki.meego.com/Quality/QA-tools tools for quality assurance]. You are welcome to contribute to [http://wiki.meego.com/Quality/QA_tools_development QA tools development activities].&lt;br /&gt;
&lt;br /&gt;
* '''Contribute to sys-debug activities to root-cause bugs and assign them to the right package'''&lt;br /&gt;
** Sys-debug activities help get bugs fixed: many bugs, especially system-level ones, can be fixed more efficiently if they are first analyzed to identify the package at fault, so that the bug can be assigned to the right owner. The details of the sys-debug process are defined at http://wiki.meego.com/Quality/SysDebug&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T15:16:32Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* How to create a Pentaho report */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the &amp;quot;ad-hoc reporting&amp;quot; capability of the Pentaho user console. The second, more powerful method, which we concentrate on here, uses the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from Sourceforge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
# Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
# Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
# Create queries which will be made available to the report designer&lt;br /&gt;
# Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
This tutorial assumes that you have obtained and stored forum statistics in a local database, as described in [[../Gathering data#Forums]].&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. We use JNDI rather than a direct JDBC connection so that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report: we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
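&lt;br /&gt;
If you maintain several connections, the five properties can be generated rather than typed. The following is a minimal sketch; the helper name jndi_block and the placeholder credentials are hypothetical, and only the property keys come from the configuration above.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Hypothetical helper: print a simple-jndi block for one connection.
# Arguments: JNDI name, database user, password, JDBC URL.
jndi_block() {
  printf '%s/type=javax.sql.DataSource\n' "$1"
  printf '%s/driver=com.mysql.jdbc.Driver\n' "$1"
  printf '%s/user=%s\n' "$1" "$2"
  printf '%s/password=%s\n' "$1" "$3"
  printf '%s/url=%s\n' "$1" "$4"
}

# Example with placeholder credentials. To append the block to the
# PRD configuration, pipe the output through
# "tee -a ~/.pentaho/simple-jndi/default.properties".
jndi_block Forum db_user secret jdbc:mysql://localhost:3306/forum
```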
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in a single document, we must declare all the queries at the top level, and then use a sub-report for each query we want to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and build more complicated queries, which you can access by double-clicking the pencil icon above this area; this brings you to the SQL query designer. I have found this useful in the early stages of writing a query, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
     AND year=2011&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
To make the query always refer to &amp;quot;last month&amp;quot;, you need to resort to datetime functions in MySQL in the WHERE clause.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WHERE &lt;br /&gt;
month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Laying out a report ===&lt;br /&gt;
&lt;br /&gt;
By default, there are 5 visible areas on the report canvas:&lt;br /&gt;
* Page Header and Footer: These are what they usually are in documents, a short text on the top/bottom of every page&lt;br /&gt;
* Report Header and Footer: you can think of these as the introduction and conclusion of your report; they are evaluated once, when the report is generated. This is a good place to put an overview graph, some introductory text, a report summary, or perhaps a sub-report (we'll get to those later)&lt;br /&gt;
* Details: The meat and two veg of the report, the &amp;quot;details&amp;quot; section is evaluated for each row in the associated query&lt;br /&gt;
&lt;br /&gt;
On the left hand side of the window, we have a number of different types of widgets (labels, text/number/date fields, included resources, images, shapes and so on).&lt;br /&gt;
&lt;br /&gt;
You can drag and drop elements from the query (under the Data tab) or from the left of the window into the &amp;quot;Details&amp;quot; section now and preview the report, just to see what happens. For now, let's create a &amp;quot;Username&amp;quot; label, drag &amp;amp; drop the &amp;quot;member&amp;quot; field beside it, a &amp;quot;Posts&amp;quot; label, with the number of posts associated, and let's add a horizontal rule underneath. Select the horizontal rule and set its x offset to 5% and width to 90%, in the &amp;quot;Style&amp;quot; tab which is visible when the element is active. Then we can preview the report, just to make sure it's working OK, with &amp;quot;View-&amp;gt;Preview&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If all is going well, you should see something like this:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_preview.png]]&lt;br /&gt;
&lt;br /&gt;
Now we can start making the report a little prettier.&lt;br /&gt;
&lt;br /&gt;
There are a few areas in the document structure which are not visible on the canvas by default (and so we can't drop elements onto them). The important ones for us are:&lt;br /&gt;
* Group Header/Footer: You can use &amp;quot;GROUP BY&amp;quot; in your SQL queries to aggregate data corresponding to certain criteria together - this is useful for generating statistics per mailing list, per author, per month, etc. You can associate a group with a group header/footer (the outermost group will necessarily be the first field in the &amp;quot;group by&amp;quot; clause) to have the report broken up into different groups. The group header and footer appear at the top &amp;amp; bottom of each group&lt;br /&gt;
* Details Header/Footer: &amp;quot;Details header&amp;quot; is typically used to provide the header row in a results table.&lt;br /&gt;
&lt;br /&gt;
To make these visible on the canvas, enabling you to drag &amp;amp; drop elements onto them, select them in the &amp;quot;Structure&amp;quot; tab on the left of the window, and under &amp;quot;Attributes&amp;quot;, change the attribute &amp;quot;hide-on-canvas&amp;quot; to false.&lt;br /&gt;
&lt;br /&gt;
You can then cut &amp;amp; paste the &amp;quot;Username&amp;quot; and &amp;quot;Posts&amp;quot; labels, add a &amp;quot;Rank&amp;quot; label, and arrange them as a header in the &amp;quot;Details header&amp;quot; area. You can set all the labels in the header to bold by setting the &amp;quot;font&amp;quot;/&amp;quot;bold&amp;quot; attribute in the &amp;quot;Style&amp;quot; tab to &amp;quot;true&amp;quot; for the details header area (the value will be inherited by all labels in the area).&lt;br /&gt;
&lt;br /&gt;
Now set a report title using a Message field (you can include data from a query in the text of the field using $(parameter)). Add &amp;lt;pre&amp;gt;monthname(date_add(now(), INTERVAL -1 MONTH)) as monthname&amp;lt;/pre&amp;gt; to the fields in the query, and set the report title to &amp;quot;Top forum posters for $(monthname) $(year)&amp;quot;. PS, if anyone knows how I can avoid those repeated calls to &amp;lt;pre&amp;gt;date_add(now(), INTERVAL -1 MONTH)&amp;lt;/pre&amp;gt; in Pentaho, I would be delighted to know how.&lt;br /&gt;
&lt;br /&gt;
Just to pretty things up a touch more, let's set a background colour for the header, and alternate the row colours for odd &amp;amp; even rows between white &amp;amp; pale yellow. To do the even/odd banding, we're going to use a Pentaho &amp;quot;Row banding&amp;quot; function, which you'll find in the Format-&amp;gt;Row-Banding menu. For the moment, choose a colour for the &amp;quot;Visible colour&amp;quot; area, and click OK. Now in the Data tab, we can see our new row banding function. All we need to do is give it a name in that area, and drag &amp;amp; drop it to the Details area. I just called it &amp;quot;band&amp;quot;. Then, select &amp;quot;band&amp;quot; in the Structure tab, and in the &amp;quot;Style&amp;quot; tab, set the field &amp;quot;visible&amp;quot; to &amp;quot;false&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If everything has gone to plan, when you choose &amp;quot;View-&amp;gt;Preview&amp;quot;, you should see something like the following:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_table.png]]&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
As with anything graphical, the possibilities are boundless. The next steps will be to explain how to create graphics, how to integrate many queries into a single report using subreports, and finally how to deploy reports so that we can generate them on demand via the Pentaho BI server, or schedule regular static report generation for publication.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T15:14:41Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Laying out a report */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the &amp;quot;ad-hoc reporting&amp;quot; capability of the Pentaho user console. Here we concentrate on the second, more powerful method: the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from Sourceforge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
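&lt;br /&gt;
As a rough illustration of how the prefix groups the properties, the Python sketch below parses entries in this format into one dictionary per connection name. The function name parse_simple_jndi and the sample user and password values are hypothetical, and this is not how Pentaho itself reads the file.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_simple_jndi(text):
    """Group name/property=value lines by connection name (illustrative only)."""
    connections = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # The part before the slash is the symbolic JNDI name.
        name, _, prop = key.partition("/")
        if not prop:
            continue
        connections.setdefault(name, {})[prop] = value
    return connections


# Hypothetical credentials, used only to make the sample parseable.
SAMPLE = """
Forum/type=javax.sql.DataSource
Forum/driver=com.mysql.jdbc.Driver
Forum/user=report_user
Forum/password=not_a_real_password
Forum/url=jdbc:mysql://localhost:3306/forum
"""

connections = parse_simple_jndi(SAMPLE)
print(connections["Forum"]["url"])
```
&lt;br /&gt;
Every key that shares the Forum prefix ends up in the same dictionary, which is the sense in which the prefix acts as the connection name.&lt;br /&gt;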
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
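&lt;br /&gt;
The evaluation order described above can be mimicked in a few lines of Python. This is only a sketch of the idea, not PRD's rendering engine: the render function and its text output are invented for illustration, and the rows are assumed to arrive already sorted by the grouping field.&lt;br /&gt;
&lt;br /&gt;
```python
from itertools import groupby


def render(rows, group_field="month"):
    """Emit one line per report band, in the order the bands are evaluated."""
    out = ["REPORT HEADER"]                      # once per report
    for value, group in groupby(rows, key=lambda r: r[group_field]):
        out.append("GROUP HEADER %s=%s" % (group_field, value))  # once per group
        for row in group:                        # once per row of the query
            out.append("DETAIL %s %d" % (row["member"], row["posts"]))
        out.append("GROUP FOOTER")
    out.append("REPORT FOOTER")                  # once per report
    return out


rows = [
    {"month": 6, "member": "qgil", "posts": 185},
    {"month": 6, "member": "sjgadsby", "posts": 115},
]
for line in render(rows):
    print(line)
```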
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and perform more complicated queries, which you can access by double-clicking the pencil icon above this area. This brings you to the SQL query designer. I have found it useful in the early stages of writing a query, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
AND year=2011&lt;br /&gt;
ORDER BY rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
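&lt;br /&gt;
If you want to try the same query outside PRD, the Python sketch below runs it against an in-memory SQLite table. SQLite stands in for MySQL here purely for convenience, the top_posters helper is hypothetical, and the table layout is assumed from the column names in the preview above. The sketch uses ORDER BY rank so that LIMIT returns the highest-ranked rows in a predictable order.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3


def top_posters(conn, month, year, limit=20):
    """Return (month, year, rank, member, posts) rows for one month, best rank first."""
    cur = conn.execute(
        "SELECT month, year, rank, member, posts "
        "FROM forum_top_posters "
        "WHERE month = ? AND year = ? "
        "ORDER BY rank LIMIT ?",
        (month, year, limit),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
# Assumed schema: only the column names come from the preview table.
conn.execute(
    "CREATE TABLE forum_top_posters "
    "(month INTEGER, year INTEGER, rank INTEGER, member TEXT, posts INTEGER)"
)
# Three rows from the preview, deliberately inserted out of order.
conn.executemany(
    "INSERT INTO forum_top_posters VALUES (?, ?, ?, ?, ?)",
    [
        (6, 2011, 3, "shmerl", 104),
        (6, 2011, 1, "qgil", 185),
        (6, 2011, 2, "sjgadsby", 115),
    ],
)

for row in top_posters(conn, 6, 2011, limit=2):
    print(row)
```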
&lt;br /&gt;
To make the query always refer to &amp;quot;last month&amp;quot;, you need to resort to datetime functions in MySQL in the WHERE clause.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WHERE &lt;br /&gt;
month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
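&lt;br /&gt;
Both conditions are needed because of the January case: one month before January is December of the previous year. The Python sketch below reproduces the same calendar arithmetic so the rollover is easy to check; the previous_month helper is hypothetical and is not part of the report.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def previous_month(today):
    """Return (month, year) of the calendar month before the given date."""
    if today.month == 1:
        # January rolls back to December of the previous year.
        return 12, today.year - 1
    return today.month - 1, today.year


month, year = previous_month(date(2011, 7, 7))
print(month, year)
```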
&lt;br /&gt;
=== Laying out a report ===&lt;br /&gt;
&lt;br /&gt;
By default, there are 5 visible areas on the report canvas:&lt;br /&gt;
* Page Header and Footer: These are what they usually are in documents, a short text on the top/bottom of every page&lt;br /&gt;
* Report Header and Footer: you can think of these as the introduction and conclusion of your report; they are evaluated once, when the report is generated. This is a good place to put an overview graph, some introductory text, a report summary, or perhaps a sub-report (we'll get to those later)&lt;br /&gt;
* Details: The meat and two veg of the report, the &amp;quot;details&amp;quot; section is evaluated for each row in the associated query&lt;br /&gt;
&lt;br /&gt;
On the left hand side of the window, we have a number of different types of widgets (labels, text/number/date fields, included resources, images, shapes and so on).&lt;br /&gt;
&lt;br /&gt;
You can drag and drop elements from the query (under the Data tab) or from the left of the window into the &amp;quot;Details&amp;quot; section now and preview the report, just to see what happens. For now, let's create a &amp;quot;Username&amp;quot; label, drag &amp;amp; drop the &amp;quot;member&amp;quot; field beside it, a &amp;quot;Posts&amp;quot; label, with the number of posts associated, and let's add a horizontal rule underneath. Select the horizontal rule and set its x offset to 5% and width to 90%, in the &amp;quot;Style&amp;quot; tab which is visible when the element is active. Then we can preview the report, just to make sure it's working OK, with &amp;quot;View-&amp;gt;Preview&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If all is going well, you should see something like this:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_preview.png]]&lt;br /&gt;
&lt;br /&gt;
Now we can start making the report a little prettier.&lt;br /&gt;
&lt;br /&gt;
There are a few areas in the document structure which are not visible on the canvas by default (and so we can't drop elements onto them). The important ones for us are:&lt;br /&gt;
* Group Header/Footer: You can use &amp;quot;GROUP BY&amp;quot; in your SQL queries to aggregate data corresponding to certain criteria together - this is useful for generating statistics per mailing list, per author, per month, etc. You can associate a group with a group header/footer (the outermost group will necessarily be the first field in the &amp;quot;group by&amp;quot; clause) to have the report broken up into different groups. The group header and footer appear at the top &amp;amp; bottom of each group&lt;br /&gt;
* Details Header/Footer: &amp;quot;Details header&amp;quot; is typically used to provide the header row in a results table.&lt;br /&gt;
&lt;br /&gt;
To make these visible on the canvas, enabling you to drag &amp;amp; drop elements onto them, select them in the &amp;quot;Structure&amp;quot; tab on the left of the window, and under &amp;quot;Attributes&amp;quot;, change the attribute &amp;quot;hide-on-canvas&amp;quot; to false.&lt;br /&gt;
&lt;br /&gt;
You can then cut &amp;amp; paste the &amp;quot;Username&amp;quot; and &amp;quot;Posts&amp;quot; labels, add a &amp;quot;Rank&amp;quot; label, and arrange them as a header in the &amp;quot;Details header&amp;quot; area. You can set all the labels in the header to bold by setting the &amp;quot;font&amp;quot;/&amp;quot;bold&amp;quot; attribute in the &amp;quot;Style&amp;quot; tab to &amp;quot;true&amp;quot; for the details header area (the value will be inherited by all labels in the area).&lt;br /&gt;
&lt;br /&gt;
Now set a report title using a Message field (you can include data from a query in the text of the field using $(parameter)). Add &amp;lt;pre&amp;gt;monthname(date_add(now(), INTERVAL -1 MONTH)) as monthname&amp;lt;/pre&amp;gt; to the fields in the query, and set the report title to &amp;quot;Top forum posters for $(monthname) $(year)&amp;quot;. PS, if anyone knows how I can avoid those repeated calls to &amp;lt;pre&amp;gt;date_add(now(), INTERVAL -1 MONTH)&amp;lt;/pre&amp;gt; in Pentaho, I would be delighted to know how.&lt;br /&gt;
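&lt;br /&gt;
To make the substitution concrete, here is a small Python imitation of what happens to the $(name) tokens in the title. It is a sketch only: expand_message is a hypothetical helper, not Pentaho's formatter, and it handles only plain $(name) tokens with values taken from a single result row.&lt;br /&gt;
&lt;br /&gt;
```python
import re


def expand_message(pattern, row):
    """Replace each $(name) token with the value of that field in the row."""
    return re.sub(r"\$\((\w+)\)", lambda m: str(row[m.group(1)]), pattern)


title = expand_message(
    "Top forum posters for $(monthname) $(year)",
    {"monthname": "June", "year": 2011},
)
print(title)
```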
&lt;br /&gt;
Just to pretty things up a touch more, let's set a background colour for the header, and alternate the row colours for odd &amp;amp; even rows between white &amp;amp; pale yellow. To do the even/odd banding, we're going to use a Pentaho &amp;quot;Row banding&amp;quot; function, which you'll find in the Format-&amp;gt;Row-Banding menu. For the moment, choose a colour for the &amp;quot;Visible colour&amp;quot; area, and click OK. Now in the Data tab, we can see our new row banding function. All we need to do is give it a name in that area, and drag &amp;amp; drop it to the Details area. I just called it &amp;quot;band&amp;quot;. Then, select &amp;quot;band&amp;quot; in the Structure tab, and in the &amp;quot;Style&amp;quot; tab, set the field &amp;quot;visible&amp;quot; to &amp;quot;false&amp;quot;.&lt;br /&gt;
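&lt;br /&gt;
The banding itself is nothing more than alternating on the row index, as the Python sketch below shows. The band_colours helper is hypothetical, and which of the two colours the first row receives is an assumption made for the example rather than documented Pentaho behaviour.&lt;br /&gt;
&lt;br /&gt;
```python
def band_colours(row_count, colours=("white", "pale yellow")):
    """Return the background colour for each row, alternating between two colours."""
    # Assumption: the first row (index 0) gets the first colour.
    return [colours[index % 2] for index in range(row_count)]


for rank, colour in enumerate(band_colours(4), start=1):
    print(rank, colour)
```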
&lt;br /&gt;
If everything has gone to plan, when you choose &amp;quot;View-&amp;gt;Preview&amp;quot;, you should see something like the following:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_table.png]]&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
As with anything graphical, the possibilities are boundless. The next steps will be to explain how to create graphics, how to integrate many queries into a single report using subreports, and finally how to deploy reports so that we can generate them on demand via the Pentaho BI server, or schedule regular static report generation for publication.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T15:12:08Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Laying out a report */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the &amp;quot;ad-hoc reporting&amp;quot; capability of the Pentaho user console. Here we concentrate on the second, more powerful method: the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from Sourceforge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and perform more complicated queries, which you can access by double-clicking the pencil icon above this area. This brings you to the SQL query designer. I have found it useful in the early stages of writing a query, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
AND year=2011&lt;br /&gt;
ORDER BY rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
To make the query always refer to &amp;quot;last month&amp;quot;, you need to resort to datetime functions in MySQL in the WHERE clause.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WHERE &lt;br /&gt;
month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Laying out a report ===&lt;br /&gt;
&lt;br /&gt;
By default, there are 5 visible areas on the report canvas:&lt;br /&gt;
* Page Header and Footer: These are what they usually are in documents, a short text on the top/bottom of every page&lt;br /&gt;
* Report Header and Footer: you can think of these as the introduction and conclusion of your report; they are evaluated once, when the report is generated. This is a good place to put an overview graph, some introductory text, a report summary, or perhaps a sub-report (we'll get to those later)&lt;br /&gt;
* Details: The meat and two veg of the report, the &amp;quot;details&amp;quot; section is evaluated for each row in the associated query&lt;br /&gt;
&lt;br /&gt;
On the left hand side of the window, we have a number of different types of widgets (labels, text/number/date fields, included resources, images, shapes and so on).&lt;br /&gt;
&lt;br /&gt;
You can drag and drop elements from the query (under the Data tab) or from the left of the window into the &amp;quot;Details&amp;quot; section now and preview the report, just to see what happens. For now, let's create a &amp;quot;Username&amp;quot; label, drag &amp;amp; drop the &amp;quot;member&amp;quot; field beside it, a &amp;quot;Posts&amp;quot; label, with the number of posts associated, and let's add a horizontal rule underneath. Select the horizontal rule and set its x offset to 5% and width to 90%, in the &amp;quot;Style&amp;quot; tab which is visible when the element is active. Then we can preview the report, just to make sure it's working OK, with &amp;quot;View-&amp;gt;Preview&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If all is going well, you should see something like this:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_preview.png]]&lt;br /&gt;
&lt;br /&gt;
Now we can start making the report a little prettier.&lt;br /&gt;
&lt;br /&gt;
There are a few areas in the document structure which are not visible on the canvas by default (and so we can't drop elements onto them). The important ones for us are:&lt;br /&gt;
* Group Header/Footer: You can use &amp;quot;GROUP BY&amp;quot; in your SQL queries to aggregate data corresponding to certain criteria together - this is useful for generating statistics per mailing list, per author, per month, etc. You can associate a group with a group header/footer (the outermost group will necessarily be the first field in the &amp;quot;group by&amp;quot; clause) to have the report broken up into different groups. The group header and footer appear at the top &amp;amp; bottom of each group&lt;br /&gt;
* Details Header/Footer: &amp;quot;Details header&amp;quot; is typically used to provide the header row in a results table.&lt;br /&gt;
&lt;br /&gt;
To make these visible on the canvas, enabling you to drag &amp; drop elements onto them, select them in the &amp;quot;Structure&amp;quot; tab (in the pane on the right-hand side of the window), and under &amp;quot;Attributes&amp;quot;, change the attribute &amp;quot;hide-on-canvas&amp;quot; to false.&lt;br /&gt;
&lt;br /&gt;
You can then cut &amp;amp; paste the &amp;quot;Username&amp;quot; and &amp;quot;Posts&amp;quot; labels, add a &amp;quot;Rank&amp;quot; label, and arrange them as a header in the &amp;quot;Details header&amp;quot; area. You can set all the labels in the header to bold by setting the &amp;quot;font&amp;quot;/&amp;quot;bold&amp;quot; attribute in the &amp;quot;Style&amp;quot; tab to &amp;quot;true&amp;quot; for the details header area (the value will be inherited by all labels in the area).&lt;br /&gt;
&lt;br /&gt;
Now set a report title using a Message field (you can include data from a query in the text of the field using $(parameter)). Add &amp;lt;pre&amp;gt;monthname(date_add(now(), INTERVAL -1 MONTH)) as monthname&amp;lt;/pre&amp;gt; to the fields in the query, and set the report title to &amp;quot;Top forum posters for $(monthname) $(year)&amp;quot;. PS, if anyone knows how I can avoid those repeated calls to &amp;lt;pre&amp;gt;date_add(now(), INTERVAL -1 MONTH)&amp;lt;/pre&amp;gt; in Pentaho, I would be delighted to know how.&lt;br /&gt;
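&lt;br /&gt;
One possible way to reduce the repetition, on the SQL side rather than in Pentaho itself, is to compute &amp;quot;last month&amp;quot; once in a derived table and join against it. This is only a sketch: it assumes the forum_top_posters table and columns used in the query earlier on this page, and the aliases t, lm and d are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- lm.d holds the date one month ago, computed a single time&lt;br /&gt;
SELECT&lt;br /&gt;
     t.month,&lt;br /&gt;
     t.year,&lt;br /&gt;
     t.rank,&lt;br /&gt;
     t.member,&lt;br /&gt;
     t.posts,&lt;br /&gt;
     monthname(lm.d) AS monthname&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters t&lt;br /&gt;
     CROSS JOIN (SELECT date_add(now(), INTERVAL -1 MONTH) AS d) lm&lt;br /&gt;
WHERE&lt;br /&gt;
     t.month = month(lm.d)&lt;br /&gt;
     AND t.year = year(lm.d)&lt;br /&gt;
ORDER BY&lt;br /&gt;
     t.rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The report title can still use $(monthname) and $(year), since both remain columns of the result.&lt;br /&gt;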
&lt;br /&gt;
Just to pretty things up a touch more, let's set a background colour for the header, and alternate the row colours for odd &amp;amp; even rows between white &amp;amp; pale yellow. To do the even/odd banding, we're going to use a Pentaho &amp;quot;Row banding&amp;quot; function, which you'll find in the Format-&amp;gt;Row-Banding menu. For the moment, choose a colour for the &amp;quot;Visible colour&amp;quot; area, and click OK. Now in the Data tab, we can see our new row banding function. All we need to do is give it a name in that area, and drag &amp;amp; drop it to the Details area. I just called it &amp;quot;band&amp;quot;. Then, select &amp;quot;band&amp;quot; in the Structure tab, and in the &amp;quot;Style&amp;quot; tab, set the field &amp;quot;visible&amp;quot; to &amp;quot;false&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If everything has gone to plan, when you choose &amp;quot;View-&amp;gt;Preview&amp;quot;, you should see something like the following:&lt;br /&gt;
&lt;br /&gt;
[[Image:Report_table.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:Report_table.png</id>
		<title>File:Report table.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:Report_table.png"/>
				<updated>2011-07-07T15:11:31Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:Report_preview.png</id>
		<title>File:Report preview.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:Report_preview.png"/>
				<updated>2011-07-07T15:11:11Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T15:10:33Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Laying out a report */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the Pentaho user console &amp;quot;ad-hoc reporting&amp;quot; capability. The second, more powerful method, which we will concentrate on, uses the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from SourceForge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
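&lt;br /&gt;
As a rough sketch of that troubleshooting, the statements below could be run from the MySQL command line client after logging in as the database user from the connection settings. The &amp;quot;forum&amp;quot; database and the forum_top_posters table are the ones used elsewhere on this page; that the table lives in that database, and which privileges your user actually needs, are assumptions to check against your own setup.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show the privileges granted to the account you are logged in as&lt;br /&gt;
SHOW GRANTS FOR CURRENT_USER();&lt;br /&gt;
&lt;br /&gt;
-- Check that the account can actually read from the forum database&lt;br /&gt;
USE forum;&lt;br /&gt;
SELECT COUNT(*) FROM forum_top_posters;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If either statement fails with an access error, the problem is likely in the database permissions rather than in the PAC configuration.&lt;br /&gt;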
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and build more complicated queries, which you can access by double-clicking the pencil icon above this area; this brings you to the SQL query designer. I have found this useful in the early stages of writing a query, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
     AND year=2011&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
To make the query always refer to &amp;quot;last month&amp;quot;, you need to resort to datetime functions in MySQL in the WHERE clause.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WHERE &lt;br /&gt;
month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
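&lt;br /&gt;
Putting the pieces together, the complete &amp;quot;last month&amp;quot; query might look like the sketch below. It assumes the same forum_top_posters table and columns as above, and adds an ORDER BY so that LIMIT 20 keeps the 20 best-ranked rows rather than an arbitrary 20. Note that in MySQL 8.0 and later, rank is a reserved word and would need to be quoted with backticks.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     -- date_add with a negative interval steps back one month from today&lt;br /&gt;
     month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
     AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Comparing the year as well as the month matters in January, when &amp;quot;last month&amp;quot; falls in the previous year.&lt;br /&gt;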
&lt;br /&gt;
=== Laying out a report ===&lt;br /&gt;
&lt;br /&gt;
By default, there are 5 visible areas on the report canvas:&lt;br /&gt;
* Page Header and Footer: These are what they usually are in documents, a short text on the top/bottom of every page&lt;br /&gt;
* Report Header and Footer: you can think of these as the introduction and conclusion of your report; they are evaluated once per report generation. This is a good place to put an overview graph, some introductory text, a report summary, or perhaps a sub-report (we'll get to those later)&lt;br /&gt;
* Details: The meat and two veg of the report, the &amp;quot;details&amp;quot; section is evaluated for each row in the associated query&lt;br /&gt;
&lt;br /&gt;
On the left-hand side of the window, we have a number of different types of widgets (labels, text/number/date fields, included resources, images, shapes and so on).&lt;br /&gt;
&lt;br /&gt;
You can drag and drop elements from the query (under the Data tab) or from the left of the window into the &amp;quot;Details&amp;quot; section now and preview the report, just to see what happens. For now, let's create a &amp;quot;Username&amp;quot; label, drag &amp;amp; drop the &amp;quot;member&amp;quot; field beside it, a &amp;quot;Posts&amp;quot; label, with the number of posts associated, and let's add a horizontal rule underneath. Select the horizontal rule and set its x offset to 5% and width to 90%, in the &amp;quot;Style&amp;quot; tab which is visible when the element is active. Then we can preview the report, just to make sure it's working OK, with &amp;quot;View-&amp;gt;Preview&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If all is going well, you should see something like this:&lt;br /&gt;
[[Image:Report_preview.png]]&lt;br /&gt;
&lt;br /&gt;
Now we can start making the report a little prettier.&lt;br /&gt;
&lt;br /&gt;
There are a few areas in the document structure which are not visible on the canvas by default (and so we can't drop elements onto them). The important ones for us are:&lt;br /&gt;
* Group Header/Footer: You can use &amp;quot;GROUP BY&amp;quot; in your SQL queries to aggregate data corresponding to certain criteria together - this is useful for generating statistics per mailing list, per author, per month, etc. You can associate a group with a group header/footer (the outermost group will necessarily be the first field in the &amp;quot;group by&amp;quot; clause) to have the report broken up into different groups. The group header and footer appear at the top &amp;amp; bottom of each group&lt;br /&gt;
* Details Header/Footer: &amp;quot;Details header&amp;quot; is typically used to provide the header row in a results table.&lt;br /&gt;
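&lt;br /&gt;
As an illustration of the kind of aggregated query a group could be built on, the sketch below totals posts per month. It assumes only the forum_top_posters table and columns used earlier on this page, so the totals cover just the posters that table holds; total_posts is a made-up alias. The ORDER BY is there on the assumption that the report should meet each group's rows together and in a predictable order.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One result row per (year, month), with the posts summed within each&lt;br /&gt;
SELECT&lt;br /&gt;
     year,&lt;br /&gt;
     month,&lt;br /&gt;
     SUM(posts) AS total_posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
GROUP BY&lt;br /&gt;
     year,&lt;br /&gt;
     month&lt;br /&gt;
ORDER BY&lt;br /&gt;
     year,&lt;br /&gt;
     month&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;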
&lt;br /&gt;
To make these visible on the canvas, enabling you to drag &amp; drop elements onto them, select them in the &amp;quot;Structure&amp;quot; tab (in the pane on the right-hand side of the window), and under &amp;quot;Attributes&amp;quot;, change the attribute &amp;quot;hide-on-canvas&amp;quot; to false.&lt;br /&gt;
&lt;br /&gt;
You can then cut &amp;amp; paste the &amp;quot;Username&amp;quot; and &amp;quot;Posts&amp;quot; labels, add a &amp;quot;Rank&amp;quot; label, and arrange them as a header in the &amp;quot;Details header&amp;quot; area. You can set all the labels in the header to bold by setting the &amp;quot;font&amp;quot;/&amp;quot;bold&amp;quot; attribute in the &amp;quot;Style&amp;quot; tab to &amp;quot;true&amp;quot; for the details header area (the value will be inherited by all labels in the area).&lt;br /&gt;
&lt;br /&gt;
Now set a report title using a Message field (you can include data from a query in the text of the field using $(parameter)). Add &amp;lt;pre&amp;gt;monthname(date_add(now(), INTERVAL -1 MONTH)) as monthname&amp;lt;/pre&amp;gt; to the fields in the query, and set the report title to &amp;quot;Top forum posters for $(monthname) $(year)&amp;quot;. PS, if anyone knows how I can avoid those repeated calls to &amp;lt;pre&amp;gt;date_add(now(), INTERVAL -1 MONTH)&amp;lt;/pre&amp;gt; in Pentaho, I would be delighted to know how.&lt;br /&gt;
&lt;br /&gt;
Just to pretty things up a touch more, let's set a background colour for the header, and alternate the row colours for odd &amp;amp; even rows between white &amp;amp; pale yellow. To do the even/odd banding, we're going to use a Pentaho &amp;quot;Row banding&amp;quot; function, which you'll find in the Format-&amp;gt;Row-Banding menu. For the moment, choose a colour for the &amp;quot;Visible colour&amp;quot; area, and click OK. Now in the Data tab, we can see our new row banding function. All we need to do is give it a name in that area, and drag &amp;amp; drop it to the Details area. I just called it &amp;quot;band&amp;quot;. Then, select &amp;quot;band&amp;quot; in the Structure tab, and in the &amp;quot;Style&amp;quot; tab, set the field &amp;quot;visible&amp;quot; to &amp;quot;false&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
If everything has gone to plan, when you choose &amp;quot;View-&amp;gt;Preview&amp;quot;, you should see something like this:&lt;br /&gt;
[[Image:Report_table.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T14:20:30Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Creating and testing queries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the Pentaho user console &amp;quot;ad-hoc reporting&amp;quot; capability. The second, more powerful method, which we will concentrate on, uses the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from SourceForge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and build more complicated queries, which you can access by double-clicking the pencil icon above this area; this brings you to the SQL query designer. I have found this useful in the early stages of writing a query, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
     AND year=2011&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
To make the query always refer to &amp;quot;last month&amp;quot;, you need to resort to datetime functions in MySQL in the WHERE clause.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WHERE &lt;br /&gt;
month = month(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
AND year = year(date_add(now(), INTERVAL -1 MONTH))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Laying out a report ===&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T13:34:01Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Creating and testing queries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the Pentaho user console &amp;quot;ad-hoc reporting&amp;quot; capability. The second, more powerful method, which we will concentrate on, uses the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/  from SourceForge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and perform more complicated queries, which you can access by double-clicking the pencil icon above this area. This brings you to the SQL query designer. I have found this useful in the early stages of writing queries, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
     AND year=2011&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot; cellpadding=&amp;quot;5&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
=== Laying out a report ===&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T13:31:58Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Creating and testing queries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the Pentaho user console &amp;quot;ad-hoc reporting&amp;quot; capability. We will concentrate on the second, more powerful method: the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/ from Sourceforge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
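&lt;br /&gt;
As an illustration of that naming scheme, here is a minimal Python sketch that renders the five entries for any symbolic name. The helper name jndi_entries and the example credentials are hypothetical and are not part of Pentaho; only the key names (type, driver, user, password, url) come from the configuration above.&lt;br /&gt;
&lt;br /&gt;
```python
def jndi_entries(name, driver, user, password, url):
    """Return simple-jndi property lines for one named connection.

    Hypothetical helper for illustration; the key names mirror the
    default.properties example in this section.
    """
    values = [
        ("type", "javax.sql.DataSource"),
        ("driver", driver),
        ("user", user),
        ("password", password),
        ("url", url),
    ]
    # Every key is prefixed with the symbolic JNDI name and a slash.
    return ["{0}/{1}={2}".format(name, key, value) for key, value in values]


if __name__ == "__main__":
    # Placeholder credentials, assumed for the example only.
    for line in jndi_entries(
        "Forum",
        "com.mysql.jdbc.Driver",
        "example_user",
        "example_password",
        "jdbc:mysql://localhost:3306/forum",
    ):
        print(line)
```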
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
Once the connection is correctly configured, you can create and edit queries. Click the &amp;quot;+&amp;quot; icon above the &amp;quot;Available queries&amp;quot; frame, type in a name, and you can start typing directly into the Query text entry area.&lt;br /&gt;
&lt;br /&gt;
There is a graphical tool to help you join tables and perform more complicated queries, which you can access by double-clicking the pencil icon above this area. This brings you to the SQL query designer. I have found this useful in the early stages of writing queries, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.&lt;br /&gt;
&lt;br /&gt;
For now, let's create a query to list the top 20 posters for last month. Drag the &amp;quot;forum_top_posters&amp;quot; table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with &amp;quot;where month=6 and year=2011&amp;quot; as the condition.&lt;br /&gt;
&lt;br /&gt;
The final query is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     month,&lt;br /&gt;
     year,&lt;br /&gt;
     rank,&lt;br /&gt;
     member,&lt;br /&gt;
     posts&lt;br /&gt;
FROM&lt;br /&gt;
     forum_top_posters&lt;br /&gt;
WHERE&lt;br /&gt;
     month=6&lt;br /&gt;
     AND year=2011&lt;br /&gt;
ORDER BY&lt;br /&gt;
     rank&lt;br /&gt;
LIMIT 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and when you hit &amp;quot;Preview&amp;quot;, you should get a data table like this:&lt;br /&gt;
{|style=&amp;quot;border-collapse: separate; border-spacing: 0; border-width: 1px; border-style: solid; border-color: #000; padding: 0&amp;quot;&lt;br /&gt;
!month!!year!!rank!!member!!posts&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	1||	qgil||	185&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	2||	sjgadsby||	115&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	3||	shmerl||	104&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	4||	texrat||	77&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	5||	jaffa||	60&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	6||	swift11||	52&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	7||	jonay||	42&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	8||	mja||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	9||	vitna||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	10||	timoph||	35&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	11||	hartti||	29&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	12||	lm||	27&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	13||	rubear2009||	26&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	14||	andre||	25&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	15||	profebral||	23&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	16||	mikecomputing||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	17||	ritratt||	21&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	18||	helex||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	19||	cckwes||	19&lt;br /&gt;
|-&lt;br /&gt;
|6||	2011||	20||	pycage||	19&lt;br /&gt;
|}&lt;br /&gt;
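&lt;br /&gt;
The same query can be sanity-checked outside PRD. The sketch below is only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module and an in-memory table instead of the MySQL forum database, loads three rows from the table above plus one hypothetical row for another month, and sorts on rank so that the row order is deterministic.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

QUERY = """
SELECT month, year, rank, member, posts
FROM forum_top_posters
WHERE month = ? AND year = ?
ORDER BY rank
LIMIT 20
"""


def top_posters(conn, month, year):
    """Return up to 20 (month, year, rank, member, posts) rows."""
    return conn.execute(QUERY, (month, year)).fetchall()


def demo_connection():
    """Build an in-memory stand-in for the forum_top_posters table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE forum_top_posters "
        "(month INTEGER, year INTEGER, rank INTEGER, member TEXT, posts INTEGER)"
    )
    rows = [
        (6, 2011, 2, "sjgadsby", 115),
        (6, 2011, 1, "qgil", 185),
        (6, 2011, 3, "shmerl", 104),
        (5, 2011, 1, "example_member", 10),  # hypothetical row, other month
    ]
    conn.executemany("INSERT INTO forum_top_posters VALUES (?, ?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for row in top_posters(demo_connection(), 6, 2011):
        print(row)
```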
&lt;br /&gt;
Hit OK, and now your query is selected for use with your report.&lt;br /&gt;
&lt;br /&gt;
=== Laying out a report ===&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:PRD_screenshot.png</id>
		<title>File:PRD screenshot.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:PRD_screenshot.png"/>
				<updated>2011-07-07T11:05:11Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:Configure_JNDI.png</id>
		<title>File:Configure JNDI.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:Configure_JNDI.png"/>
				<updated>2011-07-07T11:04:58Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Creating_a_report</id>
		<title>Metrics/Creating a report</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Creating_a_report"/>
				<updated>2011-07-07T11:04:31Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Start documenting the creation of Pentaho reports.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== How to create a Pentaho report ==&lt;br /&gt;
&lt;br /&gt;
There are two ways to create a Pentaho report. The first creates simple tabular reports using the Pentaho user console &amp;quot;ad-hoc reporting&amp;quot; capability. We will concentrate on the second, more powerful method: the [http://wiki.pentaho.com/display/Reporting/Report+Designer Pentaho Report Designer] (PRD), available [http://sourceforge.net/projects/pentaho/files/Report%20Designer/ from Sourceforge].&lt;br /&gt;
&lt;br /&gt;
There are four basic steps to creating a report in PRD: &lt;br /&gt;
1. Make a database connection available as a JNDI source for the Pentaho platform&lt;br /&gt;
2. Duplicate the JNDI configuration locally for PRD (since the designer will not have direct access to the Pentaho database)&lt;br /&gt;
3. Create queries which will be made available to the report designer&lt;br /&gt;
4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)&lt;br /&gt;
&lt;br /&gt;
=== Configuring a new JNDI connection ===&lt;br /&gt;
&lt;br /&gt;
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_HOME/administration_console) with &amp;quot;start_pac.sh&amp;quot;. The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report; we simply use the same symbolic name for the JNDI source.&lt;br /&gt;
&lt;br /&gt;
In Pentaho Admin Console (PAC), click &amp;quot;Administration&amp;quot;, then &amp;quot;Database connections&amp;quot;, and create a new connection. For the forum database, for example, the local parameters here were:&lt;br /&gt;
&lt;br /&gt;
* Name: Forum&lt;br /&gt;
* Driver class: com.mysql.jdbc.Driver&lt;br /&gt;
* User: &amp;lt;db_user&amp;gt;&lt;br /&gt;
* Password: &amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
* URL: jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&lt;br /&gt;
Then click &amp;quot;Test&amp;quot;, and if everything is fine, you should see &amp;quot;Connection test successful&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Potential pitfalls are if you are using c3p0 libraries to do connection pooling [http://wiki.pentaho.com/display/ServerDoc2x/Configuring+for+MySQL as described in the Pentaho wiki], but [http://forums.pentaho.com/archive/index.php/t-78383.html have not included c3p0.jar on the classpath for PAC], or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.&lt;br /&gt;
&lt;br /&gt;
=== Configuring JNDI for PRD ===&lt;br /&gt;
&lt;br /&gt;
Before you can use a &amp;quot;pure&amp;quot; JNDI connection in PRD, you need to declare it locally to make it available.&lt;br /&gt;
&lt;br /&gt;
Open the file &amp;lt;pre&amp;gt;~/.pentaho/simple-jndi/default.properties&amp;lt;/pre&amp;gt; and add this to the bottom:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Forum/type=javax.sql.DataSource&lt;br /&gt;
Forum/driver=com.mysql.jdbc.Driver&lt;br /&gt;
Forum/user=&amp;lt;db_user&amp;gt;&lt;br /&gt;
Forum/password=&amp;lt;sure_ill_tell_you&amp;gt;&lt;br /&gt;
Forum/url=jdbc:mysql://localhost:3306/forum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix &amp;quot;Forum/&amp;quot; represents the symbolic name of the JNDI connection.&lt;br /&gt;
&lt;br /&gt;
=== Starting Pentaho Report Designer ===&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).&lt;br /&gt;
&lt;br /&gt;
[[Image:PRD_screenshot.png]]&lt;br /&gt;
&lt;br /&gt;
Create a new report with File-&amp;gt;New, and you will see on the right-hand side of the screen a pane with two tabs, &amp;quot;Data&amp;quot; and &amp;quot;Structure&amp;quot;. PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.&lt;br /&gt;
&lt;br /&gt;
To make a report do anything useful, we must associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.&lt;br /&gt;
&lt;br /&gt;
=== Creating and testing queries ===&lt;br /&gt;
&lt;br /&gt;
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on &amp;quot;Data sets&amp;quot;. For a database source, we use the JDBC source type.&lt;br /&gt;
&lt;br /&gt;
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose &amp;quot;JNDI&amp;quot; as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).&lt;br /&gt;
&lt;br /&gt;
[[Image:Configure_JNDI.png]]&lt;br /&gt;
&lt;br /&gt;
Test the connection with the &amp;quot;Test&amp;quot; button, then click &amp;quot;OK&amp;quot; to finalise things and make the connection available for queries.&lt;br /&gt;
&lt;br /&gt;
To be continued...&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-07-07T10:15:49Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Architecture */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly - depending on the metric, that could be in real time or automatically at regular intervals.&lt;br /&gt;
&lt;br /&gt;
The dashboard will track the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports for usage and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]].&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using some sort of Business Intelligence platform including ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Candidate reporting solutions:&lt;br /&gt;
&lt;br /&gt;
* [http://jasperforge.org/index.php?q=project/jasperreports JasperReports]&lt;br /&gt;
* [http://www.pentaho.com/ Pentaho]&lt;br /&gt;
&lt;br /&gt;
The following are essentially ETL engines, and do not provide reporting or dashboard functionality:&lt;br /&gt;
&lt;br /&gt;
* [http://www.talend.com/index.php Talend]&lt;br /&gt;
* [http://petals.ow2.org/ Petals]&lt;br /&gt;
&lt;br /&gt;
[http://www.mulesoft.com/ MuleSoft] is an open source ESB, but does not seem adapted to our needs. The field is thus narrowed to Pentaho and JasperReports.&lt;br /&gt;
&lt;br /&gt;
For each community resource, we need to figure out how to get the data into a usable form, and come up with appropriate queries for metrics reports, and finally present the results on a webpage.&lt;br /&gt;
&lt;br /&gt;
=== Business intelligence engines ===&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from it&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well understood format, and loading it in your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
So, in short, the community dashboard project will likely use an ETL to plug data into an OLAP server, and then use a business reporting engine to query that data and present it in a dashboard.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, because it's both standard &amp;amp; well understood and to allow consolidation of databases under one DB server, I prefer to use MySQL. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security related or confidential information (mainly bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
* [[../Creating a report]]: Given data in a database, how do we generate a report in Pentaho, and deploy it to the dashboard?&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, this implies that the server where the dashboard will run should have access to the database server for MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we will integrate the CSV files currently being exported, which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Individual mailing lists will be parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We will use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
Git repositories will be queried with &amp;quot;git log&amp;quot;, and parsed with the parser module from [http://lwn.net/Articles/290957/ gitdm], before being stored directly in a database. We will be able to run analytics on the results from there. gitdm can also do basic analytics of git logs, and we may decide to simply reuse gitdm's analytics. However, if we want to extend them, we will want to have the raw data.&lt;br /&gt;
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data to report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject, year(first_date) as y, monthname(first_date), count(*) as c from messages group by subject, y, month(first_date), monthname(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address, year(m.first_date) as y, monthname(m.first_date), count(*) as c from messages as m, messages_people as p where m.message_id=p.message_id group by p.email_address, y, month(m.first_date), monthname(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give month-by-month totals; the year is part of the grouping so that the same month in different years is counted separately.&lt;br /&gt;
*: The easiest way is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
==== Queries ====&lt;br /&gt;
&lt;br /&gt;
A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
&lt;br /&gt;
For the following group-by-month queries, I did a cross join of (2008,2009,2010,2011) and (01-12) to generate a &amp;quot;year and month&amp;quot; data table.&lt;br /&gt;
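&lt;br /&gt;
As a rough sketch, the helper table can be generated with a few lines of shell rather than by hand. The table and column names (years_months, timestamp_year, timestamp_month) come from the queries on this page; the function name gen_years_months and the CHAR column types are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch: print SQL that builds the year/month helper table.
# Table and column names follow the queries on this page; the CHAR
# column types are an assumption.
gen_years_months() {
  echo "CREATE TABLE IF NOT EXISTS years_months (timestamp_year CHAR(4) NOT NULL, timestamp_month CHAR(2) NOT NULL, PRIMARY KEY (timestamp_year, timestamp_month));"
  # One row per (year, month) pair: the cross join described above.
  for y in "$@"; do
    for m in 01 02 03 04 05 06 07 08 09 10 11 12; do
      echo "INSERT INTO years_months (timestamp_year, timestamp_month) VALUES ('$y', '$m');"
    done
  done
}

gen_years_months 2008 2009 2010 2011
```
&lt;br /&gt;
The output can be piped into the mysql client for the metrics database. Months are zero-padded so that the concatenated year and month match the start of a MediaWiki timestamp.&lt;br /&gt;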
&lt;br /&gt;
'''Top editors by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year AS yyyy,&lt;br /&gt;
        mon.timestamp_month AS mm,&lt;br /&gt;
        rev_user_text AS user,&lt;br /&gt;
        COUNT(*) AS c&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months AS mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm,user&lt;br /&gt;
 HAVING c&amp;gt;5&lt;br /&gt;
 ORDER BY yyyy,mm,c desc;&lt;br /&gt;
&lt;br /&gt;
'''Number of edits by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*) AS edits&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
'''New pages per month:'''&lt;br /&gt;
Getting the number of new pages per month is a bit trickier: first we need to query $revision to get the page_ids and their date of creation, then group by date. The query is O(n²) on the number of pages, although it should be possible to make it O(n) by grouping the result of the subquery without doing in() on the list of timestamps. The queries below write the table name out as mw_revision, that is, $revision with the prefix mw_.&lt;br /&gt;
&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*)&lt;br /&gt;
 FROM mw_revision as rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE CONCAT(CONCAT(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
   AND rev.rev_timestamp in (&lt;br /&gt;
                SELECT MIN(rev_timestamp)&lt;br /&gt;
                FROM mw_revision&lt;br /&gt;
                GROUP BY rev_page)&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
To get just the list of pages &amp;amp; timestamps (this is used as the subquery for above):&lt;br /&gt;
 SELECT rev_page as p,&lt;br /&gt;
        MIN(rev_timestamp) as t&lt;br /&gt;
 FROM mw_revision&lt;br /&gt;
 GROUP BY rev_page;&lt;br /&gt;
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
We use a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object, each Patch has an Author, and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed-off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers he has worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
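&lt;br /&gt;
Independently of gitdm, the commits-per-month figure can be approximated directly from git log. The following is only a sketch: the function name commits_per_month is made up for illustration, and it counts whatever commits git log lists, without any of gitdm's employer mapping.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch: count commits per month from dates in YYYY-MM-DD form, one per
# line, as produced by: git log --pretty=format:%ad --date=short
commits_per_month() {
  # Keep the YYYY-MM part, then count identical months.
  cut -c1-7 | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example with literal dates; in practice pipe the git log output in instead.
printf '%s\n' 2011-01-03 2011-01-19 2011-02-07 | commits_per_month
```
&lt;br /&gt;
Each output line holds a month and the number of commits dated in that month, which is enough for a first month-over-month graph.&lt;br /&gt;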
&lt;br /&gt;
=== IRC ===&lt;br /&gt;
&lt;br /&gt;
superseriousstats does some preliminary analysis on data it stores in its database. Its author (tommyrot) has kindly added a parser for the format of the IRC logs we use (supybot) at my request. The [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v3.sql database schema] is a little hard to work out; several key tables have fields with undescriptive names like l_01. There are some queries in [https://github.com/tommyrot/superseriousstats/blob/master/html.class.php html.class.php] which we can use to generate some reports, though.&lt;br /&gt;
 &lt;br /&gt;
* Total IRC activity (by hour)&lt;br /&gt;
 select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
   from `channel`&lt;br /&gt;
* Total active participants (+ evolution) - we may be able to get &amp;quot;number of participants per hour/day/month&amp;quot; (so you can see if it's 2 guys talking amongst themselves or a larger group) - I'll ask tommyrot what the query should look like.&lt;br /&gt;
* Top contributors (per month)&lt;br /&gt;
 select `q_lines`.`ruid`, `csnick`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as `l_total`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as `l_night`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as `l_morning`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as `l_evening`,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
    join `q_activity_by_month` on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
    join `user_status` on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
    join `user_details` on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
    where `status` != 3&lt;br /&gt;
      and `date` = '2011-02'&lt;br /&gt;
    group by `q_lines`.`ruid`&lt;br /&gt;
    order by `q_activity_by_month`.`l_total` desc, `q_lines`.`ruid` asc limit 30&lt;br /&gt;
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Not yet in scope ==&lt;br /&gt;
&lt;br /&gt;
I have not yet considered how I might get web analytics and download stats.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-07-07T10:13:05Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: How we get Forum stats into metrics&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot;, and download and import the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers and filter meego-commits out of our analysis, and group the messages by mailing list, to give the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
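&lt;br /&gt;
The loop above also requests dates that do not exist (such as 2011-02-31), which at best produce failed requests. A possible alternative, sketched here under the assumption that GNU date is available, walks the calendar one day at a time; the function name irclog_urls is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch: print one log URL per calendar day from $1 to $2 (inclusive,
# both YYYY-MM-DD). Relies on the -d option of GNU date.
irclog_urls() {
  day=$1
  # Stop once we reach the day after the end date.
  stop=$(date -u -d "$2 +1 day" +%Y-%m-%d)
  while [ "$day" != "$stop" ]; do
    echo "http://mg.pov.lt/meego-irclog/%23meego.$day.log"
    day=$(date -u -d "$day +1 day" +%Y-%m-%d)
  done
}

# Example: a range that crosses the end of February.
irclog_urls 2011-02-27 2011-03-02
```
&lt;br /&gt;
The resulting list can be handed to wget on standard input, for example with irclog_urls 2010-01-01 2011-12-31 | wget -i -&lt;br /&gt;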
&lt;br /&gt;
Thereafter, you can add the following line in your cron job to get the previous day's logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to &amp;quot;*.Y-m-d.\lo\g&amp;quot; to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse. Here is the appropriate sss.conf for #meego logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
#    This file contains all of sss' settings along with their defaults.&lt;br /&gt;
#    All values must be placed between double quotes, even empty ones.&lt;br /&gt;
#&lt;br /&gt;
#################################  Required  ###################################&lt;br /&gt;
&lt;br /&gt;
channel = &amp;quot;#meego&amp;quot;		# Name of the IRC channel.&lt;br /&gt;
timezone = &amp;quot;UTC&amp;quot;	# Timezone the logs are in. Used for time offset&lt;br /&gt;
			# calculations and conversions.&lt;br /&gt;
			# See http://php.net/manual/en/timezones.php&lt;br /&gt;
db_host = &amp;quot;host&amp;quot;		# IP address or FQDN of the MySQL server.&lt;br /&gt;
db_port = &amp;quot;3306&amp;quot;	# Port the MySQL server is listening on.&lt;br /&gt;
db_user = &amp;quot;db_user&amp;quot;	# MySQL user.&lt;br /&gt;
db_pass = &amp;quot;db_password&amp;quot;		# MySQL password.&lt;br /&gt;
db_name = &amp;quot;db_name&amp;quot;		# Name of the MySQL database used for sss.&lt;br /&gt;
parser = &amp;quot;parser_supybot&amp;quot;	# The parser to use depending on logfile format.&lt;br /&gt;
			# e.g. &amp;quot;parser_irssi&amp;quot; or &amp;quot;parser_eggdrop&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# This string contains the format of the date within a logfile filename.&lt;br /&gt;
# Examples:&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: *.Ymd&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: \#c\h\atroo\m.Ymd&lt;br /&gt;
#   filename: chatroom.log-31012003	dateformat: *.*-dmY&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.\g\z&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.*&lt;br /&gt;
# See http://php.net/date_create_from_format for more specific syntax options.&lt;br /&gt;
logfile_dateformat = &amp;quot;*.Y-m-d.\lo\g&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, by storing dates for which it has already parsed files in the table &amp;quot;parse_history&amp;quot;, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -c meego.conf -i ${logdir} -o report.html&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, put the following in a shell script, and call it from cron:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   logdir=/path/to/logfiles&lt;br /&gt;
   sssdir=/path/to/sss-4.0&lt;br /&gt;
&lt;br /&gt;
   /usr/bin/wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
   /usr/bin/php ${sssdir}/sss.php -c meego.conf -i ${logdir}/\#meego.${logdate}.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
The sss report is built from a number of queries. A number of other useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Activity by month'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select date_format(`date`, '%Y-%m') as `date`,&lt;br /&gt;
       sum(`l_total`) as `l_total`,&lt;br /&gt;
       sum(`l_night`) as `l_night`,&lt;br /&gt;
       sum(`l_morning`) as `l_morning`,&lt;br /&gt;
       sum(`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
       sum(`l_evening`) as `l_evening`&lt;br /&gt;
  from `channel`&lt;br /&gt;
  group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Activity by day over last 30 days'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select `date`, `l_total`, `l_night`, `l_morning`, `l_afternoon`, `l_evening` from `channel`&lt;br /&gt;
  where `date` &amp;gt; DATE_SUB(CURDATE(),INTERVAL 30 DAY);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
&lt;br /&gt;
Forum stats are available as a series of CSV files on [http://forums.meego.com/stats forums.meego.com], supplied monthly. We need to download the .csv files every month (just the latest ones), parse the CSV files into a database and generate a report from that.&lt;br /&gt;
&lt;br /&gt;
=== Downloading CSV files ===&lt;br /&gt;
&lt;br /&gt;
To get started and load up all of the old stats, run the following:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -nd -P &amp;lt;local dir for data&amp;gt; -r -l1 --no-parent -A.csv http://forum.meego.com/stats&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will download all CSV files to the local directory specified.&lt;br /&gt;
&lt;br /&gt;
For the monthly refresh, we use wget in a cron script as follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
y=`/bin/date -d &amp;quot;1 month ago&amp;quot; +%Y`&lt;br /&gt;
m=`/bin/date -d &amp;quot;1 month ago&amp;quot; +%m`&lt;br /&gt;
&lt;br /&gt;
wget -nd -P &amp;lt;local dir for data&amp;gt; -r -l1 --no-parent -A &amp;quot;${y}${m}*.csv&amp;quot; http://forum.meego.com/stats&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
That wget command line is worth explaining:&lt;br /&gt;
* -nd: Don't create the remote directory structure when downloading the files locally&lt;br /&gt;
* -P &amp;lt;directory&amp;gt;: Save the downloaded files under the directory specified&lt;br /&gt;
* -r: Recursively download&lt;br /&gt;
* -l1: Limit recursion to 1 level (combining -r and -l1 lets us fetch every file linked from the stats page in one run)&lt;br /&gt;
* --no-parent: Never ascend to the parent directory (the .. link) when recursing&lt;br /&gt;
* -A &amp;quot;${y}${m}*.csv&amp;quot;: Match filenames of the form &amp;quot;YYYYMM*.csv&amp;quot; - gives us the latest stats files only&lt;br /&gt;
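&lt;br /&gt;
One caveat with the date arithmetic: with GNU date, &amp;quot;1 month ago&amp;quot; can land in the wrong month when run on the last days of a month, because the day number overflows. That is harmless for a job that runs early in the month, but a variant anchored to the first of the month avoids the issue. This is a sketch assuming GNU date; the function name prev_month_prefix is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Sketch: YYYYMM prefix of the month before the given date (YYYY-MM-DD).
# The day is replaced by 01 before subtracting a month, so the result
# does not depend on which day of the month the job runs. Needs GNU date.
prev_month_prefix() {
  date -u -d "${1%-*}-01 -1 month" +%Y%m
}

# Example: a date at the very end of a month.
prev_month_prefix 2011-03-31
```
&lt;br /&gt;
In a cron script the argument would be today's date, and the result would replace the ${y}${m} part of the -A pattern.&lt;br /&gt;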
&lt;br /&gt;
=== Database schema ===&lt;br /&gt;
&lt;br /&gt;
We created 7 tables, one for each of the statistics provided by the forum.&lt;br /&gt;
&lt;br /&gt;
Here is the database schema:&lt;br /&gt;
&amp;lt;pre&amp;gt;--&lt;br /&gt;
-- Table structure for table `forum_cumulative_posts`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_cumulative_posts`;&lt;br /&gt;
CREATE TABLE `forum_cumulative_posts` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_cumulative_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_cumulative_threads`;&lt;br /&gt;
CREATE TABLE `forum_cumulative_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `threads` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_hottest_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_hottest_threads`;&lt;br /&gt;
CREATE TABLE `forum_hottest_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `title` varchar(255) DEFAULT NULL,&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_most_viewed_threads`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_most_viewed_threads`;&lt;br /&gt;
CREATE TABLE `forum_most_viewed_threads` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `title` varchar(255) DEFAULT NULL,&lt;br /&gt;
  `views` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_posts`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_posts`;&lt;br /&gt;
CREATE TABLE `forum_posts` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `forum` varchar(50) NOT NULL DEFAULT '',&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`forum`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_top_posters`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_top_posters`;&lt;br /&gt;
CREATE TABLE `forum_top_posters` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `member` varchar(50) DEFAULT NULL,&lt;br /&gt;
  `posts` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
--&lt;br /&gt;
-- Table structure for table `forum_top_thanked`&lt;br /&gt;
--&lt;br /&gt;
&lt;br /&gt;
DROP TABLE IF EXISTS `forum_top_thanked`;&lt;br /&gt;
CREATE TABLE `forum_top_thanked` (&lt;br /&gt;
  `month` int(11) NOT NULL,&lt;br /&gt;
  `year` int(11) NOT NULL,&lt;br /&gt;
  `rank` int(11) NOT NULL DEFAULT '0',&lt;br /&gt;
  `member` varchar(50) DEFAULT NULL,&lt;br /&gt;
  `thanks` int(11) DEFAULT NULL,&lt;br /&gt;
  PRIMARY KEY (`month`,`year`,`rank`)&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We import the files into the databases via &amp;lt;pre&amp;gt;LOAD DATA LOCAL INFILE&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd &amp;lt;local data directory&amp;gt;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_cumulative_posts.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_cumulative_posts.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_cumulative_posts.csv'&lt;br /&gt;
         INTO TABLE forum_cumulative_posts&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_cumulative_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_cumulative_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,threads&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_cumulative_threads.csv'&lt;br /&gt;
         INTO TABLE forum_cumulative_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, threads)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_forum_posts.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_forum_posts.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # forum,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_forum_posts.csv'&lt;br /&gt;
         INTO TABLE forum_posts&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (forum, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_hottest_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_hottest_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,title,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_hottest_threads.csv'&lt;br /&gt;
         INTO TABLE forum_hottest_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, title, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_most_viewed_threads.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_most_viewed_threads.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,title,views&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_most_viewed_threads.csv'&lt;br /&gt;
         INTO TABLE forum_most_viewed_threads&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, title, views)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_top_posters.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_top_posters.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,member,posts&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_top_posters.csv'&lt;br /&gt;
         INTO TABLE forum_top_posters&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, member, posts)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
if [ -f ${y}${m}_top_thanked.csv ]; then&lt;br /&gt;
  echo &amp;quot;Importing ${y}${m}_top_thanked.csv&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
  # rank,member,thanks&lt;br /&gt;
  mysql -u&amp;lt;db_user&amp;gt; -p&amp;lt;db_password&amp;gt; -h &amp;lt;host&amp;gt; &amp;lt;database&amp;gt; -e \&lt;br /&gt;
        &amp;quot;LOAD DATA LOCAL INFILE '${y}${m}_top_thanked.csv'&lt;br /&gt;
         INTO TABLE forum_top_thanked&lt;br /&gt;
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\&amp;quot;'&lt;br /&gt;
         IGNORE 1 LINES&lt;br /&gt;
         (rank, member, thanks)&lt;br /&gt;
         set year=$y, month=$m&amp;quot;&lt;br /&gt;
fi;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We run this script (which downloads and imports the .csv files from the server) monthly through cron. It is not clear at what time on the 1st the files are published, so I fetch them on the 2nd:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Retrieve forum stats monthly on 2nd of month&lt;br /&gt;
15 2 2 * * /home/dneary/bin/forum_stats.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-07-04T16:09:04Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Queries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and first_date&amp;lt;'2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers (anything dated before 1980), filter meego-commits out of our analysis, keep only messages sent before 1 May 2011, and group the messages by mailing list and month. This gives the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
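&lt;br /&gt;
As a rough illustration of why the numeric key is used for ordering, the sketch below rebuilds the two columns in the shell and sorts on the first one. It assumes GNU date, and the function name month_keys is made up for this example; it is not part of mlstats or of the report.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print "monthnum monthstr" for a date, mirroring the
# '%Y%m' and '%M %Y' date_format() columns of the query (GNU date assumed).
month_keys() {
  LC_ALL=C date -d "$1" +'%Y%m %B %Y'
}

# Sorting on the numeric key keeps the months in calendar order.
for day in 2011-02-10 2010-12-03 2011-01-21; do
  month_keys "$day"
done | sort -n
```
&lt;br /&gt;
Sorting on the month name instead would put April before January, which is why the label is kept only for display.&lt;br /&gt;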
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
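&lt;br /&gt;
The loop above also requests dates that do not exist, such as 31 February, which only produces failed requests. A possible refinement, assuming GNU date (the -d option is already used elsewhere on this page), is to generate only real calendar dates. The function name dates_in_year is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print every calendar date of a year as YYYY-MM-DD.
# Assumes GNU date for the relative "+1 month -1 day" arithmetic.
dates_in_year() {
  year=$1
  for m in $(seq -f '%02g' 1 12); do
    # Last day of the month: first of the month, plus a month, minus a day.
    last=$(date -d "$year-$m-01 +1 month -1 day" +%d)
    for d in $(seq -f '%02g' 1 "$last"); do
      echo "$year-$m-$d"
    done
  done
}
```
&lt;br /&gt;
Each printed date can then be substituted into the log URL in place of $y-$m-$d.&lt;br /&gt;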
&lt;br /&gt;
Thereafter, you can add the following line in your cron job to get the previous day's logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to &amp;quot;*.Y-m-d.\lo\g&amp;quot; to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse. Here is the appropriate sss.conf for #meego logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
#    This file contains all of sss' settings along with their defaults.&lt;br /&gt;
#    All values must be placed between double quotes, even empty ones.&lt;br /&gt;
#&lt;br /&gt;
#################################  Required  ###################################&lt;br /&gt;
&lt;br /&gt;
channel = &amp;quot;#meego&amp;quot;		# Name of the IRC channel.&lt;br /&gt;
timezone = &amp;quot;UTC&amp;quot;	# Timezone the logs are in. Used for time offset&lt;br /&gt;
			# calculations and conversions.&lt;br /&gt;
			# See http://php.net/manual/en/timezones.php&lt;br /&gt;
db_host = &amp;quot;host&amp;quot;		# IP address or FQDN of the MySQL server.&lt;br /&gt;
db_port = &amp;quot;3306&amp;quot;	# Port the MySQL server is listening on.&lt;br /&gt;
db_user = &amp;quot;db_user&amp;quot;	# MySQL user.&lt;br /&gt;
db_pass = &amp;quot;db_password&amp;quot;		# MySQL password.&lt;br /&gt;
db_name = &amp;quot;db_name&amp;quot;		# Name of the MySQL database used for sss.&lt;br /&gt;
parser = &amp;quot;parser_supybot&amp;quot;	# The parser to use depending on logfile format.&lt;br /&gt;
			# e.g. &amp;quot;parser_irssi&amp;quot; or &amp;quot;parser_eggdrop&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# This string contains the format of the date within a logfile filename.&lt;br /&gt;
# Examples:&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: *.Ymd&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: \#c\h\atroo\m.Ymd&lt;br /&gt;
#   filename: chatroom.log-31012003	dateformat: *.*-dmY&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.\g\z&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.*&lt;br /&gt;
# See http://php.net/date_create_from_format for more specific syntax options.&lt;br /&gt;
logfile_dateformat = &amp;quot;*.Y-m-d.\lo\g&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, by storing dates for which it has already parsed files in the table &amp;quot;parse_history&amp;quot;, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -c meego.conf -i ${logdir} -o report.html&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, put the following in a shell script, and call it from cron:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   logdir=/path/to/logfiles&lt;br /&gt;
   sssdir=/path/to/sss-4.0&lt;br /&gt;
&lt;br /&gt;
   /usr/bin/wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
   /usr/bin/php ${sssdir}/sss.php -c meego.conf -i ${logdir}/\#meego.${logdate}.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
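&lt;br /&gt;
If the download fails, the script above still calls sss with a file that is not there. One way to guard against that, sketched here with a hypothetical helper named run_if_downloaded, is to run the parser only when the log file exists and is not empty.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: run a command on a log file only if the file exists
# and is non-empty. The file name is appended as the last argument.
run_if_downloaded() {
  logfile=$1
  shift
  if [ -s "$logfile" ]; then
    "$@" "$logfile"
  else
    echo "skipping missing or empty log: $logfile"
    return 1
  fi
}

# Example call, reusing the variables from the nightly script:
# run_if_downloaded "${logdir}/#meego.${logdate}.log" \
#     /usr/bin/php "${sssdir}/sss.php" -c meego.conf -i
```
&lt;br /&gt;
The non-zero return value on a missing file lets cron mail the skip message if it is configured to report failures.&lt;br /&gt;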
&lt;br /&gt;
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
The sss report is built from a number of queries. A number of other useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Activity by month'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select date_format(`date`, '%Y-%m') as `date`,&lt;br /&gt;
       sum(`l_total`) as `l_total`,&lt;br /&gt;
       sum(`l_night`) as `l_night`,&lt;br /&gt;
       sum(`l_morning`) as `l_morning`,&lt;br /&gt;
       sum(`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
       sum(`l_evening`) as `l_evening`&lt;br /&gt;
  from `channel`&lt;br /&gt;
  group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Activity by day over last 30 days'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select `date`, `l_total`, `l_night`, `l_morning`, `l_afternoon`, `l_evening` from `channel`&lt;br /&gt;
  where `date` &amp;gt; DATE_SUB(CURDATE(),INTERVAL 30 DAY);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-07-04T16:08:50Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Queries */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and first_date&amp;lt;'2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers (anything dated before 1980), filter meego-commits out of our analysis, keep only messages sent before 1 May 2011, and group the messages by mailing list and month. This gives the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Thereafter, you can add the following line in your cron job to get the previous day's logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to &amp;quot;*.Y-m-d.\lo\g&amp;quot; to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse. Here is the appropriate sss.conf for #meego logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
#    This file contains all of sss' settings along with their defaults.&lt;br /&gt;
#    All values must be placed between double quotes, even empty ones.&lt;br /&gt;
#&lt;br /&gt;
#################################  Required  ###################################&lt;br /&gt;
&lt;br /&gt;
channel = &amp;quot;#meego&amp;quot;		# Name of the IRC channel.&lt;br /&gt;
timezone = &amp;quot;UTC&amp;quot;	# Timezone the logs are in. Used for time offset&lt;br /&gt;
			# calculations and conversions.&lt;br /&gt;
			# See http://php.net/manual/en/timezones.php&lt;br /&gt;
db_host = &amp;quot;host&amp;quot;		# IP address or FQDN of the MySQL server.&lt;br /&gt;
db_port = &amp;quot;3306&amp;quot;	# Port the MySQL server is listening on.&lt;br /&gt;
db_user = &amp;quot;db_user&amp;quot;	# MySQL user.&lt;br /&gt;
db_pass = &amp;quot;db_password&amp;quot;		# MySQL password.&lt;br /&gt;
db_name = &amp;quot;db_name&amp;quot;		# Name of the MySQL database used for sss.&lt;br /&gt;
parser = &amp;quot;parser_supybot&amp;quot;	# The parser to use depending on logfile format.&lt;br /&gt;
			# e.g. &amp;quot;parser_irssi&amp;quot; or &amp;quot;parser_eggdrop&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# This string contains the format of the date within a logfile filename.&lt;br /&gt;
# Examples:&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: *.Ymd&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: \#c\h\atroo\m.Ymd&lt;br /&gt;
#   filename: chatroom.log-31012003	dateformat: *.*-dmY&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.\g\z&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.*&lt;br /&gt;
# See http://php.net/date_create_from_format for more specific syntax options.&lt;br /&gt;
logfile_dateformat = &amp;quot;*.Y-m-d.\lo\g&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, by storing dates for which it has already parsed files in the table &amp;quot;parse_history&amp;quot;, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -c meego.conf -i ${logdir} -o report.html&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, put the following in a shell script, and call it from cron:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   logdir=/path/to/logfiles&lt;br /&gt;
   sssdir=/path/to/sss-4.0&lt;br /&gt;
&lt;br /&gt;
   /usr/bin/wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
   /usr/bin/php ${sssdir}/sss.php -c meego.conf -i ${logdir}/\#meego.${logdate}.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
The sss report is built from a number of queries. A number of other useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Activity by month'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  select date_format(`date`, '%Y-%m') as `date`,&lt;br /&gt;
       sum(`l_total`) as `l_total`,&lt;br /&gt;
       sum(`l_night`) as `l_night`,&lt;br /&gt;
       sum(`l_morning`) as `l_morning`,&lt;br /&gt;
       sum(`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
       sum(`l_evening`) as `l_evening`&lt;br /&gt;
  from `channel`&lt;br /&gt;
  group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Activity by day over last 30 days'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select `date`, `l_total`, `l_night`, `l_morning`, `l_afternoon`, `l_evening` from `channel` where `date` &amp;gt; DATE_SUB(CURDATE(),INTERVAL 30 DAY);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-07-04T16:07:55Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* IRC - SuperSeriousStats */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date)		 &amp;gt; 1979 and &lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and first_date&amp;lt;'2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. We eliminate all emails with bad Date headers (anything dated before 1980), filter meego-commits out of our analysis, keep only messages sent before 1 May 2011, and group the messages by mailing list and month. This gives the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Thereafter, you can add the following line in your cron job to get the previous day's logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to &amp;quot;*.Y-m-d.\lo\g&amp;quot; to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse. Here is the appropriate sss.conf for #meego logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
#    This file contains all of sss' settings along with their defaults.&lt;br /&gt;
#    All values must be placed between double quotes, even empty ones.&lt;br /&gt;
#&lt;br /&gt;
#################################  Required  ###################################&lt;br /&gt;
&lt;br /&gt;
channel = &amp;quot;#meego&amp;quot;		# Name of the IRC channel.&lt;br /&gt;
timezone = &amp;quot;UTC&amp;quot;	# Timezone the logs are in. Used for time offset&lt;br /&gt;
			# calculations and conversions.&lt;br /&gt;
			# See http://php.net/manual/en/timezones.php&lt;br /&gt;
db_host = &amp;quot;host&amp;quot;		# IP address or FQDN of the MySQL server.&lt;br /&gt;
db_port = &amp;quot;3306&amp;quot;	# Port the MySQL server is listening on.&lt;br /&gt;
db_user = &amp;quot;db_user&amp;quot;	# MySQL user.&lt;br /&gt;
db_pass = &amp;quot;db_password&amp;quot;		# MySQL password.&lt;br /&gt;
db_name = &amp;quot;db_name&amp;quot;		# Name of the MySQL database used for sss.&lt;br /&gt;
parser = &amp;quot;parser_supybot&amp;quot;	# The parser to use depending on logfile format.&lt;br /&gt;
			# e.g. &amp;quot;parser_irssi&amp;quot; or &amp;quot;parser_eggdrop&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# This string contains the format of the date within a logfile filename.&lt;br /&gt;
# Examples:&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: *.Ymd&lt;br /&gt;
#   filename: #chatroom.20030131	dateformat: \#c\h\atroo\m.Ymd&lt;br /&gt;
#   filename: chatroom.log-31012003	dateformat: *.*-dmY&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.\g\z&lt;br /&gt;
#   filename: chatroom.log-31012003.gz	dateformat: *.*-dmY.*&lt;br /&gt;
# See http://php.net/date_create_from_format for more specific syntax options.&lt;br /&gt;
logfile_dateformat = &amp;quot;*.Y-m-d.\lo\g&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, by storing dates for which it has already parsed files in the table &amp;quot;parse_history&amp;quot;, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -c meego.conf -i ${logdir} -o report.html&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, put the following in a shell script, and call it from cron:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   logdir=/path/to/logfiles&lt;br /&gt;
   sssdir=/path/to/sss-4.0&lt;br /&gt;
&lt;br /&gt;
   /usr/bin/wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
   /usr/bin/php ${sssdir}/sss.php -c meego.conf -i ${logdir}/\#meego.${logdate}.log&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
The sss report is built from a number of queries. A number of other useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''Activity by month'''&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_format(`date`, '%Y-%m') as `date`, sum(`l_total`) as `l_total`, sum(`l_night`) as `l_night`, sum(`l_morning`) as `l_morning`, sum(`l_afternoon`) as `l_afternoon`, sum(`l_evening`) as `l_evening` from `channel` group by year(`date`), month(`date`);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
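&lt;br /&gt;
For readers who prefer to post-process daily rows outside the database, the grouping done by the query above can be sketched in Python. This is only an illustration under assumptions: the rows are taken to be (date string, l_total) pairs already fetched from the channel table, the sample values are invented, and the function name lines_by_month is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import defaultdict


def lines_by_month(daily_rows):
    """Sum daily line counts into 'YYYY-MM' buckets."""
    totals = defaultdict(int)
    for day, l_total in daily_rows:
        # day is an ISO date string such as '2011-06-15'
        totals[day[:7]] += l_total
    return dict(sorted(totals.items()))


# Hypothetical sample rows, not real channel data.
sample = [("2011-05-30", 120), ("2011-05-31", 80), ("2011-06-01", 50)]
print(lines_by_month(sample))
```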
&lt;br /&gt;
'''Activity by day over last 30 days'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select `date`, `l_total`, `l_night`, `l_morning`, `l_afternoon`, `l_evening` from `channel` where `date` &amp;gt; DATE_SUB(CURDATE(),INTERVAL 30 DAY);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics</id>
		<title>Metrics</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics"/>
				<updated>2011-06-29T12:32:14Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Status Update ==&lt;br /&gt;
&lt;br /&gt;
Metrics for April are posted below [[User:Dawnfoster|Dawnfoster]] 23:56, 10 May 2011 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Monthly Metrics ==&lt;br /&gt;
More details than you could possibly want to know about the activity in the MeeGo community.&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_April_2011.pdf April 2011 (PDF)] or read the [https://meego.com/community/blogs/dawnfoster/2011/meego-community-update-and-metrics-april blog post summary].&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_March_2011.pdf March 2011 (PDF)] or read the [https://meego.com/community/blogs/dawnfoster/2011/meego-community-update-and-metrics-march blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_February_2011.pdf February 2011 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2011/meego-community-update-and-metrics-february blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_January_2011.pdf January 2011 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2011/meego-community-update-and-metrics-january blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_December_2010.pdf December 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2011/meego-community-update-and-metrics-december blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_November_2010.pdf November 2010 (PDF)]  or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-update-and-metrics-november blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_October_2010.pdf October 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-update-and-metrics-october blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_September_2010.pdf September 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-update-and-metrics-september blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_August_2010.pdf August 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-update-and-metrics-august blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_July_2010.pdf July 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-update-and-metrics-july blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_June_2010.pdf June 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-metrics-june blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_May_2010.pdf May 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-metrics-may blog post summary]&lt;br /&gt;
* [http://wiki.meego.com/images/MeeGo_Community_Metrics_April_2010.pdf April 2010 (PDF)] or read the [http://meego.com/community/blogs/dawnfoster/2010/meego-community-metrics-april blog post summary]&lt;br /&gt;
&lt;br /&gt;
== Community Health Metrics ==&lt;br /&gt;
&lt;br /&gt;
My approach to metrics is to focus on measuring the things that help indicate whether the community is healthy and growing and identify any issues early. I have an internal deliverable to provide a monthly report in the first week of every month to my management team, and my plan is to deliver a similar report for the community. I also try to focus on picking the things that have the most impact rather than trying to measure everything. -- [http://wiki.meego.com/User:Dawnfoster Dawn Foster]&lt;br /&gt;
&lt;br /&gt;
The metrics that are currently being gathered are organized into four categories:&lt;br /&gt;
* '''Awareness''': Finding the community and visiting it to learn more (potential community members)&lt;br /&gt;
* '''Membership''': Taking the time to sign up and join the community (passive and active community members)&lt;br /&gt;
* '''Engagement''': Interacting with other community members (active community members)&lt;br /&gt;
* '''Development''': Engaging with the MeeGo project at a code level (developers)&lt;br /&gt;
&lt;br /&gt;
Note: [http://wiki.meego.com/User:Dawnfoster Dawn Foster] is currently gathering the metrics listed in this section.&lt;br /&gt;
&lt;br /&gt;
=== Awareness Metrics ===&lt;br /&gt;
&lt;br /&gt;
* Website Visits (source Google Analytics)&lt;br /&gt;
* Unique Visitors (source Google Analytics)&lt;br /&gt;
* Page Views (source Google Analytics)&lt;br /&gt;
* Social Media Posts (source Radian6)&lt;br /&gt;
&lt;br /&gt;
=== Membership Metrics ===&lt;br /&gt;
&lt;br /&gt;
* MeeGo.com Total Members (source Drupal)&lt;br /&gt;
* MeeGo-Dev ML Subscribers (source mailing lists)&lt;br /&gt;
* MeeGo-Community ML Subscribers (source mailing lists)&lt;br /&gt;
* MeeGo-SDK ML Subscribers (source mailing lists)&lt;br /&gt;
* MeeGo L10N ML subscribers (source mailing lists)&lt;br /&gt;
&lt;br /&gt;
=== Engagement Metrics ===&lt;br /&gt;
&lt;br /&gt;
* MeeGo-Dev ML Messages (source mailing lists)&lt;br /&gt;
* MeeGo-Community ML Messages (source mailing lists)&lt;br /&gt;
* MeeGo-SDK ML Messages (source mailing lists)&lt;br /&gt;
* MeeGo-iL10n ML messages (source mailing lists)&lt;br /&gt;
* MeeGo-Dev People with 2+ messages (source mailing lists / mlstats)&lt;br /&gt;
* MeeGo-Community People with 2+ messages (source mailing lists / mlstats)&lt;br /&gt;
* Forum Threads (source vBulletin)&lt;br /&gt;
* Blog Comments (source Drupal)&lt;br /&gt;
* Wiki Content Pages (source MediaWiki)&lt;br /&gt;
* Wiki Page Edits (source MediaWiki)&lt;br /&gt;
* Overall MeeGo Translation in the 16 POR languages (source Margie)&lt;br /&gt;
* Community-sourced translations (beyond the POR 16) that are at 50% complete or above (source Margie)&lt;br /&gt;
* Active language teams (source Margie)&lt;br /&gt;
* Overall MeeGo Translation in the 37 languages (source Margie)&lt;br /&gt;
* IRC top contributors and common words used (source irssistats from Stskeeps)&lt;br /&gt;
&lt;br /&gt;
=== Code Metrics ===&lt;br /&gt;
* Downloads (source Anas)&lt;br /&gt;
* Commits (source Gitorious)&lt;br /&gt;
* New bugs (source Bugzilla)&lt;br /&gt;
* Closed bugs (source Bugzilla)&lt;br /&gt;
* Total bugs (source Bugzilla)&lt;br /&gt;
&lt;br /&gt;
=== Content Analysis ===&lt;br /&gt;
* Most popular mailing list posts across each list (source mailing lists / mlstats)&lt;br /&gt;
* Most popular forum threads (source vBulletin)&lt;br /&gt;
&lt;br /&gt;
== Metrics Dashboard Automation Project ==&lt;br /&gt;
&lt;br /&gt;
* Lead: [[User:Dneary|Dave Neary]]&lt;br /&gt;
* Help from: [[User:Dawnfoster|Dawn Foster]]&lt;br /&gt;
* See [[/Dashboard]] for details.&lt;br /&gt;
&lt;br /&gt;
== Biggest Gaps We Need to Fill Now ==&lt;br /&gt;
* Add Community OBS Metrics - work with lbt.&lt;br /&gt;
* Add Transifex (localization) metrics - work with Dmitry&lt;br /&gt;
* Add Download numbers (including mirrors) - work with Anas / Adam&lt;br /&gt;
&lt;br /&gt;
== Future Improvement Ideas ==&lt;br /&gt;
This is a great way to help out on the MeeGo project - anyone can contribute to the metrics! &lt;br /&gt;
&lt;br /&gt;
If you are starting a task, please add your name next to it to claim your task to avoid duplication (we don't want multiple people working on the same task without coordinating).&lt;br /&gt;
&lt;br /&gt;
'''Dashboard''' - high priority&lt;br /&gt;
* An automated dashboard that people can access at any time gathering some of these metrics. &lt;br /&gt;
* We can start with some of the easy ones to gather and add more over time.&lt;br /&gt;
* Would be great to cut this by month for easy report generation.&lt;br /&gt;
* Note: see the [[#Points to consider | points to consider]] section below for some additional ideas&lt;br /&gt;
&lt;br /&gt;
'''Other metrics that people may want to capture'''&lt;br /&gt;
* Software packaging&lt;br /&gt;
** New packages added to repository&lt;br /&gt;
** Updates to existing packages&lt;br /&gt;
** Number of developers/maintainers uploading packages&lt;br /&gt;
* Server performance and maintenance metrics&lt;br /&gt;
** Server use levels&lt;br /&gt;
** Network bandwidth&lt;br /&gt;
** Downtime and error reports&lt;br /&gt;
** OS and firmware patching&lt;br /&gt;
* bugzilla reports&lt;br /&gt;
** Fixed bug reports&lt;br /&gt;
** Unfixed closed bugreports (dups, not a bug, etc)&lt;br /&gt;
** Upstreamed bug reports (MeeGo bug matched with upstream bug report)&lt;br /&gt;
* various open source aspects&lt;br /&gt;
** Activity of Intel &amp;amp; Nokia developers vs activity of &amp;quot;NIN&amp;quot; (Not Intel/Nokia) developers&lt;br /&gt;
** Number of NIN committers, number of MeeGo specific modules with NIN committers.&lt;br /&gt;
** Not sure what you might want to quantitatively measure here - qualitative is hard to automate. See Siobhan O'Mahony's reports for Eclipse &amp;amp; various Eclipse Foundation reports for inspiration [[User:Dneary|Dneary]] 10:00, 10 March 2010 (UTC)&lt;br /&gt;
* membership / user accounts&lt;br /&gt;
** member satisfaction&lt;br /&gt;
** member recognition&lt;br /&gt;
** Not sure what you might want to quantitatively measure here - qualitative is hard to automate. [[User:Dneary|Dneary]] 10:00, 10 March 2010 (UTC)&lt;br /&gt;
* Planet MeeGo&lt;br /&gt;
** Blog posts&lt;br /&gt;
** Voting activity&lt;br /&gt;
** Reactions to posts from outside meego.com&lt;br /&gt;
&lt;br /&gt;
'''Alerts'''&lt;br /&gt;
* Planned events&lt;br /&gt;
** Infrastructure service outages&lt;br /&gt;
* Unplanned events&lt;br /&gt;
** Capacity shortages (monitor storage and throughput; report on threshold breaches)&lt;br /&gt;
** Attacks on infrastructure&lt;br /&gt;
&lt;br /&gt;
== Points to consider ==&lt;br /&gt;
&lt;br /&gt;
* Reporting solution&lt;br /&gt;
** [http://www.slideshare.net/OpenLogic/open-source-reporting-tool-comparison A comparison of open source solutions] - compares [http://www.jaspersoft.com/ Jasper Reports], [http://www.pentaho.com/ Pentaho], [http://www.eclipse.org/birt/phoenix/ BIRT] and more&lt;br /&gt;
** [http://cricket.sourceforge.net/ Cricket] looks promising&lt;br /&gt;
** Direct graphing with [http://www.graphviz.org/ graphviz] might be simplest&lt;br /&gt;
** [http://developmentseed.org/blog/2010/apr/20/world-bank-open-data-initiative-launched-on-drupal Drupal solution example]&lt;br /&gt;
* Report sharing methods&lt;br /&gt;
** Dashboard(s) for ongoing data reports (server health, member stats, bugs, app stats, etc)&lt;br /&gt;
** Voluntary subscriptions (ie, RSS) for periodic public reports&lt;br /&gt;
* Public vs Private data&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* [http://dash.eclipse.org/dash/commits/web-app/ Eclipse Foundation dashboards]&lt;br /&gt;
* [http://www.artofcommunityonline.org The Art of Community] - Chapter 7: Measuring community&lt;br /&gt;
* [http://www.gitorious.org/mining-tools/gitdm gitdm] - Git data miner&lt;br /&gt;
* [http://www.horsepigcow.com/2007/10/metrics-for-healthy-communities/ Metrics for healthy communities] - Tara Hunt&lt;br /&gt;
* [http://spreadloveproject.pbworks.com/CommunityMeasurements Collection of metrics]&lt;br /&gt;
* [http://tools.libresoft.es/ Tools] for analysing CVS/SVN, mailing lists, social networks and more. Community metrics tools mecca.&lt;br /&gt;
&lt;br /&gt;
== Contributors ==&lt;br /&gt;
&lt;br /&gt;
* Coordinator: [http://meego.com/users/dawnfoster Dawn Foster/dawnfoster] - Intel's MeeGo community manager. &lt;br /&gt;
&lt;br /&gt;
* [http://meego.com/users/dneary Dave Neary/dneary] - Maemo docmaster, I help grow free software communities.&lt;br /&gt;
&lt;br /&gt;
* [http://meego.com/users/bergie Henri Bergius/bergie] - helping to automate the metrics.&lt;br /&gt;
&lt;br /&gt;
== Additional Helpful Details on Specific Metrics ==&lt;br /&gt;
&lt;br /&gt;
'''mlstats queries:'''&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject,year(first_date) as y, monthname(first_date),count(*) as c from messages group by subject, month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Count the number of posts for each person&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address,year(m.first_date) as y, monthname(m.first_date),count(*) as c from messages as m,messages_people as p where m.message_id=p.message_ID group by p.email_address, month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give month-by-month totals. Because they group only by month(first_date), the same month in different years is merged; add year(first_date) to the group by clause to keep the years separate.&lt;br /&gt;
*: The easiest way to restrict a query to one month is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
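&lt;br /&gt;
The effect of grouping on both year and month can be tried out without an mlstats database. The sketch below uses an in-memory SQLite table as a stand-in for the MySQL messages table, so it uses strftime() where the queries above use year() and month(); the table is reduced to two columns and the rows are invented for the example.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table messages (subject text, first_date text)")
# Hypothetical sample rows: two in March 2010, one in March 2011.
conn.executemany(
    "insert into messages values (?, ?)",
    [
        ("Hello", "2010-03-01 10:00:00"),
        ("Hello", "2010-03-02 11:00:00"),
        ("Hello", "2011-03-05 09:00:00"),
    ],
)
# Group on year and month so March 2010 and March 2011 stay separate.
rows = conn.execute(
    "select subject, strftime('%Y', first_date) as y,"
    " strftime('%m', first_date) as m, count(*) as c"
    " from messages group by subject, y, m order by y, m, c"
).fetchall()
print(rows)
```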
&lt;br /&gt;
'''Bugzilla:'''&lt;br /&gt;
* A Bugzilla report that shows new bugs, closed bugs, and total bugs (or instructions for creating this report)&lt;br /&gt;
*: Standard Bugzilla search queries &amp;amp; reports can give counts for this. Using relative dates in the date fields you can get all changes from one month ago (set date to -1m) or one week ago (-1w).&lt;br /&gt;
** [http://is.gd/bOO6E New bugs this month, by product, and their current states] (shortened link)&lt;br /&gt;
** [http://bugs.meego.com/report.cgi?x_axis_field=&amp;amp;y_axis_field=product&amp;amp;z_axis_field=&amp;amp;query_format=report-table&amp;amp;short_desc_type=allwordssubstr&amp;amp;short_desc=&amp;amp;longdesc_type=allwordssubstr&amp;amp;longdesc=&amp;amp;bug_file_loc_type=allwordssubstr&amp;amp;bug_file_loc=&amp;amp;deadlinefrom=&amp;amp;deadlineto=&amp;amp;bug_status=RESOLVED&amp;amp;bug_status=RELEASED&amp;amp;bug_status=VERIFIED&amp;amp;bug_status=CLOSED&amp;amp;emailassigned_to1=1&amp;amp;emailtype1=substring&amp;amp;email1=&amp;amp;emailassigned_to2=1&amp;amp;emailreporter2=1&amp;amp;emailqa_contact2=1&amp;amp;emailcc2=1&amp;amp;emailtype2=substring&amp;amp;email2=&amp;amp;bugidtype=include&amp;amp;bug_id=&amp;amp;chfieldfrom=-1M&amp;amp;chfieldto=Now&amp;amp;chfield=bug_status&amp;amp;chfieldvalue=&amp;amp;format=table&amp;amp;action=wrap&amp;amp;field0-0-0=noop&amp;amp;type0-0-0=noop&amp;amp;value0-0-0= Resolved &amp;amp; closed bugs in the past month] (Status changed in the past month, and current status is one of resolved, released, verified or closed - not sure if this is what you want)&lt;br /&gt;
** There are lots of variations - play with [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more.&lt;br /&gt;
** Get the SQL queries which generate these reports by manually adding &amp;quot;&amp;amp;debug=1&amp;quot; to the URL&lt;br /&gt;
*: This might be easiest to automate as a Bugzilla extension querying the BZ database directly. Some potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
'''MediaWiki'''&lt;br /&gt;
* A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
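&lt;br /&gt;
One way to fill in the table names is sketched below in Python, using string.Template because the placeholders in the query already use the dollar-name form. The prefix value mw_ and the function name user_score_sql are assumptions made for the example; use the value of $wgDBprefix from your own LocalSettings.php.&lt;br /&gt;
&lt;br /&gt;
```python
from string import Template

QUERY = Template(
    "SELECT COUNT(wr.rev_id) as value, "
    "COUNT(DISTINCT wr.rev_page) as page_value, "
    "wu.user_name as name, wu.user_real_name as real_name "
    "FROM $user wu, $revision wr, $page wp "
    "WHERE wu.user_id = wr.rev_user "
    "and wp.page_id = wr.rev_page "
    "and wp.page_namespace = 0 "
    "GROUP BY wu.user_name ORDER BY value desc"
)


def user_score_sql(prefix):
    """Fill in the table names for a given table prefix."""
    return QUERY.substitute(
        user=prefix + "user",
        revision=prefix + "revision",
        page=prefix + "page",
    )


print(user_score_sql("mw_"))
```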
&lt;br /&gt;
'''IRC'''&lt;br /&gt;
&lt;br /&gt;
Stskeeps has started running monthly IRC reports:&lt;br /&gt;
* [http://www.daimi.au.dk/~cvm/data/irssistats.html February]&lt;br /&gt;
* [http://www.daimi.au.dk/~cvm/data/irssistats-march.html March]&lt;br /&gt;
* [http://www.daimi.au.dk/~cvm/data/irssistats-april.html April]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Quality/UpstreamBugTrackers</id>
		<title>Quality/UpstreamBugTrackers</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Quality/UpstreamBugTrackers"/>
				<updated>2011-06-23T14:15:35Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Mapping */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Practice ==&lt;br /&gt;
&lt;br /&gt;
If a bug in code or in translations exists that comes from an &amp;quot;upstream&amp;quot; project and that was not changed in (&amp;quot;downstream&amp;quot;) MeeGo it is common practice to forward this bug report to the corresponding upstream bugtracker.&lt;br /&gt;
&lt;br /&gt;
This makes upstream developers / translators become aware of issues (that were found in MeeGo but affect any other downstream projects too) so they can get fixed upstream. Patches are of course also welcome to be attached in an upstream report.&lt;br /&gt;
&lt;br /&gt;
Fixing issues upstream reduces the effort of maintaining downstream patches. Before filing a new report, please search the upstream bugtracker for an existing report covering the issue.&lt;br /&gt;
&lt;br /&gt;
In case an upstream bug has been filed or has been found, add the URL to the field &amp;quot;See Also: Add Bug URLs:&amp;quot; in the report in bugs.meego.com, and vice versa if possible. CC yourself to the upstream bug report to keep track and update the report in bugs.meego.com once it is fixed upstream.&lt;br /&gt;
&lt;br /&gt;
== Mapping ==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Upstream Bug Trackers&lt;br /&gt;
!rowspan=&amp;quot;2&amp;quot;|MeeGo Components&lt;br /&gt;
!colspan=&amp;quot;3&amp;quot;|Upstream&lt;br /&gt;
|-&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
! Product for Translation Issues&lt;br /&gt;
|-&lt;br /&gt;
| Alsa&lt;br /&gt;
| Alsa&lt;br /&gt;
| https://bugtrack.alsa-project.org/alsa-bug/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| Cairo, Exempi, Xorg, PackageKit&lt;br /&gt;
| FreeDesktop.org&lt;br /&gt;
| https://bugs.freedesktop.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| Giflib&lt;br /&gt;
| Giflib&lt;br /&gt;
| http://sourceforge.net/tracker/?group_id=102202&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Banshee, librest, libsocialweb, gnome-packagekit&lt;br /&gt;
| Gnome&lt;br /&gt;
| https://bugzilla.gnome.org/&lt;br /&gt;
| L10N&lt;br /&gt;
|-&lt;br /&gt;
| Icu&lt;br /&gt;
| Icu&lt;br /&gt;
| http://bugs.icu-project.org&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| KDE&lt;br /&gt;
| KDE&lt;br /&gt;
| https://bugs.kde.org/&lt;br /&gt;
| i18n&lt;br /&gt;
|-&lt;br /&gt;
| Fennec&lt;br /&gt;
| Mozilla&lt;br /&gt;
| https://bugzilla.mozilla.org/&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Iptables&lt;br /&gt;
| Netfilter/Iptables&lt;br /&gt;
| http://bugzilla.netfilter.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| ntp&lt;br /&gt;
| ntp&lt;br /&gt;
| http://bugs.ntp.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| PulseAudio&lt;br /&gt;
| PulseAudio&lt;br /&gt;
| http://pulseaudio.org/report&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Qt&lt;br /&gt;
| Qt&lt;br /&gt;
| http://bugreports.qt.nokia.com&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| sudo&lt;br /&gt;
| sudo&lt;br /&gt;
| http://www.sudo.ws/bugs/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| SWI-Prolog&lt;br /&gt;
| SWI-Prolog&lt;br /&gt;
| http://www.swi-prolog.org/bugzilla/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| WebKit, WebCore, CoreJS&lt;br /&gt;
| WebKit&lt;br /&gt;
| https://bugs.webkit.org/&lt;br /&gt;
| &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
If the component you're looking for isn't listed here, please add it. Possible places to find further bug trackers include:&lt;br /&gt;
* Ubuntu: https://launchpad.net/bugs/bugtrackers/&lt;br /&gt;
* Gnome: http://live.gnome.org/Bugsquad/TriageGuide/NonGnome&lt;br /&gt;
&lt;br /&gt;
== Other bug trackers ==&lt;br /&gt;
&lt;br /&gt;
=== Components related to MeeGo ===&lt;br /&gt;
&lt;br /&gt;
For components that are related to MeeGo but are not part of MeeGo.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Other Bug Trackers&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
|-&lt;br /&gt;
| Peregrine&lt;br /&gt;
| http://bugs.peregrine-communicator.org/  &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Downstream MeeGo products ===&lt;br /&gt;
&lt;br /&gt;
For products that are based on MeeGo.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Other Bug Trackers&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
|-&lt;br /&gt;
| Nokia MeeGo 1.2 Harmattan &lt;br /&gt;
| http://www.developer.nokia.com/bugs/&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[Category:QA]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Quality/UpstreamBugTrackers</id>
		<title>Quality/UpstreamBugTrackers</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Quality/UpstreamBugTrackers"/>
				<updated>2011-06-23T14:15:18Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Mapping */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Practice ==&lt;br /&gt;
&lt;br /&gt;
If a bug in code or in translations exists that comes from an &amp;quot;upstream&amp;quot; project and that was not changed in (&amp;quot;downstream&amp;quot;) MeeGo it is common practice to forward this bug report to the corresponding upstream bugtracker.&lt;br /&gt;
&lt;br /&gt;
This makes upstream developers / translators become aware of issues (that were found in MeeGo but affect any other downstream projects too) so they can get fixed upstream. Patches are of course also welcome to be attached in an upstream report.&lt;br /&gt;
&lt;br /&gt;
Fixing issues upstream reduces the effort of maintaining downstream patches. Before filing a new report, please search the upstream bugtracker for an existing report covering the issue.&lt;br /&gt;
&lt;br /&gt;
In case an upstream bug has been filed or has been found, add the URL to the field &amp;quot;See Also: Add Bug URLs:&amp;quot; in the report in bugs.meego.com, and vice versa if possible. CC yourself to the upstream bug report to keep track and update the report in bugs.meego.com once it is fixed upstream.&lt;br /&gt;
&lt;br /&gt;
== Mapping ==&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Upstream Bug Trackers&lt;br /&gt;
!rowspan=&amp;quot;2&amp;quot;|MeeGo Components&lt;br /&gt;
!colspan=&amp;quot;3&amp;quot;|Upstream&lt;br /&gt;
|-&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
! Product for Translation Issues&lt;br /&gt;
|-&lt;br /&gt;
| Alsa&lt;br /&gt;
| Alsa&lt;br /&gt;
| https://bugtrack.alsa-project.org/alsa-bug/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| Cairo, Exempi, Xorg&lt;br /&gt;
| FreeDesktop.org&lt;br /&gt;
| https://bugs.freedesktop.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| Giflib&lt;br /&gt;
| Giflib&lt;br /&gt;
| http://sourceforge.net/tracker/?group_id=102202&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Banshee, librest, libsocialweb, gnome-packagekit&lt;br /&gt;
| Gnome&lt;br /&gt;
| https://bugzilla.gnome.org/&lt;br /&gt;
| L10N&lt;br /&gt;
|-&lt;br /&gt;
| Icu&lt;br /&gt;
| Icu&lt;br /&gt;
| http://bugs.icu-project.org&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| KDE&lt;br /&gt;
| KDE&lt;br /&gt;
| https://bugs.kde.org/&lt;br /&gt;
| i18n&lt;br /&gt;
|-&lt;br /&gt;
| Fennec&lt;br /&gt;
| Mozilla&lt;br /&gt;
| https://bugzilla.mozilla.org/&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Iptables&lt;br /&gt;
| Netfilter/Iptables&lt;br /&gt;
| http://bugzilla.netfilter.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| ntp&lt;br /&gt;
| ntp&lt;br /&gt;
| http://bugs.ntp.org/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| PulseAudio&lt;br /&gt;
| PulseAudio&lt;br /&gt;
| http://pulseaudio.org/report&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| Qt&lt;br /&gt;
| Qt&lt;br /&gt;
| http://bugreports.qt.nokia.com&lt;br /&gt;
| &lt;br /&gt;
|-&lt;br /&gt;
| sudo&lt;br /&gt;
| sudo&lt;br /&gt;
| http://www.sudo.ws/bugs/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| SWI-Prolog&lt;br /&gt;
| SWI-Prolog&lt;br /&gt;
| http://www.swi-prolog.org/bugzilla/&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
| WebKit, WebCore, CoreJS&lt;br /&gt;
| WebKit&lt;br /&gt;
| https://bugs.webkit.org/&lt;br /&gt;
| &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
If the component you're looking for isn't listed here, please add it. Possible places to find further bug trackers include:&lt;br /&gt;
* Ubuntu: https://launchpad.net/bugs/bugtrackers/&lt;br /&gt;
* Gnome: http://live.gnome.org/Bugsquad/TriageGuide/NonGnome&lt;br /&gt;
&lt;br /&gt;
== Other bug trackers ==&lt;br /&gt;
&lt;br /&gt;
=== Components related to MeeGo ===&lt;br /&gt;
&lt;br /&gt;
For components that are related to MeeGo but are not part of MeeGo.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Other Bug Trackers&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
|-&lt;br /&gt;
| Peregrine&lt;br /&gt;
| http://bugs.peregrine-communicator.org/  &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Downstream MeeGo products ===&lt;br /&gt;
&lt;br /&gt;
For products that are based on MeeGo.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; + Other Bug Trackers&lt;br /&gt;
! Project&lt;br /&gt;
! Bugtracker URL&lt;br /&gt;
|-&lt;br /&gt;
| Nokia MeeGo 1.2 Harmattan &lt;br /&gt;
| http://www.developer.nokia.com/bugs/&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[Category:QA]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-14T22:03:35Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Add SSS import &amp;amp; parsing to metrics &amp;quot;Gathering data&amp;quot; page. Next up: mediawiki&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists - MLStats ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, patched it with the patch above, and installed it with &amp;lt;pre&amp;gt;setup.py install --prefix=/install/path&amp;lt;/pre&amp;gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
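&lt;br /&gt;
The same loop can be sketched in Python, which may be convenient if the import is driven from a larger script. The sketch only builds and prints the command lines and does not run mlstats; the function name mlstats_command and the credentials USERNAME and PASSWORD are placeholders, and only the three options already shown above are used.&lt;br /&gt;
&lt;br /&gt;
```python
ARCHIVE_ROOT = "http://lists.meego.com/pipermail/"


def mlstats_command(list_name, db_user, db_password, mlstats="/path/to/mlstats"):
    """Build the argument list for one mlstats import run."""
    return [
        mlstats,
        "--no-report",
        "--db-user=" + db_user,
        "--db-password=" + db_password,
        ARCHIVE_ROOT + list_name,
    ]


# Placeholder credentials; replace them before running anything.
for name in ["meego-announce", "meego-dev", "meego-community"]:
    print(" ".join(mlstats_command(name, "USERNAME", "PASSWORD")))
```
&lt;br /&gt;
Each returned list can be handed to subprocess.run() once the path and credentials are real.&lt;br /&gt;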
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and first_date&amp;lt;'2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results over time, and $monthstr as a more meaningful X axis label. The condition year(first_date) &amp;gt; 1979 eliminates emails with bad Date headers, the mailing_list_url condition filters meego-commits out of the analysis, and the first_date cut-off (here 2011-05-01) excludes the incomplete current month. Grouping the messages by mailing list gives the following graph as the final result (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;br /&gt;
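&lt;br /&gt;
The reason for carrying both a sortable key and a label can be seen outside the database too. A minimal sketch, using invented sample rows rather than real query output: sorting on the month label gives alphabetical order, while sorting on the YYYYMM key gives chronological order. The names rows, by_label and by_key are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Sample rows (invented for illustration): a sortable YYYYMM key, then the label.
rows='201012 December 2010
201101 January 2011
201104 April 2011'

# Sort on the human-readable label: alphabetical, so April comes first.
by_label() { cut -d' ' -f2- | sort; }

# Sort on the numeric key, then drop it: chronological order.
by_key() { sort -n | cut -d' ' -f2-; }

echo "$rows" | by_label
echo "$rows" | by_key
```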
&lt;br /&gt;
== IRC - SuperSeriousStats ==&lt;br /&gt;
&lt;br /&gt;
[http://code.google.com/p/superseriousstats/ SuperSeriousStats] is a super serious IRC log analyser, developed by Jos de Ruiter. It is written in PHP.&lt;br /&gt;
&lt;br /&gt;
To use SuperSeriousStats, you need to install the php-cli command line interface. Once you have this, and the [https://github.com/tommyrot/superseriousstats/wiki/Setup-guide other dependencies for sss listed here], you will be able to run sss.&lt;br /&gt;
&lt;br /&gt;
Before running it for the first time, [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v4.sql initialise the database]. You will need to download all of the [http://mg.pov.lt/meego-irclog/index.html IRC logs] you wish to import first. Logs are expected in the Supybot format ([http://mg.pov.lt/meego-irclog/%23meego.2011-06-15.log example]).&lt;br /&gt;
&lt;br /&gt;
=== Getting the IRC logs ===&lt;br /&gt;
For the initial download, the following script should help get all the logs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for y in 2010 2011; do&lt;br /&gt;
 for m in `seq -f '%02g' 1 12`; do&lt;br /&gt;
  for d in `seq -f '%02g' 1 31`; do&lt;br /&gt;
   wget http://mg.pov.lt/meego-irclog/%23meego.$y-$m-$d.log;&lt;br /&gt;
  done;&lt;br /&gt;
 done;&lt;br /&gt;
done;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
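&lt;br /&gt;
The loop above also asks for days that do not exist (2011-02-31, for example); those requests simply fail. A possible variant, sketched here under the assumption that GNU date is available, only generates real calendar days. irc_log_urls is a made-up helper name, and the wget line is left commented out as one way to consume its output.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print one log URL per real calendar day between two dates (inclusive).
# Assumes GNU date (for -d); irc_log_urls is a hypothetical helper name.
irc_log_urls() {
    local day="$1" end="$2"
    # ISO dates (YYYY-MM-DD) sort lexically, so a string comparison is enough.
    while [[ ! "$day" > "$end" ]]; do
        echo "http://mg.pov.lt/meego-irclog/%23meego.${day}.log"
        day=$(date -u -d "$day + 1 day" +%Y-%m-%d)
    done
}

# Example: feed the URLs to wget, skipping files that are already present.
# irc_log_urls 2010-01-01 2011-12-31 | wget -nc -i -
```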
&lt;br /&gt;
Thereafter, you can add the following lines to your cron job to get the previous day's logs ($logdir is the directory where you keep the logs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   logdate=`date +%Y-%m-%d -d yesterday`&lt;br /&gt;
   wget -P $logdir http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Parsing the logs ===&lt;br /&gt;
&lt;br /&gt;
To run sss, you will need to modify a number of fields in sss.conf. Notably, I had to change the filename format to match the log file names, and set log format to supybot. I recommend creating a separate config file for each channel you want to parse.&lt;br /&gt;
&lt;br /&gt;
sss has a mechanism to prevent it from re-importing log files for the same channel, but you can store summary data from several channels in one database.&lt;br /&gt;
&lt;br /&gt;
To do the initial import, just run&lt;br /&gt;
 php sss.php -i /path/to/logs/&lt;br /&gt;
&lt;br /&gt;
And to do the nightly import, add this after the wget line above:&lt;br /&gt;
 php sss.php -c channel.conf -i ${logdir}/%23meego.${logdate}.log&lt;br /&gt;
&lt;br /&gt;
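The two nightly steps can be tied together in one script for cron. This is a sketch rather than a tested deployment: run, nightly_import and DRY_RUN are invented names, GNU date is assumed, and sss.php and channel.conf are expected in the current directory as in the command above.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# With DRY_RUN=1, print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Fetch yesterday's log into the given directory, then import it with sss.
nightly_import() {
    local logdir="$1"
    local logdate
    logdate=$(date +%Y-%m-%d -d yesterday)
    # Stop here if the download fails, so sss is not run on a missing file.
    run wget -P "$logdir" "http://mg.pov.lt/meego-irclog/%23meego.${logdate}.log" || return 1
    run php sss.php -c channel.conf -i "${logdir}/%23meego.${logdate}.log"
}

# Example: DRY_RUN=1 prints the two commands for inspection.
# nightly_import /path/to/logs
```
&lt;br /&gt;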
=== Queries ===&lt;br /&gt;
&lt;br /&gt;
A number of useful sss queries are already listed in [[Metrics/Dashboard#IRC | the Dashboard page]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-14T12:12:00Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
=== MLStats ===&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &lt;pre&gt;setup.py install --prefix=/install/path&lt;/pre&gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results chronologically, and $monthstr as a more readable X axis label. The year filter drops emails with bad Date headers (anything dated before 1980), meego-commits is excluded from the analysis, and the messages are grouped by mailing list and month. The final result is the following graph (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[Image:Mlgraph.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/File:Mlgraph.png</id>
		<title>File:Mlgraph.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/File:Mlgraph.png"/>
				<updated>2011-06-10T16:36:26Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Graph of mailing list messages per month (up to April 2011) on all MeeGo mailing lists except meego-commits&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Graph of mailing list messages per month (up to April 2011) on all MeeGo mailing lists except meego-commits&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-10T16:35:23Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Mailing list graph over time */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
=== MLStats ===&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &lt;pre&gt;setup.py install --prefix=/install/path&lt;/pre&gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results chronologically, and $monthstr as a more readable X axis label. The year filter drops emails with bad Date headers (anything dated before 1980), meego-commits is excluded from the analysis, and the messages are grouped by mailing list and month. The final result is the following graph (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[File:Mlgraph.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-10T16:35:01Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* MLStats */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
=== MLStats ===&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with &lt;pre&gt;setup.py install --prefix=/install/path&lt;/pre&gt; you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
 /path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
&lt;br /&gt;
The command line option &amp;lt;pre&amp;gt;--no-report&amp;lt;/pre&amp;gt; suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
 for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
             meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
             meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
             meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
             meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
             meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
 do&lt;br /&gt;
   /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
 done;&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
{{{&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results chronologically, and $monthstr as a more readable X axis label. The year filter drops emails with bad Date headers (anything dated before 1980), meego-commits is excluded from the analysis, and the messages are grouped by mailing list and month. The final result is the following graph (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[File:Mlgraph.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-10T16:33:57Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* MLStats */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
=== MLStats ===&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with {{{setup.py install --prefix=/install/path}}} you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
{{{&lt;br /&gt;
/path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
The command line option {{{--no-report}}} suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
{{{&lt;br /&gt;
for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
            meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
            meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
            meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
            meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
            meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
do&lt;br /&gt;
  /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
done;&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
{{{&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results chronologically, and $monthstr as a more readable X axis label. The year filter drops emails with bad Date headers (anything dated before 1980), meego-commits is excluded from the analysis, and the messages are grouped by mailing list and month. The final result is the following graph (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[File:Mlgraph.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-10T16:33:38Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Document mailing list graph.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
=== MLStats ===&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;br /&gt;
&lt;br /&gt;
Once you have downloaded and unpacked MLStats 0.4, applied the patch above, and installed it with {{{setup.py install --prefix=/install/path}}} you will need to &amp;quot;prime the pump&amp;quot; by downloading and importing the archives of all of the MeeGo mailing lists.&lt;br /&gt;
&lt;br /&gt;
The format of the mlstats command line is:&lt;br /&gt;
{{{&lt;br /&gt;
/path/to/mlstats --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/meego-announce/&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
The command line option {{--no-report}} suppresses the creation of a report after the import, useful for a cron job.&lt;br /&gt;
&lt;br /&gt;
You can list the archives for a number of mailing lists together on the command line. The list of MeeGo mailing lists is [http://lists.meego.com/mailman/listinfo here].&lt;br /&gt;
&lt;br /&gt;
{{{&lt;br /&gt;
for list in meego-adaptation-intel-automotive meego-announce meego-architecture \&lt;br /&gt;
            meego-commits meego-community meego-dev meego-distribution-tools \&lt;br /&gt;
            meego-events meego-handset meego-il10n meego-inputmethods meego-it \&lt;br /&gt;
            meego-ivi meego-kernel meego-packaging meego-pm meego-porting \&lt;br /&gt;
            meego-python meego-qa meego-releases meego-sdk meego-security \&lt;br /&gt;
            meego-security-discussion meego-touch-dev meego-tv;&lt;br /&gt;
do&lt;br /&gt;
  /path/to/mlstats --no-report --db-user=&amp;lt;username&amp;gt; --db-password=&amp;lt;password&amp;gt; http://lists.meego.com/pipermail/${list} &amp;gt;&amp;gt; /tmp/output.txt&lt;br /&gt;
done;&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
This should be run every night through cron.&lt;br /&gt;
&lt;br /&gt;
=== Mailing list graph over time ===&lt;br /&gt;
&lt;br /&gt;
The SQL query (yes, a Big Hairy Beast) which can be used to create a report graphing the messages posted to each list, month by month, is this:&lt;br /&gt;
{{{&lt;br /&gt;
SELECT&lt;br /&gt;
     `messages`.`mailing_list_url` AS list,&lt;br /&gt;
     year(first_date) AS y,&lt;br /&gt;
     monthname(first_date) AS mon,&lt;br /&gt;
     month(first_date) AS m,&lt;br /&gt;
     date_format(first_date, '%M %Y') as monthstr,&lt;br /&gt;
     date_format(first_date,'%Y%m') as monthnum, &lt;br /&gt;
     count(*) AS c&lt;br /&gt;
FROM&lt;br /&gt;
     `messages`&lt;br /&gt;
WHERE&lt;br /&gt;
     year(first_date) &amp;gt; 1979 and&lt;br /&gt;
     mailing_list_url not like '%meego-commits%' and&lt;br /&gt;
     first_date &amp;lt; '2011-05-01'&lt;br /&gt;
GROUP BY&lt;br /&gt;
     `messages`.`mailing_list_url`,&lt;br /&gt;
     y,m&lt;br /&gt;
ORDER BY&lt;br /&gt;
     monthnum ASC,&lt;br /&gt;
     list ASC,&lt;br /&gt;
     c ASC&lt;br /&gt;
}}}&lt;br /&gt;
&lt;br /&gt;
We use $monthnum to order the results chronologically, and $monthstr as a more readable X axis label. The year filter drops emails with bad Date headers (anything dated before 1980), meego-commits is excluded from the analysis, and the messages are grouped by mailing list and month. The final result is the following graph (how to create a report will follow later):&lt;br /&gt;
&lt;br /&gt;
[[File:Mlgraph.png]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Gathering_data</id>
		<title>Metrics/Gathering data</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Gathering_data"/>
				<updated>2011-06-10T10:33:35Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Created page with &amp;quot;For each of the services we gather data for, here's a guide to getting that data:  == Mailing lists ==  Mailman mailing lists can be downloaded, parsed and stored in a MySQL data...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For each of the services we gather data for, here's a guide to getting that data:&lt;br /&gt;
&lt;br /&gt;
== Mailing lists ==&lt;br /&gt;
&lt;br /&gt;
Mailman mailing lists can be downloaded, parsed and stored in a MySQL database using [https://projects.libresoft.es/projects/show/mlstats MLStats].&lt;br /&gt;
&lt;br /&gt;
The general idea is to point mlstats at the list archive page, and let it do the work of figuring out what to download.&lt;br /&gt;
&lt;br /&gt;
We are carrying a small local patch to mlstats to ensure that it re-downloads the current month's archives and reparses them. The patch [https://bugzilla.libresoft.es/attachment.cgi?id=137 has been submitted upstream].&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Installing_Pentaho</id>
		<title>Metrics/Installing Pentaho</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Installing_Pentaho"/>
				<updated>2011-06-09T11:41:42Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Installing Pentaho */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Installing Pentaho =&lt;br /&gt;
&lt;br /&gt;
This page is part of the instructions for replicating the [[../Dashboard | MeeGo Community Dashboard]] and helping us improve community metrics.&lt;br /&gt;
&lt;br /&gt;
== Installing Tomcat ==&lt;br /&gt;
&lt;br /&gt;
 apt-get install tomcat6 tomcat6-examples tomcat6-docs tomcat6-admin&lt;br /&gt;
&lt;br /&gt;
Add the manager and admin roles to &lt;code&gt;/etc/tomcat6/tomcat-users.xml&lt;/code&gt;, then add a user with a password to both roles for authentication.&lt;br /&gt;
&lt;br /&gt;
The Tomcat instance is on http://localhost:8080 by default; the examples and documentation webapps are at http://localhost:8080/examples/ and http://localhost:8080/docs/. You can check your Tomcat installation by using the examples webapp.&lt;br /&gt;
&lt;br /&gt;
Tomcat will be launched at start-up via &lt;code&gt;/etc/init.d/tomcat6&lt;/code&gt;.&lt;br /&gt;
&lt;br /&gt;
=== Overview ===&lt;br /&gt;
&lt;br /&gt;
Tomcat is an important common element of Pentaho and JasperReports, and it's worth documenting how it works.&lt;br /&gt;
&lt;br /&gt;
Tomcat is basically a web server which allows the server-side execution of Java code in one of a number of ways:&lt;br /&gt;
&lt;br /&gt;
# Servlets are Java objects which are instantiated on the server and which can conserve state across user sessions, and handle HTTP queries&lt;br /&gt;
# JSP pages allow Java code to be embedded directly in web pages (similar to PHP). JSPs are precompiled into servlets by the server.&lt;br /&gt;
# Taglibs allow the developer to define custom tags which are parsed on the server and run predefined methods of specific classes when they are used. As with the JSPs that contain them, custom tags are pre-compiled into code which instantiates and calls the classes that handle them.&lt;br /&gt;
&lt;br /&gt;
Tomcat thus requires a way to compile JSPs into Java code and then into .class files - this component of Tomcat is called Jasper. Tomcat has two other important components: the Coyote HTTP server, which parses HTTP traffic, and Catalina, its servlet container, which instantiates and stores classes in memory on the server.&lt;br /&gt;
&lt;br /&gt;
Recent versions of Tomcat no longer include the JSP Standard Tag Library (JSTL) by default. You will need to download and install [http://tomcat.apache.org/taglibs/standard/ JSTL from Apache] to make all of the examples work correctly - it is also required for Pentaho.&lt;br /&gt;
&lt;br /&gt;
=== Configuration ===&lt;br /&gt;
&lt;br /&gt;
Tomcat is typically started via an init script, &lt;code&gt;/etc/init.d/tomcat6&lt;/code&gt;, which sets the following environment variables:&lt;br /&gt;
; CATALINA_HOME&lt;br /&gt;
: the directory where the Tomcat binary, class files and any common JAR files are found&lt;br /&gt;
; CATALINA_BASE&lt;br /&gt;
: The directory where Catalina will generate class files and use as the base directory for any deployed webapps&lt;br /&gt;
; JAVA_HOME and JAVA_OPTS&lt;br /&gt;
: Environment variables for the system JRE (or JDK) install&lt;br /&gt;
&lt;br /&gt;
Finally, the init script will call catalina.sh in $CATALINA_HOME/bin to start the servlet engine.&lt;br /&gt;
&lt;br /&gt;
On starting Catalina, a number of config files will be read, in the following order, and in the following places:&lt;br /&gt;
&lt;br /&gt;
In $CATALINA_BASE/conf, the files catalina.policy, catalina.properties, logging.properties, context.xml, server.xml, tomcat-users.xml and web.xml are parsed:&lt;br /&gt;
&lt;br /&gt;
; catalina.policy&lt;br /&gt;
: defines the global security policy for the server&lt;br /&gt;
; catalina.properties&lt;br /&gt;
: server-wide configuration&lt;br /&gt;
; logging.properties&lt;br /&gt;
: defines various logging-related options&lt;br /&gt;
; context.xml&lt;br /&gt;
: Defines [http://www.mulesoft.com/tomcat-context Tomcat Context] information. A Context is a single web application running inside Tomcat. Contexts can be defined in a number of ways:&lt;br /&gt;
:* $CATALINA_BASE/conf/context.xml - contexts defined here have their elements loaded by all webapps&lt;br /&gt;
:* $CATALINA_BASE/conf/[enginename]/[hostname]/context.xml.default&lt;br /&gt;
:* XML files in $CATALINA_BASE/conf/[enginename]/[hostname] - each XML file defines a new context path - examples.xml defines the context path &amp;quot;/examples&amp;quot;&lt;br /&gt;
:* /META-INF/context.xml files inside .war files - these files *should* get copied to $CATALINA_BASE/conf/[enginename]/[hostname]&lt;br /&gt;
:* Finally a Context element inside a Host element in conf/server.xml - since other methods provide a means for simple per-webapp contexts, this method should be avoided.&lt;br /&gt;
; server.xml&lt;br /&gt;
: [http://www.mulesoft.com/tomcat-configuration The global configuration file] - defines hosts, contexts, and more. Analogous to httpd.conf for Apache.&lt;br /&gt;
; tomcat-users.xml&lt;br /&gt;
: Contains information about users, passwords and trusted Realms on the server&lt;br /&gt;
; web.xml&lt;br /&gt;
: Options and values which will be shared by all webapps. Each webapp also has a context-specific config file in WEB-INF/web.xml&lt;br /&gt;
&lt;br /&gt;
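To see which of these files a given installation actually has, a small check can help. This is only a sketch: check_tomcat_conf is a made-up helper name, the file list uses Tomcat's context.xml, and the fallback path assumes the Ubuntu tomcat6 layout where CATALINA_BASE is /var/lib/tomcat6.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Report which of the Tomcat configuration files exist in a conf directory.
# check_tomcat_conf is a hypothetical helper; the default path is an assumption.
check_tomcat_conf() {
    local confdir="${1:-${CATALINA_BASE:-/var/lib/tomcat6}/conf}"
    local f
    for f in catalina.policy catalina.properties logging.properties \
             context.xml server.xml tomcat-users.xml web.xml; do
        if [ -f "$confdir/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}

# Example: check the default location, or pass a directory explicitly.
# check_tomcat_conf /etc/tomcat6
```
&lt;br /&gt;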
== Installing Pentaho on Tomcat ==&lt;br /&gt;
&lt;br /&gt;
The best guide I found for this is here: https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en&lt;br /&gt;
&lt;br /&gt;
# Install Sun's Java and Tomcat, and get them working. On Ubuntu, this will set CATALINA_HOME to /usr/share/tomcat6 and CATALINA_BASE to /var/lib/tomcat6, and ${CATALINA_BASE}/conf is a symlink to /etc/tomcat6.&lt;br /&gt;
# Download and unpack pentaho-ce-3.7.0.stable.tar.gz from [http://sourceforge.net/projects/pentaho/ sourceforge]&lt;br /&gt;
# Move biserver-ce and administration-console to /opt/pentaho (or /var/lib/pentaho, wherever you want to install it - for the rest of the document, we'll refer to this as PENTAHO_HOME)&lt;br /&gt;
# Copy (if it's not already there) mysql-connector-java-5.0.7.jar from biserver-ce/tomcat/common/lib to $CATALINA_HOME/lib&lt;br /&gt;
# Copy the pentaho, pentaho-styles and sw-styles webapps from biserver-ce/tomcat/webapps/ to $CATALINA_BASE/webapps&lt;br /&gt;
# Move biserver-ce/pentaho-solutions to $PENTAHO_HOME&lt;br /&gt;
# Create the required Pentaho databases (quartz, hibernate and sample_datasource) complete with database users, by downloading and executing in order [https://docs.google.com/leaf?id=0B9Jmocc0fj_EN2MyZjc4ZjEtNzFkNC00NzIzLTljZTctZjIzZWQ1NjU3MzJk&amp;amp;hl=en these SQL scripts]. Apparently the sample data delivered with pentaho-ce is lacking in some way, but I don't understand how. You should change the passwords for hibuser and pentaho_user in these scripts to something better than 'password' before running them.&lt;br /&gt;
&lt;br /&gt;
Once you have the databases, you need to configure them as data sources for the Pentaho web application. This has several aspects:&lt;br /&gt;
# Setting the JDBC security to use a MySQL database instead of the packaged Hypersonic database (I'll admit, I really don't understand this) - [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en this document] has details for changes needed&lt;br /&gt;
## applicationContext-spring-security-jdbc.xml in pentaho-solutions/system/: change driver to mysql &amp;amp; password for hibuser&lt;br /&gt;
## applicationContext-spring-security-hibernate.properties in pentaho-solutions/system/: change driver to mysql &amp;amp; password for hibuser&lt;br /&gt;
## hibernate-settings.xml in pentaho-solutions/system/hibernate: change config file to point to MySQL settings&lt;br /&gt;
## mysql5.hibernate.cfg.xml in pentaho-solutions/system/hibernate: change password for hibuser&lt;br /&gt;
# Configure the hibernate and quartz databases for use with the webapp&lt;br /&gt;
## Update the context.xml file which should be present (if you used the biserver-ce package, not the -manual one) in $CATALINA_BASE/webapps/pentaho/META-INF/ to use the MySQL JDBC driver (and set the password correctly for hibuser and pentaho_user)&lt;br /&gt;
# Configure the sampledata database for use as a data source: In the hibernate database, update the DATASOURCE table to set DRIVERCLASS, URL and QUERY to the appropriate values when the NAME is SampleData. If you change the password for the pentaho_user user from 'password', you will need to do the same for the SampleData datasource, which will require you to launch the administration panel. To do this, you first need to configure the panel to set the path to the pentaho-solutions directory and the pentaho webapp in $PENTAHO_HOME/administration-console/resource/config/, then run start_pac.sh in $PENTAHO_HOME/administration-console and connect to http://localhost:8099 (username/password is 'admin'/'password' by default)&lt;br /&gt;
&lt;br /&gt;
Finally, we need to configure the Pentaho application for the local set-up:&lt;br /&gt;
# In $CATALINA_BASE/webapps/pentaho/WEB-INF/web.xml: Set the solutions-path parameter to $PENTAHO_HOME/pentaho-solutions&lt;br /&gt;
# Set fully-qualified-server-url to the name of the server, port and path for the webapp (eg. http://localhost:8080/pentaho) - if you want people to be able to use Pentaho remotely, this should be an IP address or hostname, rather than localhost (since it's the URL used internally to find resources)&lt;br /&gt;
# Disable startup of Hypersonic by commenting out the relevant sections (hsqldb-databases and the appropriate listener)&lt;br /&gt;
# Add trusted connecting hosts to TrustedIpAddrs (I used a netmask: 192.168.1.0/255.255.255.0)&lt;br /&gt;
&lt;br /&gt;
Finally, ensure that you have set permissions correctly for the Tomcat user on $PENTAHO_HOME/pentaho-solutions and $CATALINA_BASE/webapps/pentaho - the webapp needs write access there.&lt;br /&gt;
&lt;br /&gt;
Now restart Tomcat, and go to http://localhost:8080/pentaho to see if it's working.&lt;br /&gt;
&lt;br /&gt;
=== Configuring email ===&lt;br /&gt;
&lt;br /&gt;
To send email from reports, configure SMTP in pentaho-solutions/system/smtp-email/email_config.xml. If you use Gmail's SMTP server, there is a sample config file, email_config_gmail.xml, which you can copy over it; all you need to set is your username, the email address which will appear in the &amp;quot;From:&amp;quot; header, and your password.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Installing_Pentaho</id>
		<title>Metrics/Installing Pentaho</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Installing_Pentaho"/>
				<updated>2011-06-08T10:12:47Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: Document installation of Pentaho for metrics&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Installing Pentaho =&lt;br /&gt;
&lt;br /&gt;
== Installing Tomcat ==&lt;br /&gt;
&lt;br /&gt;
 apt-get install tomcat6 tomcat6-examples tomcat6-docs tomcat6-admin&lt;br /&gt;
&lt;br /&gt;
Add the manager and admin roles to &amp;lt;code&amp;gt;/etc/tomcat6/tomcat-users.xml&amp;lt;/code&amp;gt;, then add a user with both roles and a password for authentication.&lt;br /&gt;
&lt;br /&gt;
The Tomcat instance is at http://localhost:8080 by default; the examples and documentation webapps are at http://localhost:8080/examples/ and http://localhost:8080/docs/. You can check your Tomcat installation by using the examples webapp.&lt;br /&gt;
&lt;br /&gt;
Tomcat will be launched at start-up via &amp;lt;code&amp;gt;/etc/init.d/tomcat6&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Overview ===&lt;br /&gt;
&lt;br /&gt;
Tomcat is an important common element of Pentaho and JasperReports, and it's worth documenting how it works.&lt;br /&gt;
&lt;br /&gt;
Tomcat is basically a web server which allows the server-side execution of Java code in one of a number of ways:&lt;br /&gt;
&lt;br /&gt;
# Servlets are Java objects which are instantiated on the server and which can conserve state across user sessions, and handle HTTP queries&lt;br /&gt;
# JSP pages allow Java code to be embedded directly in web pages (similar to PHP). JSPs are precompiled into servlets by the server.&lt;br /&gt;
# Taglibs allow the developer to define custom tags which will be parsed on the server and will run predefined methods of specific classes when they are used. Similar to JSPs, custom tags are pre-compiled in JSPs to instantiate and call the classes which handle them.&lt;br /&gt;
&lt;br /&gt;
Tomcat thus requires a way to compile JSPs into Java code and then into .class files - this component of Tomcat is called Jasper. Tomcat has two other important components: the Coyote HTTP server, which parses HTTP traffic, and Catalina, its servlet container, which instantiates and stores classes in memory on the server.&lt;br /&gt;
&lt;br /&gt;
Recent versions of Tomcat no longer include the JSP Standard Tag Library (JSTL) by default. You will need to download and install [http://tomcat.apache.org/taglibs/standard/ JSTL from Apache] to make all of the examples work correctly - it is also required for Pentaho.&lt;br /&gt;
&lt;br /&gt;
=== Configuration ===&lt;br /&gt;
&lt;br /&gt;
Tomcat is typically started via an init script, &amp;lt;code&amp;gt;/etc/init.d/tomcat6&amp;lt;/code&amp;gt;, which sets the following environment variables:&lt;br /&gt;
; CATALINA_HOME&lt;br /&gt;
: the directory where the Tomcat binary, class files and any common JAR files are found&lt;br /&gt;
; CATALINA_BASE&lt;br /&gt;
: The directory where Catalina generates class files, and which it uses as the base directory for any deployed webapps&lt;br /&gt;
; JAVA_HOME and JAVA_OPTS&lt;br /&gt;
: Environment variables for the system JRE (or JDK) install&lt;br /&gt;
&lt;br /&gt;
Finally, the init script will call catalina.sh in $CATALINA_HOME/bin to start the servlet engine.&lt;br /&gt;
&lt;br /&gt;
On starting Catalina, a number of config files will be read, in the following order, and in the following places:&lt;br /&gt;
&lt;br /&gt;
In $CATALINA_BASE/conf, the files catalina.policy, catalina.properties, logging.properties, context.xml, server.xml, tomcat-users.xml and web.xml are parsed:&lt;br /&gt;
&lt;br /&gt;
; catalina.policy&lt;br /&gt;
: defines the global security policy for the server&lt;br /&gt;
; catalina.properties&lt;br /&gt;
: server-wide configuration&lt;br /&gt;
; logging.properties&lt;br /&gt;
: defines various logging-related options&lt;br /&gt;
; context.xml&lt;br /&gt;
: Defines [http://www.mulesoft.com/tomcat-context Tomcat Context] information. A Context is a single web application running inside Tomcat. A Context can be defined in a number of ways:&lt;br /&gt;
:* $CATALINA_BASE/conf/context.xml - contexts defined here have their elements loaded by all webapps&lt;br /&gt;
:* $CATALINA_BASE/conf/[enginename]/[hostname]/context.xml.default&lt;br /&gt;
:* XML files in $CATALINA_BASE/conf/[enginename]/[hostname] - each XML file defines a new context path - examples.xml defines the context path &amp;quot;/examples&amp;quot;&lt;br /&gt;
:* /META-INF/context.xml files inside .war files - these files *should* get copied to $CATALINA_BASE/conf/[enginename]/[hostname]&lt;br /&gt;
:* Finally a Context element inside a Host element in conf/server.xml - since other methods provide a means for simple per-webapp contexts, this method should be avoided.&lt;br /&gt;
; server.xml&lt;br /&gt;
: [http://www.mulesoft.com/tomcat-configuration The global configuration file] - defines hosts, contexts, and more. Analogous to httpd.conf for Apache.&lt;br /&gt;
; tomcat-users.xml&lt;br /&gt;
: Contains information about users, passwords and trusted Realms on the server&lt;br /&gt;
; web.xml&lt;br /&gt;
: Options and values which will be shared by all webapps. Each webapp also has a context-specific config file in WEB-INF/web.xml&lt;br /&gt;
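&lt;br /&gt;
As an illustration of the naming rule for per-host descriptor files described above (examples.xml defines the context path /examples), here is a minimal Python sketch. The helper name context_path_for is hypothetical, the engine and host names in the sample path are assumed to be the usual defaults, and only the simple rule from the text is covered, not every naming rule Tomcat applies.&lt;br /&gt;
&lt;br /&gt;
```python
from pathlib import PurePosixPath


def context_path_for(descriptor):
    """Return the context path implied by a per-host descriptor file name.

    Hypothetical helper: only the simple rule from the text is covered
    (examples.xml gives /examples).
    """
    path = PurePosixPath(descriptor)
    if path.suffix != ".xml":
        # e.g. context.xml.default is not a per-webapp descriptor
        raise ValueError("not an XML context descriptor: " + descriptor)
    return "/" + path.stem


if __name__ == "__main__":
    # Engine (Catalina) and host (localhost) names are assumptions.
    print(context_path_for("/var/lib/tomcat6/conf/Catalina/localhost/examples.xml"))
```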
&lt;br /&gt;
== Installing Pentaho on Tomcat ==&lt;br /&gt;
&lt;br /&gt;
The best guide I found for this is here: https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en&lt;br /&gt;
&lt;br /&gt;
# Install Sun's Java and Tomcat, and get them working. On Ubuntu, this will set CATALINA_HOME to /usr/share/tomcat6 and CATALINA_BASE to /var/lib/tomcat6, and ${CATALINA_BASE}/conf is a symlink to /etc/tomcat6.&lt;br /&gt;
# Download and unpack pentaho-ce-3.7.0.stable.tar.gz from [http://sourceforge.net/projects/pentaho/ sourceforge]&lt;br /&gt;
# Move biserver-ce and administration-console to /opt/pentaho (or /var/lib/pentaho, wherever you want to install it - for the rest of the document, we'll refer to this as PENTAHO_HOME)&lt;br /&gt;
# Copy (if it's not already there) mysql-connector-java-5.0.7.jar from biserver-ce/tomcat/common/lib to $CATALINA_HOME/lib&lt;br /&gt;
# Copy the pentaho, pentaho-styles and sw-styles webapps from biserver-ce/tomcat/webapps/ to $CATALINA_BASE/webapps&lt;br /&gt;
# Move biserver-ce/pentaho-solutions to $PENTAHO_HOME&lt;br /&gt;
# Create the required Pentaho databases (quartz, hibernate and sample_datasource) complete with database users, by downloading and executing in order [https://docs.google.com/leaf?id=0B9Jmocc0fj_EN2MyZjc4ZjEtNzFkNC00NzIzLTljZTctZjIzZWQ1NjU3MzJk&amp;amp;hl=en these SQL scripts]. Apparently the sample data delivered with pentaho-ce is lacking in some way, but I don't understand how. You should change the passwords for hibuser and pentaho_user in these scripts to something better than 'password' before running them.&lt;br /&gt;
&lt;br /&gt;
Once you have the databases, you need to configure them as data sources for the Pentaho web application. This has several aspects:&lt;br /&gt;
# Setting the JDBC security to use a MySQL database instead of the packaged Hypersonic database (I'll admit, I really don't understand this) - [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en this document] has details for changes needed&lt;br /&gt;
## applicationContext-spring-security-jdbc.xml in pentaho-solutions/system/: change driver to mysql &amp;amp; password for hibuser&lt;br /&gt;
## applicationContext-spring-security-hibernate.properties in pentaho-solutions/system/: change driver to mysql &amp;amp; password for hibuser&lt;br /&gt;
## hibernate-settings.xml in pentaho-solutions/system/hibernate: change config file to point to MySQL settings&lt;br /&gt;
## mysql5.hibernate.cfg.xml in pentaho-solutions/system/hibernate: change password for hibuser&lt;br /&gt;
# Configure the hibernate and quartz databases for use with the webapp&lt;br /&gt;
## Update the context.xml file which should be present (if you used the biserver-ce package, not the -manual one) in $CATALINA_BASE/webapps/pentaho/META-INF/ to use the MySQL JDBC driver (and set the password correctly for hibuser and pentaho_user)&lt;br /&gt;
# Configure the sampledata database for use as a data source: In the hibernate database, update the DATASOURCE table to set DRIVERCLASS, URL and QUERY to the appropriate values when the NAME is SampleData. If you change the password for the pentaho_user user from 'password', you will need to do the same for the SampleData datasource, which will require you to launch the administration panel. To do this, you first need to configure the panel to set the path to the pentaho-solutions directory and the pentaho webapp in $PENTAHO_HOME/administration-console/resource/config/, then run start_pac.sh in $PENTAHO_HOME/administration-console and connect to http://localhost:8099 (username/password is 'admin'/'password' by default)&lt;br /&gt;
&lt;br /&gt;
Finally, we need to configure the Pentaho application for the local set-up:&lt;br /&gt;
# In $CATALINA_BASE/webapps/pentaho/WEB-INF/web.xml: Set the solutions-path parameter to $PENTAHO_HOME/pentaho-solutions&lt;br /&gt;
# Set fully-qualified-server-url to the name of the server, port and path for the webapp (eg. http://localhost:8080/pentaho) - if you want people to be able to use Pentaho remotely, this should be an IP address or hostname, rather than localhost (since it's the URL used internally to find resources)&lt;br /&gt;
# Disable startup of Hypersonic by commenting out the relevant sections (hsqldb-databases and the appropriate listener)&lt;br /&gt;
# Add trusted connecting hosts to TrustedIpAddrs (I used a netmask: 192.168.1.0/255.255.255.0)&lt;br /&gt;
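&lt;br /&gt;
To check which client addresses the netmask quoted above actually covers, a small Python sketch using the standard ipaddress module can help. It only illustrates the address range; how Pentaho itself evaluates TrustedIpAddrs is not shown here, and the name is_trusted is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
import ipaddress

# The netmask quoted in the text; ipaddress accepts address/netmask notation.
TRUSTED = ipaddress.ip_network("192.168.1.0/255.255.255.0")


def is_trusted(address):
    """Return True if address falls inside the TRUSTED network."""
    return ipaddress.ip_address(address) in TRUSTED


if __name__ == "__main__":
    for candidate in ("192.168.1.42", "192.168.2.1"):
        print(candidate, is_trusted(candidate))
```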
&lt;br /&gt;
Finally, ensure that you have set permissions correctly for the Tomcat user on $PENTAHO_HOME/pentaho-solutions and $CATALINA_BASE/webapps/pentaho - the webapp needs write access there.&lt;br /&gt;
&lt;br /&gt;
Now restart Tomcat, and go to http://localhost:8080/pentaho to see if it's working.&lt;br /&gt;
&lt;br /&gt;
=== Configuring email ===&lt;br /&gt;
&lt;br /&gt;
To send email from reports, configure SMTP in pentaho-solutions/system/smtp-email/email_config.xml. If you use Gmail's SMTP server, there is a sample config file, email_config_gmail.xml, which you can copy over it; all you need to set is your username, the email address which will appear in the &amp;quot;From:&amp;quot; header, and your password.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/Metrics/Dashboard</id>
		<title>Metrics/Dashboard</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/Metrics/Dashboard"/>
				<updated>2011-06-08T10:03:56Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Architecture */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Community Metrics Dashboard ==&lt;br /&gt;
&lt;br /&gt;
The goal is to provide a web page summarising metrics about various aspects of the MeeGo project. The data should update regularly - depending on the metric, that could be in real time or automatically at regular intervals.&lt;br /&gt;
&lt;br /&gt;
The dashboard will track the following community resources, ideally: &lt;br /&gt;
* Drupal members&lt;br /&gt;
* Bugzilla (bugs opened, bugs closed, active users)&lt;br /&gt;
* Mailing lists (members, posts, threads)&lt;br /&gt;
* gitorious (commits, employer details for committers) - should use Jon Corbet's scripts, as used for the LF yearly kernel data.&lt;br /&gt;
* Wiki (edits, new pages)&lt;br /&gt;
* Forums (members, posts)&lt;br /&gt;
* IRC (total comments, people on channel)&lt;br /&gt;
* Transifex (Languages, translators, strings translated)&lt;br /&gt;
* Community OBS (uploads, users)&lt;br /&gt;
* SDK downloads (potentially extrapolated from meego.com)&lt;br /&gt;
&lt;br /&gt;
The data should also be available for custom reports for usage and analysis in the monthly MeeGo Metrics report published by [[User:DawnFoster]]&lt;br /&gt;
&lt;br /&gt;
To fulfill these goals, the dashboard will gather data from the various resources into a centralised database, using some sort of Business Intelligence platform including ETL for data acquisition and storage, and a reporting service for generating reports and dashboards. A web page will provide a view into this database with predefined reports.&lt;br /&gt;
&lt;br /&gt;
Candidate reporting solutions:&lt;br /&gt;
&lt;br /&gt;
* [http://jasperforge.org/index.php?q=project/jasperreports JasperReports]&lt;br /&gt;
* [http://www.pentaho.com/ Pentaho]&lt;br /&gt;
&lt;br /&gt;
The following are essentially ETL engines, and do not provide reporting or dashboard functionality:&lt;br /&gt;
&lt;br /&gt;
* [http://www.talend.com/index.php Talend]&lt;br /&gt;
* [http://petals.ow2.org/ Petals]&lt;br /&gt;
&lt;br /&gt;
[http://www.mulesoft.com/ MuleSoft] is an open source ESB, but does not seem suited to our needs. The field is thus narrowed to Pentaho and JasperReports.&lt;br /&gt;
&lt;br /&gt;
For each community resource, we need to figure out how to get the data into a usable form, and come up with appropriate queries for metrics reports, and finally present the results on a webpage.&lt;br /&gt;
&lt;br /&gt;
=== Business intelligence engines ===&lt;br /&gt;
&lt;br /&gt;
The area of Business Intelligence is littered with acronyms. Here's a quick overview of the main ones, and how they all fit together.&lt;br /&gt;
&lt;br /&gt;
; BI&lt;br /&gt;
: Business Intelligence - general name for any middleware which allows you to query business processes (sales, inventory, etc) and get data overviews from it&lt;br /&gt;
; ETL&lt;br /&gt;
: Extract, Transform, and Load - the process of extracting data from a data source (database, screen scraping, text file parsing, whatever), transforming it to a well understood format, and loading it in your BI engine database or data warehouse. Good ETL solutions provide a nice way for you to connect another database and have new data sucked in at regular intervals, define views into the source data store which you can then query within your BI engine, etc. Pentaho's ETL, [http://kettle.pentaho.com/ Kettle], and [http://www.jaspersoft.com/jasperetl JasperETL], used by JasperReports, both provide (kind of) straightforward ways to hook into a MySQL database.&lt;br /&gt;
; ESB&lt;br /&gt;
: [http://en.wikipedia.org/wiki/Enterprise_service_bus Enterprise Service Bus] - a middleware bus providing a unique interface to applications on the front-end and data stores on the back end. Often used to link up many front-end applications (eg. library, student registration, employee payroll, syllabus management, accounting, supply-chain, student lodgement programmes, etc in a university). Not really useful for us, as far as I can tell.&lt;br /&gt;
; EAI&lt;br /&gt;
: Enterprise Application Integration - using software to integrate different applications together. As far as I can tell, this is a meaningless catch-all phrase for anything from kludges to architected business intelligence solutions.&lt;br /&gt;
; DW&lt;br /&gt;
: Data Warehouse. Basically the same thing as a database, as far as I can tell, but bigger and more impressive sounding.&lt;br /&gt;
; OLAP&lt;br /&gt;
: On-Line Analytical Processing. Commonly used acronym for extracting data via multi-dimensional queries. Databases can be configured to provide the results of this kind of query. As far as I can tell this is mostly a buzzword - an &amp;quot;OLAP database&amp;quot; like [http://mondrian.pentaho.com/ Mondrian] is basically the same thing as a database. &amp;quot;speed-of-thought&amp;quot; response times indeed.&lt;br /&gt;
; Business reporting&lt;br /&gt;
: An application which allows a graphical view of a database, and allows you to construct queries interactively, often using drag &amp;amp; drop. The results of these queries can then be plugged into graphing software for presentation in a dashboard.&lt;br /&gt;
; Dashboard&lt;br /&gt;
: Organised presentation of information in a web-page or other similar format allowing an at-a-glance overview of the situation for the data being measured.&lt;br /&gt;
&lt;br /&gt;
So, in short, the community dashboard project will likely use an ETL to plug data into an OLAP server, and then use a business reporting engine to query that data and present it in a dashboard.&lt;br /&gt;
&lt;br /&gt;
=== Comparison of candidate ETL/reporting ===&lt;br /&gt;
&lt;br /&gt;
Modules available:&lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
! Software !! License !! ETL !! OLAP database !! BI server !! Reporting !! Dashboard module&lt;br /&gt;
|-&lt;br /&gt;
| Pentaho || EPL || [http://kettle.pentaho.com/ Kettle] || [http://mondrian.pentaho.com/ Mondrian] || [http://community.pentaho.com/projects/bi_platform/ Pentaho BI Platform] || [http://reporting.pentaho.com/ Pentaho Reporting] || [http://wiki.pentaho.com/display/COM/Community+Dashboard+Framework Community Dashboard Framework]&lt;br /&gt;
|-&lt;br /&gt;
| [http://www.jaspersoft.com/editions Jaspersoft] || AGPL v3 || JasperETL (Talend Open Studio) || JasperOLAP || JasperReports Server || iReports editor || No (commercial only)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Pentaho is used as the basis of Mozilla's metrics project, and provides a very strong community software option for both the dashboard and for managing the BI server. Since Mozilla metrics work overlaps what we are trying to achieve, particularly their work on SQR, the Software Quality Reports analytic module for Bugzilla and JIRA, Pentaho is my preference for the dashboard project. In general, I have observed that the Pentaho community provides very good support.&lt;br /&gt;
&lt;br /&gt;
== Architecture ==&lt;br /&gt;
&lt;br /&gt;
Pentaho runs as a webapp in Tomcat6. It can use a variety of databases for its internal data structures; the default (Hypersonic) is a Java database. However, because MySQL is both standard &amp;amp; well understood, and to allow consolidation of databases under one DB server, I prefer to use MySQL. The configuration of Pentaho with a MySQL database is a little tricky, but almost all of the steps are covered well [https://docs.google.com/Doc?docid=0AdJmocc0fj_EZDJ3YmZiZF83M2RtaHhwcmRk&amp;amp;hl=en in this tutorial].&lt;br /&gt;
&lt;br /&gt;
The data which is useful for metrics will be copied into a local database from each of the services we query. The copying of data will be accomplished by a set of Kettle &amp;quot;xactions&amp;quot;, which can be created and edited easily with the Spoon tool.&lt;br /&gt;
&lt;br /&gt;
A number of reports will be generated using the Pentaho Report Designer, including a static HTML/Flash dashboard which will be published regularly. Other reports can be created for the community managers, and a more advanced dashboard, allowing detailed analysis of basic metrics, can be provided via the Community Dashboard Framework.&lt;br /&gt;
&lt;br /&gt;
We will need to see how much load the dashboard will generate on the server. I suspect that it will not be practical to expose the dashboard in public.&lt;br /&gt;
&lt;br /&gt;
We will document here everything you need to do to replicate the MeeGo Community Dashboard, with the exception of data which is not publicly available because it contains security related or confidential information (mainly bugzilla).&lt;br /&gt;
&lt;br /&gt;
* [[../Installing Pentaho]]: A guide to getting Pentaho up and running for MeeGo Dashboard&lt;br /&gt;
* [[../Gathering data]]: A guide to installing and using the tools to gather data for a dashboard&lt;br /&gt;
&lt;br /&gt;
=== Extracting data ===&lt;br /&gt;
&lt;br /&gt;
For SQL databases, this implies that the server where the dashboard will run should have access to the database server for MediaWiki, Bugzilla, and Drupal.&lt;br /&gt;
&lt;br /&gt;
For the forum, we will integrate the CSV files currently being exported, which provide the basic analytics we need.&lt;br /&gt;
&lt;br /&gt;
Individual mailing lists will be parsed by [http://forge.morfeo-project.org/projects/libresoft-tools/ MLStats]. We will use the resulting database directly in the dashboard.&lt;br /&gt;
&lt;br /&gt;
Git repositories will be queried with &amp;quot;git log&amp;quot;, and parsed with the parser module from [http://lwn.net/Articles/290957/ gitdm], before being stored directly in a database. We will be able to run analytics on the results from there. gitdm can also do basic analytics of git logs, and we may decide to simply reuse gitdm's analytics. However, if we want to extend them, we will want to have the raw data.&lt;br /&gt;
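&lt;br /&gt;
To give an idea of the kind of raw data involved, here is a minimal Python sketch that counts commits per author email domain from git log output. It is not gitdm's parser: the function name commits_by_domain and the one-line-per-commit format (hash, author email and date separated by a pipe character) are assumptions made for illustration, and an email domain is only a rough stand-in for an employer.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def commits_by_domain(log_text):
    """Count commits per author email domain.

    Expects one commit per line as hash|email|date, for example the output of
    git log '--pretty=format:%H|%ae|%ad' --date=short
    (this line format is an assumption of the sketch, not gitdm's input).
    """
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _commit, email, _date = line.split("|")
        # Everything after the last @ is taken as the domain.
        counts[email.rpartition("@")[2].lower()] += 1
    return counts


if __name__ == "__main__":
    sample = "a1|alice@example.com|2011-05-01\nb2|bob@example.org|2011-05-02\n"
    print(commits_by_domain(sample).most_common())
```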
&lt;br /&gt;
IRC logs will be parsed with [http://code.google.com/p/superseriousstats/ superseriousstats], a PHP command line tool that parses IRC logs and stores the results in an SQL database.&lt;br /&gt;
&lt;br /&gt;
We still need to figure out how to do data interchange with Transifex and OBS. Dimitris tells me that there are already [http://meego.transifex.net/stats/ some analytics] available on Transifex, and that there is a RESTful API available to query this data.&lt;br /&gt;
&lt;br /&gt;
== Data to report ==&lt;br /&gt;
&lt;br /&gt;
For each of the resources, the following statistics (at a minimum) should be extracted:&lt;br /&gt;
&lt;br /&gt;
=== Drupal ===&lt;br /&gt;
* Members of meego.com (+ evolution month over month)&lt;br /&gt;
* Active members (need a decent way to hook up different ways a person can be active: wiki, ML, IRC, forum, git)&lt;br /&gt;
&lt;br /&gt;
=== Mailing lists ===&lt;br /&gt;
* Subscriber numbers (+ evolution) - from Mailman directly, not available in mlstats&lt;br /&gt;
* Emails sent (+ evolution) - from mlstats&lt;br /&gt;
* Active participants (individuals with &amp;gt;=2 emails during month) - from mlstats&lt;br /&gt;
* Hot threads - from mlstats&lt;br /&gt;
* Top posters - from mlstats&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* Count posts of most popular threads:&lt;br /&gt;
*: &amp;lt;code&amp;gt;select subject,year(first_date) as y, monthname(first_date),count(*) as c from messages group by subject, month(first_date) order by y, month(first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Count the number of posts for each person&lt;br /&gt;
*: &amp;lt;code&amp;gt;select p.email_address,year(m.first_date) as y, monthname(m.first_date),count(*) as c from messages as m,messages_people as p where m.message_id=p.message_ID group by p.email_address, month(m.first_date) order by y, month(m.first_date), c;&amp;lt;/code&amp;gt;&lt;br /&gt;
* Restricting queries to a date range / month&lt;br /&gt;
*: The queries above give month-by-month totals, but they group only by month(first_date), so the same month in different years is merged; add year(first_date) to the group by clause to keep the years separate.&lt;br /&gt;
*: The easiest way to restrict a query to one month is to add &amp;lt;code&amp;gt;where month(first_date)=3 and year(first_date)=2010&amp;lt;/code&amp;gt; for March 2010. For the current month, &amp;lt;code&amp;gt;month(first_date)=month(NOW()) and year(first_date)=year(NOW())&amp;lt;/code&amp;gt; works.&lt;br /&gt;
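&lt;br /&gt;
One way to experiment with this kind of grouping without a MySQL server is to run an equivalent query against an in-memory SQLite database from Python. The sketch below makes several assumptions: SQLite stands in for MySQL, strftime replaces year() and month(), and the messages table is reduced to the two columns used in the first query above. It groups by year as well as month, so that March 2010 and March 2011 are counted separately.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3


def posts_per_thread_month(rows):
    """Count posts per (subject, year, month) from (subject, first_date) rows."""
    db = sqlite3.connect(":memory:")
    # Reduced stand-in for the mlstats messages table (an assumption).
    db.execute("CREATE TABLE messages (subject TEXT, first_date TEXT)")
    db.executemany("INSERT INTO messages VALUES (?, ?)", rows)
    query = """
        SELECT subject,
               strftime('%Y', first_date) AS y,
               strftime('%m', first_date) AS m,
               COUNT(*) AS c
        FROM messages
        GROUP BY subject, y, m
        ORDER BY y, m, c
    """
    result = db.execute(query).fetchall()
    db.close()
    return result


if __name__ == "__main__":
    sample = [
        ("Release plan", "2010-03-02 10:00:00"),
        ("Release plan", "2010-03-05 09:30:00"),
        ("Hello", "2010-03-09 18:45:00"),
    ]
    for row in posts_per_thread_month(sample):
        print(row)
```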
&lt;br /&gt;
=== Forum ===&lt;br /&gt;
* Posts per month (+evolution)&lt;br /&gt;
* Active posters (2+ posts during month)&lt;br /&gt;
* Hot topics&lt;br /&gt;
* Top posters&lt;br /&gt;
&lt;br /&gt;
[http://forum.meego.com/stats/ Stats] are exported from the Forum in CSV format monthly.&lt;br /&gt;
&lt;br /&gt;
=== Bugzilla ===&lt;br /&gt;
* Bugs created (+ evolution)&lt;br /&gt;
* Bugs resolved (+ evolution)&lt;br /&gt;
* Comments on bugs this month&lt;br /&gt;
* Active Bugzilla contributors (2+ comments during month)&lt;br /&gt;
&lt;br /&gt;
==== Useful queries ====&lt;br /&gt;
&lt;br /&gt;
* [http://sourceforge.net/projects/qareports/ Software Quality Reports for Pentaho] of course&lt;br /&gt;
* See the [http://bugs.meego.com/query.cgi?format=report-table bugzilla tabular reports] and relative dates for more. &lt;br /&gt;
* Other potential resources:&lt;br /&gt;
** [http://www.ravenbrook.com/project/p4dti/tool/cgi/bugzilla-schema/ Bugzilla database schema by version]&lt;br /&gt;
** [https://landfill.bugzilla.org/bugzilla-tip/report.cgi Demo site for standard Bugzilla reports]&lt;br /&gt;
** [https://wiki.mozilla.org/Bugzilla:SQLCookBook The Bugzilla SQL cookbook]&lt;br /&gt;
** [http://www.bugzillametrics.org/ Bugzilla Metrics] - Third party add-on to Bugzilla that produces lots of metrics&lt;br /&gt;
** browse.cgi and weekly-summary.html extensions of GNOME Bugzilla: [https://launchpad.net/bugzilla.gnome.org Code], examples in practice: [http://bugzilla.gnome.org/browse.cgi?product=Evolution browse.cgi], [https://bugzilla.gnome.org/page.cgi?id=weekly-bug-summary.html weekly-summary.cgi]&lt;br /&gt;
&lt;br /&gt;
=== Mediawiki ===&lt;br /&gt;
* New wiki pages (+ evolution)&lt;br /&gt;
* Edits this month (+ evolution)&lt;br /&gt;
* Pages deleted this month (+ evolution)&lt;br /&gt;
* Unique editors this month (+ evolution)&lt;br /&gt;
&lt;br /&gt;
==== Queries ====&lt;br /&gt;
&lt;br /&gt;
A [http://www.mediawiki.org/wiki/Extension:SpecialUserScore MediaWiki extension] exists to provide &amp;quot;user scores&amp;quot; for MediaWiki users, ordered by number of edits and number of pages changed. The guts of the query is:&lt;br /&gt;
&lt;br /&gt;
 SELECT COUNT(wr.rev_id) as value,&lt;br /&gt;
        COUNT(DISTINCT wr.rev_page) as page_value,&lt;br /&gt;
        wu.user_name as name,&lt;br /&gt;
        wu.user_real_name as real_name&lt;br /&gt;
 FROM   $user wu,&lt;br /&gt;
        $revision wr,&lt;br /&gt;
        $page wp&lt;br /&gt;
 WHERE  wu.user_id = wr.rev_user&lt;br /&gt;
    and wp.page_id = wr.rev_page&lt;br /&gt;
    and wp.page_namespace = 0&lt;br /&gt;
 GROUP BY wu.user_name&lt;br /&gt;
 ORDER BY value desc;&lt;br /&gt;
&lt;br /&gt;
where $user, $revision and $page are the names of the respective MediaWiki tables (MediaWiki tables have a prefix associated with them for a given instance, specified by $wgDBprefix in LocalSettings.php).&lt;br /&gt;
&lt;br /&gt;
For the following group-by-month queries, I did a cross join of (2008,2009,2010,2011) and (01-12) to generate a &amp;quot;year and month&amp;quot; data table.&lt;br /&gt;
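&lt;br /&gt;
The cross join just described can be sketched in Python as follows. The column names timestamp_year and timestamp_month match those used by the group-by-month queries in this section; the function names are hypothetical. The second function mimics the LIKE prefix match those queries rely on, given that MediaWiki stores rev_timestamp as a 14-digit YYYYMMDDHHMMSS string.&lt;br /&gt;
&lt;br /&gt;
```python
from itertools import product

YEARS = ("2008", "2009", "2010", "2011")
MONTHS = tuple("%02d" % m for m in range(1, 13))  # "01" .. "12"


def years_months():
    """Return every (timestamp_year, timestamp_month) pair as strings."""
    return list(product(YEARS, MONTHS))


def matching_month(rev_timestamp):
    """Return the (year, month) pair that prefixes rev_timestamp, or None.

    Plays the role of the LIKE concat(year, month, '%') condition.
    """
    for year, month in years_months():
        if rev_timestamp.startswith(year + month):
            return (year, month)
    return None


if __name__ == "__main__":
    print(len(years_months()))
    print(matching_month("20100315120000"))
```

The rows returned by years_months() could be loaded into the years_months table with a plain INSERT per pair.&lt;br /&gt;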
&lt;br /&gt;
'''Top editors by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year AS yyyy,&lt;br /&gt;
        mon.timestamp_month AS mm,&lt;br /&gt;
        rev_user_text AS user,&lt;br /&gt;
        COUNT(*) AS c&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months AS mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm,user&lt;br /&gt;
 HAVING c&amp;gt;5&lt;br /&gt;
 ORDER BY yyyy,mm,c desc;&lt;br /&gt;
&lt;br /&gt;
'''Number of edits by month:'''&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*) AS edits&lt;br /&gt;
 FROM $revision AS rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE concat(concat(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
'''New pages per month:'''&lt;br /&gt;
Getting the number of new pages per month is a bit trickier: first we need to query $revision to get the page_ids and their dates of creation, then group by date. The query is O(n²) in the number of pages, although it should be possible to make it O(n) by grouping the result of the subquery rather than using in() on the list of timestamps.&lt;br /&gt;
&lt;br /&gt;
 SELECT mon.timestamp_year as yyyy,&lt;br /&gt;
        mon.timestamp_month as mm,&lt;br /&gt;
        COUNT(*)&lt;br /&gt;
 FROM mw_revision as rev,&lt;br /&gt;
      years_months as mon&lt;br /&gt;
 WHERE rev.rev_timestamp LIKE CONCAT(CONCAT(mon.timestamp_year,mon.timestamp_month),'%')&lt;br /&gt;
   AND rev.rev_timestamp in (&lt;br /&gt;
                SELECT MIN(rev_timestamp)&lt;br /&gt;
                FROM mw_revision&lt;br /&gt;
                GROUP BY rev_page)&lt;br /&gt;
 GROUP BY yyyy,mm;&lt;br /&gt;
&lt;br /&gt;
To get just the list of pages &amp;amp; timestamps (this is used as the subquery above):&lt;br /&gt;
 SELECT rev_page as p,&lt;br /&gt;
        MIN(rev_timestamp) as t&lt;br /&gt;
 FROM mw_revision&lt;br /&gt;
 GROUP BY rev_page;&lt;br /&gt;
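&lt;br /&gt;
The O(n) variant mentioned above can be sketched by grouping the per-page first-revision timestamps directly, which avoids both the join on years_months and the in() test. This is only a sketch in Python with sqlite3 and has not been run against the MeeGo wiki database; the table name mw_revision and its columns follow the queries above, while the function name and the sample rows are made up for illustration. Months with no new pages are simply absent from the result:&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

NEW_PAGES_PER_MONTH = '''
SELECT substr(first_ts, 1, 4) AS yyyy,
       substr(first_ts, 5, 2) AS mm,
       COUNT(*) AS new_pages
FROM (SELECT rev_page, MIN(rev_timestamp) AS first_ts
      FROM mw_revision
      GROUP BY rev_page) AS firsts
GROUP BY yyyy, mm
ORDER BY yyyy, mm
'''


def new_pages_per_month(conn):
    """Return (yyyy, mm, new_pages) rows, one per month that saw a page created."""
    return conn.execute(NEW_PAGES_PER_MONTH).fetchall()


# Made-up sample data: three pages, timestamps in MediaWiki YYYYMMDDHHMMSS form.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE mw_revision '
             '(rev_id INTEGER, rev_page INTEGER, rev_timestamp TEXT)')
conn.executemany('INSERT INTO mw_revision VALUES (?, ?, ?)', [
    (1, 10, '20110105120000'),  # page 10 created in 2011-01
    (2, 10, '20110210120000'),  # later edit of page 10, not a new page
    (3, 11, '20110120090000'),  # page 11 created in 2011-01
    (4, 12, '20110301080000'),  # page 12 created in 2011-03
])
result = new_pages_per_month(conn)
print(result)
```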
&lt;br /&gt;
=== Transifex ===&lt;br /&gt;
* Total languages ordered by translation coverage (+ evolution)&lt;br /&gt;
* Top languages&lt;br /&gt;
* Top translators/teams&lt;br /&gt;
&lt;br /&gt;
This all depends on what is available from Transifex.&lt;br /&gt;
&lt;br /&gt;
=== Git ===&lt;br /&gt;
* Commits this month (+ evolution)&lt;br /&gt;
* Top committers (+ evolution)&lt;br /&gt;
* Committers by company (possible)&lt;br /&gt;
* Active modules (+ evolution)&lt;br /&gt;
&lt;br /&gt;
This uses a modified version of gitdm to dump Git logs into a MySQL database for analysis. Modifications required:&lt;br /&gt;
* Create database &amp;amp; tables based on the gitdm data structures&lt;br /&gt;
* Dump data in correct order, avoiding redundancy if possible, into SQL database&lt;br /&gt;
&lt;br /&gt;
gitdm has 3 basic data structures: Hacker, Employer &amp;amp; Patch. Each changeset is a Patch object; each Patch has an author (a Hacker) and is assigned to an Employer (based on who the Hacker was working for at the time of the Patch). Each Patch also has a list of Hackers who reviewed, reported, signed off on and tested the patch. Each Hacker links to a list of Patches for which they are the author, a list of email addresses they have used to commit, and separate lists for reviewed, reported, SOB and tested. In addition, each Hacker has a list of Employers they have worked for, and each Employer has a list of Hackers who have worked for them.&lt;br /&gt;
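&lt;br /&gt;
One possible way to map these structures onto SQL tables, sketched in Python with sqlite3 so it can be run without a MySQL server. Every table and column name here is hypothetical and is not taken from gitdm itself; the reviewed, reported, signed-off and tested lists are folded into a single patch_tag table with a kind column, and the employers a hacker has worked for go in hacker_employer:&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

SCHEMA = '''
CREATE TABLE employer (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE hacker (id INTEGER PRIMARY KEY, name TEXT);
-- A hacker may have committed under several email addresses.
CREATE TABLE hacker_email (hacker_id INTEGER REFERENCES hacker(id),
                           email TEXT UNIQUE);
-- Many-to-many: who has worked for whom.
CREATE TABLE hacker_employer (hacker_id INTEGER REFERENCES hacker(id),
                              employer_id INTEGER REFERENCES employer(id));
-- One row per changeset, with its author and the employer at the time.
CREATE TABLE patch (id INTEGER PRIMARY KEY, commit_id TEXT UNIQUE,
                    author_id INTEGER REFERENCES hacker(id),
                    employer_id INTEGER REFERENCES employer(id),
                    commit_date TEXT);
-- kind is one of: reviewed, reported, signed-off, tested.
CREATE TABLE patch_tag (patch_id INTEGER REFERENCES patch(id),
                        hacker_id INTEGER REFERENCES hacker(id), kind TEXT);
'''

conn = sqlite3.connect(':memory:')
conn.executescript(SCHEMA)

# Made-up rows, only to show the tables in use.
conn.execute('INSERT INTO employer VALUES (1, ?)', ('Example Corp',))
conn.execute('INSERT INTO hacker VALUES (1, ?)', ('A. Hacker',))
conn.execute('INSERT INTO hacker VALUES (2, ?)', ('B. Reviewer',))
conn.execute('INSERT INTO hacker_employer VALUES (1, 1)')
conn.execute('INSERT INTO patch VALUES (1, ?, 1, 1, ?)', ('abc123', '2011-02-01'))
conn.execute('INSERT INTO patch_tag VALUES (1, 2, ?)', ('reviewed',))

# Commits per employer, one of the metrics listed above.
commits_by_employer = conn.execute(
    'SELECT e.name, COUNT(*) FROM patch p '
    'JOIN employer e ON e.id = p.employer_id '
    'GROUP BY e.name').fetchall()
print(commits_by_employer)
```

The UNIQUE constraint on commit_id is one way to avoid loading the same changeset twice when logs are re-imported.&lt;br /&gt;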
&lt;br /&gt;
=== IRC ===&lt;br /&gt;
&lt;br /&gt;
superseriousstats does some preliminary analysis on data it stores in its database. Its author (tommyrot) has kindly added a parser for the format of the IRC logs we use (supybot) at my request. The [https://github.com/tommyrot/superseriousstats/blob/master/empty_database_v3.sql database schema] is a little hard to work out; several key tables have fields with undescriptive names like l_01. There are some queries in [https://github.com/tommyrot/superseriousstats/blob/master/html.class.php html.class.php] which we can use to generate some reports, though.&lt;br /&gt;
 &lt;br /&gt;
* Total IRC activity (by hour)&lt;br /&gt;
 select sum(`l_00`) as `l_00`, sum(`l_01`) as `l_01`, sum(`l_02`) as `l_02`,&lt;br /&gt;
        sum(`l_03`) as `l_03`, sum(`l_04`) as `l_04`, sum(`l_05`) as `l_05`,&lt;br /&gt;
        sum(`l_06`) as `l_06`, sum(`l_07`) as `l_07`, sum(`l_08`) as `l_08`,&lt;br /&gt;
        sum(`l_09`) as `l_09`, sum(`l_10`) as `l_10`, sum(`l_11`) as `l_11`,&lt;br /&gt;
        sum(`l_12`) as `l_12`, sum(`l_13`) as `l_13`, sum(`l_14`) as `l_14`,&lt;br /&gt;
        sum(`l_15`) as `l_15`, sum(`l_16`) as `l_16`, sum(`l_17`) as `l_17`,&lt;br /&gt;
        sum(`l_18`) as `l_18`, sum(`l_19`) as `l_19`, sum(`l_20`) as `l_20`,&lt;br /&gt;
        sum(`l_21`) as `l_21`, sum(`l_22`) as `l_22`, sum(`l_23`) as `l_23`&lt;br /&gt;
   from `channel`&lt;br /&gt;
* Total active participants (+ evolution) - we may be able to get &amp;quot;number of participants per hour/day/month&amp;quot; (so you can see if it's 2 guys talking amongst themselves or a larger group) - I'll ask tommyrot what the query should look like.&lt;br /&gt;
* Top contributors (per month)&lt;br /&gt;
 select `q_lines`.`ruid`, `csnick`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_total`) as `l_total`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_night`) as `l_night`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_morning`) as `l_morning`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_afternoon`) as `l_afternoon`,&lt;br /&gt;
        sum(`q_activity_by_month`.`l_evening`) as `l_evening`,&lt;br /&gt;
        `quote` from `q_lines`&lt;br /&gt;
    join `q_activity_by_month` on `q_lines`.`ruid` = `q_activity_by_month`.`ruid`&lt;br /&gt;
    join `user_status` on `q_lines`.`ruid` = `user_status`.`uid`&lt;br /&gt;
    join `user_details` on `q_lines`.`ruid` = `user_details`.`uid`&lt;br /&gt;
    where `status` != 3&lt;br /&gt;
      and `date` = '2011-02'&lt;br /&gt;
    group by `q_lines`.`ruid`&lt;br /&gt;
    order by `q_activity_by_month`.`l_total` desc, `q_lines`.`ruid` asc limit 30&lt;br /&gt;
&lt;br /&gt;
=== Community &amp;amp; Official OBS ===&lt;br /&gt;
* Package submissions (+ evolution)&lt;br /&gt;
* Active participants (+ evolution)&lt;br /&gt;
&lt;br /&gt;
== Not yet in scope ==&lt;br /&gt;
&lt;br /&gt;
I have not yet considered how I might get web analytics and download stats.&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/MeeGo_Conference_Spring_2011</id>
		<title>MeeGo Conference Spring 2011</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/MeeGo_Conference_Spring_2011"/>
				<updated>2011-05-11T07:44:11Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Introduction to Qt */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The main Conference site is: [http://sf2011.meego.com/ San Francisco MeeGo Conference 2011]&lt;br /&gt;
&lt;br /&gt;
== Logistics ==&lt;br /&gt;
May 23 - 25: San Francisco Regency Hyatt&lt;br /&gt;
* Annual industry event&lt;br /&gt;
* Announcements / Information focus&lt;br /&gt;
* Broader industry audience&lt;br /&gt;
* Call for Proposals and Registration will open early February&lt;br /&gt;
&lt;br /&gt;
== Organizing Committee ==&lt;br /&gt;
[[File:Organizing committee.pdf]]&lt;br /&gt;
&lt;br /&gt;
'''Team Lead: Amy Leeland'''&lt;br /&gt;
&lt;br /&gt;
Advisory and Community Interface:&lt;br /&gt;
* Dawn Foster - Community Management&lt;br /&gt;
* Quim Gil - Marketing Management&lt;br /&gt;
&lt;br /&gt;
Program Committee: &lt;br /&gt;
* Lead: Dirk Hohndel&lt;br /&gt;
* Carsten Munk&lt;br /&gt;
* Thiago Macieira&lt;br /&gt;
* Ashley Walker (Speaker Management)&lt;br /&gt;
&lt;br /&gt;
Coordination:&lt;br /&gt;
* Lead: Brian Warner (Sponsorships)&lt;br /&gt;
* AJ Reed: Post Event management&lt;br /&gt;
* Mike Shaver: Web Management&lt;br /&gt;
* Dave Neary: Early Bird Events&lt;br /&gt;
&lt;br /&gt;
== Call for proposals ==&lt;br /&gt;
&lt;br /&gt;
* [http://sf2011.meego.com/program/call-session-proposals CfP]&lt;br /&gt;
* [[MeeGo Conference Spring 2011/CfP Drafting|Draft CfP]]&lt;br /&gt;
&lt;br /&gt;
== Hyatt Space details ==&lt;br /&gt;
&lt;br /&gt;
Link to conference area map: [http://sanfranciscoregency.hyatt.com/hyatt/images/hotels/sfors/floorplan.pdf]&lt;br /&gt;
&lt;br /&gt;
'''2nd Floor ''Street Level'' entrance'''&lt;br /&gt;
*Grand Ballroom A+B+C will be used as our Keynote room with redundant Screens/Sound in Market Street Foyer and Grand Ballroom Foyer (Seats 1300, with 100 standing in lobby spaces and feeds coming out)&lt;br /&gt;
Breakouts and room descriptions as follows:&lt;br /&gt;
*1 Grand Ballroom A  (700 theater)&lt;br /&gt;
*2 Grand Ballroom B (200 cab)&lt;br /&gt;
*3 Grand Ballroom C  (200 cab)&lt;br /&gt;
*Tech Showcase will take place in Market Street Foyer and Grand Ballroom Foyer. Not to be just single table tops, requested 3 different orientations to choose from. Tea and Coffee will be available in Tech showcase area&lt;br /&gt;
*Registration + Welcome Kit fulfillment: Market Street Foyer and Grand Ballroom Foyer. Kits will be organized based upon gender/size, not name&lt;br /&gt;
*Media/Blogger Room: Regency A &lt;br /&gt;
*Keynote Prep Room: Regency B &lt;br /&gt;
*Event Office: Plaza Room &lt;br /&gt;
&lt;br /&gt;
'''3rd floor ''Bay Level'''''&lt;br /&gt;
*4 Seacliff A-B (110 cab) &lt;br /&gt;
*5 Seacliff C-D (110 cab)&lt;br /&gt;
*6 Bayview Room A-B (200 cab)&lt;br /&gt;
*Platinum Meeting Rooms: Marina Room and Golden Gate room on Bay Level &lt;br /&gt;
&lt;br /&gt;
'''4th floor ''Atrium Lobby Level'''''&lt;br /&gt;
*All catering (breakfast, lunch, breaks and receptions) will be held in the Atrium of the hotel.&lt;br /&gt;
This includes Garden Room A+B, 13 Views Lounge and the Hospitality Room; this area can seat up to 1400.&lt;br /&gt;
*Nokia Boardroom: Boardroom A [atrium] &lt;br /&gt;
*Intel Boardroom: Boardroom B [atrium] &lt;br /&gt;
&lt;br /&gt;
'''1st Floor ''Pacific Concourse Level'''''&lt;br /&gt;
*This will all be used as Early Bird space partitioned rooms A-O (17,000 sq ft all partitionable) Saturday until Sunday evening then it will be………&lt;br /&gt;
*Hacker Lounge: Pacific Concourse K—O &lt;br /&gt;
*Meeting rooms partitioned in Pacific Concourse: A, B, C, D, E &lt;br /&gt;
*Gold sponsors get A and B&lt;br /&gt;
*Spaces F and G will be used as a meeting room bookable space&lt;br /&gt;
&lt;br /&gt;
== Attendee information ==&lt;br /&gt;
&lt;br /&gt;
For flight arrival &amp;amp; departure times, please see [[MeeGo Conference Spring 2011/Flight Information]] - feel free to add your own arrival &amp;amp; departure times, contact people to share taxis or meet up at the airport.&lt;br /&gt;
&lt;br /&gt;
For sponsored attendees sharing rooms, please choose your sharing partner on the [[/Accommodation | accommodation page]], or we will assign the person sharing a room with you.&lt;br /&gt;
&lt;br /&gt;
== MeeGo Conference Warm-Up ==&lt;br /&gt;
&lt;br /&gt;
The MeeGo Conference Warm-Up will include workshops catering to MeeGo application developers, and tutorials to help people get started developing for the platform. We also plan to have some fun extra-curricular activities involving building things. The warm-up sessions will be held at the Hyatt in San Francisco (the conference hotel/ conference facility) on May 21-22, the weekend before the conference begins.&lt;br /&gt;
&lt;br /&gt;
We will have two parallel tutorial tracks, on a range of topics related to MeeGo, and one workshop track where people can get together and work on programming problems or brainstorm application designs.&lt;br /&gt;
&lt;br /&gt;
And we plan some fun activities throughout the weekend, including a Maker's Contest where contestants will have to use raw materials including balsa wood, paper clips, string, paper and glue to build a machine capable of launching a projectile (Trebuchet, slingshot, catapult, whatever) over a distance of ~10m. The results will be judged with a live-action Angry Birds contest. And every evening we'll be running Werewolf sessions.&lt;br /&gt;
&lt;br /&gt;
=== Schedule ===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Date/Time&lt;br /&gt;
! Tutorial 1&lt;br /&gt;
! Tutorial 2&lt;br /&gt;
! Workshop&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 09:00 - 12:30&lt;br /&gt;
| [[#Introduction_to_Qt | Introduction to Qt ]]&lt;br /&gt;
| [[#Harmattan_for_developers | Harmattan for developers]]&lt;br /&gt;
| [[#UX_workshop | UX workshop ]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 14:00 - 17:30&lt;br /&gt;
| [[#Introduction_to_Qt | Introduction to Qt]]&lt;br /&gt;
| [[#Community OBS | Community OBS &amp;amp; Software Distribution]]&lt;br /&gt;
| [[#UX_workshop | UX workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 18:00 - 19:30&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[#Makers_contest | Siege weapon building, live action Angry Birds]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 21:00 - late&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[/Werewolf | Mini Werewolf]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 09:00 - 12:30&lt;br /&gt;
| [[#Introduction_to_MeeGo_SDK | Introduction to MeeGo SDK]] ([http://appdeveloper.intel.com/events registration required])&lt;br /&gt;
| [[#Linux_developer_tools | Linux developer tools]]&lt;br /&gt;
| [[#Qt_development_workshop | Qt development workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 14:00 - 17:30&lt;br /&gt;
| [[#Introduction_to_MeeGo_SDK | Introduction to MeeGo SDK]] ([http://appdeveloper.intel.com/events registration required])&lt;br /&gt;
| [[#MeeGo_localisation | MeeGo localisation]]&lt;br /&gt;
| [[#Qt_development_workshop | Qt development workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 20:00 - late&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[/Werewolf | Mass Werewolf, MeeGo Conference 2011 version]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Introduction to Qt ===&lt;br /&gt;
&lt;br /&gt;
Training course offered by Roland Krause from ICS and MeeGo community member Thomas Perl.&lt;br /&gt;
&lt;br /&gt;
Topics covered include:&lt;br /&gt;
* Setting up the MeeGo SDK&lt;br /&gt;
* Installing MeeGo on the Device&lt;br /&gt;
* Installing and Configuring Qt SDK&lt;br /&gt;
* Configuring Qt Creator&lt;br /&gt;
* Intro to Qt Creator&lt;br /&gt;
* Understanding QMake&lt;br /&gt;
* Building your First Application&lt;br /&gt;
* Introduction to PySide&lt;br /&gt;
* Introduction to QML&lt;br /&gt;
**   Elements&lt;br /&gt;
**   Properties and Types&lt;br /&gt;
**   Signals&lt;br /&gt;
**   Anchoring and Positioning&lt;br /&gt;
**   States&lt;br /&gt;
**   Transitions&lt;br /&gt;
**   Simple Animations&lt;br /&gt;
&lt;br /&gt;
=== Harmattan for developers ===&lt;br /&gt;
Presented by Daniel Wilms&lt;br /&gt;
&lt;br /&gt;
... description pending ...&lt;br /&gt;
&lt;br /&gt;
=== Community OBS ===&lt;br /&gt;
David Greaves, Niels Breet and Henri Bergius will take you through the basics of using Community OBS, uploading your project, getting it built and packaged automatically, fixing any build issues, and making the software available on the community software downloads site.&lt;br /&gt;
&lt;br /&gt;
This will be part tutorial, part hands-on workshop, part BOF.&lt;br /&gt;
 &lt;br /&gt;
=== UX workshop ===&lt;br /&gt;
&lt;br /&gt;
Developers are invited to turn up and present their applications to other attendees and our UX experts, and together identify areas of improvement and possible designs for presenting the same functionality to the user.&lt;br /&gt;
&lt;br /&gt;
=== Makers contest ===&lt;br /&gt;
&lt;br /&gt;
10 teams of 3 to 4 people will have one hour to build the best siege weapon possible for a live action Angry Birds round which will decide the winner. Trebuchet, catapult or slingshot, the weapon must be able to send an Angry Bird bean-bag into a pre-arranged structure containing evil snorting pigs. The winner will be determined by our impartial judges and comperes, Dave Neary, Alison Chaiken &amp;amp; Julien Fourgeaud.&lt;br /&gt;
&lt;br /&gt;
Equipment available to teams will include: various balsa wood cuts, glue guns and glue sticks, twine, paper clips, plain brown paper and elastic bands. There may be some other surprise materials thrown in on the day, if we're feeling generous.&lt;br /&gt;
&lt;br /&gt;
=== Introduction to MeeGo SDK ===&lt;br /&gt;
&lt;br /&gt;
'''Important: You need to [http://appdeveloper.intel.com/events register in advance] if you want to attend this session'''&lt;br /&gt;
&lt;br /&gt;
The Intel AppUpSM Application Lab: MeeGo series will be in San Francisco for the MeeGo Conference Warm Up!  Register now to Meet Bob Spencer from the MeeGo SDK team and members of the Intel AppUpSM developer program team to learn how to create and deploy MeeGo applications using the MeeGo SDK and the Intel AppUpTM SDK for MeeGo.  Discover how to create exciting user experiences with MeeGo* and the Intel AppUpSM developer program.  MeeGo promotes innovation and portability across multiple device types, such as tablets, netbooks and smartphones.  Developing for MeeGo presents a great opportunity to make money and deploy your applications quickly and easily. You don't have to be attending the MeeGo Conference to attend this event! &lt;br /&gt;
&lt;br /&gt;
Join us at The Hyatt Regency San Francisco Embarcadero Centre on Sunday, May 22, 2011 for one of our FREE training sessions to learn how to develop applications for MeeGo and the benefits of the Intel AppUpSM developer program. Two sessions are available to choose from: &lt;br /&gt;
&lt;br /&gt;
Sunday, May 22: 9:00am - 12:30pm&lt;br /&gt;
Sunday, May 22: 2:00pm - 5:30pm &lt;br /&gt;
&lt;br /&gt;
General Agenda:&lt;br /&gt;
* Doors open 30 minutes before each session start for check-in (check-in closes 10 minutes after start)&lt;br /&gt;
* Session Content &lt;br /&gt;
** Overview of the Intel AppUpSM center &amp;amp; Intel AppUpSM developer program&lt;br /&gt;
** Introduction to the MeeGo SDK&lt;br /&gt;
** Introduction to the Intel AppUpTM SDK Suite for MeeGo &lt;br /&gt;
** Application packaging and submission &lt;br /&gt;
* Talk to Intel engineers about your specific code &amp;amp; questions with an information Q&amp;amp;A session&lt;br /&gt;
&lt;br /&gt;
Seating is limited so [http://appdeveloper.intel.com/events register today] at http://appdeveloper.intel.com/events&lt;br /&gt;
&lt;br /&gt;
=== Linux developer tools ===&lt;br /&gt;
&lt;br /&gt;
An overview of common Linux developer tools, including git, gdb and valgrind, by timeless.&lt;br /&gt;
&lt;br /&gt;
'''Using MXR'''&lt;br /&gt;
&lt;br /&gt;
* For Triagers -- when you get a crash trace&lt;br /&gt;
** Using identifier searches to walk through a stack trace&lt;br /&gt;
** When you get a bug report in a foreign language, using text searches to work from the foreign report to the codebase's native language&lt;br /&gt;
* For Architects -- when you want to understand the ramifications of changing an API&lt;br /&gt;
** Using text searches to find&lt;br /&gt;
** Using identifier searches to&lt;br /&gt;
* For Linguists&lt;br /&gt;
** Using filtered text searches to get more awareness of context&lt;br /&gt;
* For themers&lt;br /&gt;
** When you see an image in the ui and need to find its name&lt;br /&gt;
* Aiding MXR&lt;br /&gt;
** Fields packagers can use to provide directory descriptions&lt;br /&gt;
&lt;br /&gt;
=== MeeGo localisation ===&lt;br /&gt;
&lt;br /&gt;
Dimitris Glezos and Margie Foster will walk you through the localisation process for MeeGo:&lt;br /&gt;
* Extracting translatable strings from an application&lt;br /&gt;
* Translating strings locally using Linguist&lt;br /&gt;
* Uploading strings to Transifex and translating them there&lt;br /&gt;
* Retrieving a translation package from Transifex and integrating it into your application&lt;br /&gt;
* Compiling and testing your translated application&lt;br /&gt;
&lt;br /&gt;
=== Qt development workshop ===&lt;br /&gt;
&lt;br /&gt;
A programming problem will be presented, and attendees will have a hands-on programming lab, with amateurs and experts on hand to help you out when you get into trouble. If you'd like to help out as an assistant during this training session, please add your name here:&lt;br /&gt;
&lt;br /&gt;
'''Qt/QML/UI experts'''&lt;br /&gt;
&lt;br /&gt;
* Thomas Perl&lt;br /&gt;
* Sampo Savola&lt;br /&gt;
&lt;br /&gt;
== Hacker Lounge ==&lt;br /&gt;
The Hacker Lounge was a very popular activity for community members. Having it stocked with beer and snacks was perfect, and we'd love to do something similar for spring.&lt;br /&gt;
&lt;br /&gt;
== Ideas for fun activities ==&lt;br /&gt;
* [http://www.bahiker.com/northbayhikes/stinson.html Hike in Marin County on Mt. Tamalpais] just across the Golden Gate Bridge and get great views of San Francisco (when it's not foggy).   Potentially led by [[User:Alison| Alison Chaiken]].   Would require transportation to the trailhead from the hotel, perhaps a 45-minute drive.&lt;br /&gt;
* [http://www.blazingsaddles.com/store/?catid=7  Cycle across the Golden Gate Bridge], potentially guided by [[User:Alison| Alison Chaiken]]. The route is pancake flat but is on a regular city street with low-speed traffic.   [http://www.blazingsaddles.com/maps-and-rides/san-francisco-self-guided-tours.aspx  Riding across the Bridge] does require some caution on foggy days due to a wet surface.   (Foggy days can occur any time of year.)   I have inquired about group discounts.   The bike rental is walkable from the hotel.&lt;br /&gt;
* The [[MeeGo_Conference_2010/Werewolf]] was popular, and we're scheming on some additional [[MeeGo_Conference_Spring_2011/Werewolf]] MeeGo Werewolf variations for 2011.&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/MeeGo_Conference_Spring_2011</id>
		<title>MeeGo Conference Spring 2011</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/MeeGo_Conference_Spring_2011"/>
				<updated>2011-05-09T08:09:28Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: /* Flight Information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The main Conference site is: [http://sf2011.meego.com/ San Francisco MeeGo Conference 2011]&lt;br /&gt;
&lt;br /&gt;
== Logistics ==&lt;br /&gt;
May 23 - 25: San Francisco Regency Hyatt&lt;br /&gt;
* Annual industry event&lt;br /&gt;
* Announcements / Information focus&lt;br /&gt;
* Broader industry audience&lt;br /&gt;
* Call for Proposals and Registration will open early February&lt;br /&gt;
&lt;br /&gt;
== Organizing Committee ==&lt;br /&gt;
[[File:Organizing committee.pdf]]&lt;br /&gt;
&lt;br /&gt;
'''Team Lead: Amy Leeland'''&lt;br /&gt;
&lt;br /&gt;
Advisory and Community Interface:&lt;br /&gt;
* Dawn Foster - Community Management&lt;br /&gt;
* Quim Gil - Marketing Management&lt;br /&gt;
&lt;br /&gt;
Program Committee: &lt;br /&gt;
* Lead: Dirk Hohndel&lt;br /&gt;
* Carsten Munk&lt;br /&gt;
* Thiago Macieira&lt;br /&gt;
* Ashley Walker (Speaker Management)&lt;br /&gt;
&lt;br /&gt;
Coordination:&lt;br /&gt;
* Lead: Brian Warner (Sponsorships)&lt;br /&gt;
* AJ Reed: Post Event management&lt;br /&gt;
* Mike Shaver: Web Management&lt;br /&gt;
* Dave Neary: Early Bird Events&lt;br /&gt;
&lt;br /&gt;
== Call for proposals ==&lt;br /&gt;
&lt;br /&gt;
* [http://sf2011.meego.com/program/call-session-proposals CfP]&lt;br /&gt;
* [[MeeGo Conference Spring 2011/CfP Drafting|Draft CfP]]&lt;br /&gt;
&lt;br /&gt;
== Hyatt Space details ==&lt;br /&gt;
&lt;br /&gt;
Link to conference area map: [http://sanfranciscoregency.hyatt.com/hyatt/images/hotels/sfors/floorplan.pdf]&lt;br /&gt;
&lt;br /&gt;
'''2nd Floor ''Street Level'' entrance'''&lt;br /&gt;
*Grand Ballroom A+B+C will be used as our Keynote room with redundant Screens/Sound in Market Street Foyer and Grand Ballroom Foyer (Seats 1300, with 100 standing in lobby spaces and feeds coming out)&lt;br /&gt;
Breakouts and room descriptions as follows:&lt;br /&gt;
*1 Grand Ballroom A  (700 theater)&lt;br /&gt;
*2 Grand Ballroom B (200 cab)&lt;br /&gt;
*3 Grand Ballroom C  (200 cab)&lt;br /&gt;
*Tech Showcase will take place in Market Street Foyer and Grand Ballroom Foyer. Not to be just single table tops, requested 3 different orientations to choose from. Tea and Coffee will be available in Tech showcase area&lt;br /&gt;
*Registration + Welcome Kit fulfillment: Market Street Foyer and Grand Ballroom Foyer. Kits will be organized based upon gender/size, not name&lt;br /&gt;
*Media/Blogger Room: Regency A &lt;br /&gt;
*Keynote Prep Room: Regency B &lt;br /&gt;
*Event Office: Plaza Room &lt;br /&gt;
&lt;br /&gt;
'''3rd floor ''Bay Level'''''&lt;br /&gt;
*4 Seacliff A-B (110 cab) &lt;br /&gt;
*5 Seacliff C-D (110 cab)&lt;br /&gt;
*6 Bayview Room A-B (200 cab)&lt;br /&gt;
*Platinum Meeting Rooms: Marina Room and Golden Gate room on Bay Level &lt;br /&gt;
&lt;br /&gt;
'''4th floor ''Atrium Lobby Level'''''&lt;br /&gt;
*All catering (breakfast, lunch, breaks and receptions) will be held in the Atrium of the hotel.&lt;br /&gt;
This includes Garden Room A+B, 13 Views Lounge and the Hospitality Room; this area can seat up to 1400.&lt;br /&gt;
*Nokia Boardroom: Boardroom A [atrium] &lt;br /&gt;
*Intel Boardroom: Boardroom B [atrium] &lt;br /&gt;
&lt;br /&gt;
'''1st Floor ''Pacific Concourse Level'''''&lt;br /&gt;
*This will all be used as Early Bird space partitioned rooms A-O (17,000 sq ft all partitionable) Saturday until Sunday evening then it will be………&lt;br /&gt;
*Hacker Lounge: Pacific Concourse K—O &lt;br /&gt;
*Meeting rooms partitioned in Pacific Concourse: A, B, C, D, E &lt;br /&gt;
*Gold sponsors get A and B&lt;br /&gt;
*Spaces F and G will be used as a meeting room bookable space&lt;br /&gt;
&lt;br /&gt;
== Attendee information ==&lt;br /&gt;
&lt;br /&gt;
For flight arrival &amp;amp; departure times, please see [[MeeGo Conference Spring 2011/Flight Information]] - feel free to add your own arrival &amp;amp; departure times, contact people to share taxis or meet up at the airport.&lt;br /&gt;
&lt;br /&gt;
For sponsored attendees sharing rooms, please choose your sharing partner on the [[/Accommodation | accommodation page]], or we will assign the person sharing a room with you.&lt;br /&gt;
&lt;br /&gt;
== MeeGo Conference Warm-Up ==&lt;br /&gt;
&lt;br /&gt;
The MeeGo Conference Warm-Up will include workshops catering to MeeGo application developers, and tutorials to help people get started developing for the platform. We also plan to have some fun extra-curricular activities involving building things. The warm-up sessions will be held at the Hyatt in San Francisco (the conference hotel/ conference facility) on May 21-22, the weekend before the conference begins.&lt;br /&gt;
&lt;br /&gt;
We will have two parallel tutorial tracks, on a range of topics related to MeeGo, and one workshop track where people can get together and work on programming problems or brainstorm application designs.&lt;br /&gt;
&lt;br /&gt;
And we plan some fun activities throughout the weekend, including a Maker's Contest where contestants will have to use raw materials including balsa wood, paper clips, string, paper and glue to build a machine capable of launching a projectile (Trebuchet, slingshot, catapult, whatever) over a distance of ~10m. The results will be judged with a live-action Angry Birds contest. And every evening we'll be running Werewolf sessions.&lt;br /&gt;
&lt;br /&gt;
=== Schedule ===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Date/Time&lt;br /&gt;
! Tutorial 1&lt;br /&gt;
! Tutorial 2&lt;br /&gt;
! Workshop&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 09:00 - 12:30&lt;br /&gt;
| [[#Introduction_to_Qt | Introduction to Qt ]]&lt;br /&gt;
| [[#Harmattan_for_developers | Harmattan for developers]]&lt;br /&gt;
| [[#UX_workshop | UX workshop ]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 14:00 - 17:30&lt;br /&gt;
| [[#Introduction_to_Qt | Introduction to Qt]]&lt;br /&gt;
| [[#Community OBS | Community OBS &amp;amp; Software Distribution]]&lt;br /&gt;
| [[#UX_workshop | UX workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 18:00 - 19:30&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[#Makers_contest | Siege weapon building, live action Angry Birds]]&lt;br /&gt;
|-&lt;br /&gt;
|Sat 21, 21:00 - late&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[/Werewolf | Mini Werewolf]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 09:00 - 12:30&lt;br /&gt;
| [[#Introduction_to_MeeGo_SDK | Introduction to MeeGo SDK]] ([http://appdeveloper.intel.com/events registration required])&lt;br /&gt;
| [[#Linux_developer_tools | Linux developer tools]]&lt;br /&gt;
| [[#Qt_development_workshop | Qt development workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 14:00 - 17:30&lt;br /&gt;
| [[#Introduction_to_MeeGo_SDK | Introduction to MeeGo SDK]] ([http://appdeveloper.intel.com/events registration required])&lt;br /&gt;
| [[#MeeGo_localisation | MeeGo localisation]]&lt;br /&gt;
| [[#Qt_development_workshop | Qt development workshop]]&lt;br /&gt;
|-&lt;br /&gt;
|Sun 22, 20:00 - late&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; align=&amp;quot;center&amp;quot; | [[/Werewolf | Mass Werewolf, MeeGo Conference 2011 version]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Introduction to Qt ===&lt;br /&gt;
&lt;br /&gt;
Training course offered by Gregg Leibovitz of ICS and Thomas Perl.&lt;br /&gt;
&lt;br /&gt;
Topics covered include:&lt;br /&gt;
* Setting up the MeeGo SDK&lt;br /&gt;
* Installing MeeGo on the Device&lt;br /&gt;
* Installing and Configuring Qt SDK&lt;br /&gt;
* Configuring Qt Creator&lt;br /&gt;
* Intro to Qt Creator&lt;br /&gt;
* Understanding QMake&lt;br /&gt;
* Building your First Application&lt;br /&gt;
* Introduction to PySide&lt;br /&gt;
* Introduction to QML&lt;br /&gt;
**   Elements&lt;br /&gt;
**   Properties and Types&lt;br /&gt;
**   Signals&lt;br /&gt;
**   Anchoring and Positioning&lt;br /&gt;
**   States&lt;br /&gt;
**   Transitions&lt;br /&gt;
**   Simple Animations&lt;br /&gt;
&lt;br /&gt;
=== Harmattan for developers ===&lt;br /&gt;
Presented by Daniel Wilms&lt;br /&gt;
&lt;br /&gt;
... description pending ...&lt;br /&gt;
&lt;br /&gt;
=== Community OBS ===&lt;br /&gt;
David Greaves, Niels Breet and Henri Bergius will take you through the basics of using Community OBS, uploading your project, getting it built and packaged automatically, fixing any build issues, and making the software available on the community software downloads site.&lt;br /&gt;
&lt;br /&gt;
This will be part tutorial, part hands-on workshop, part BOF.&lt;br /&gt;
 &lt;br /&gt;
=== UX workshop ===&lt;br /&gt;
&lt;br /&gt;
Developers are invited to turn up and present their applications to other attendees and our UX experts, and together identify areas of improvement and possible designs for presenting the same functionality to the user.&lt;br /&gt;
&lt;br /&gt;
=== Makers contest ===&lt;br /&gt;
&lt;br /&gt;
10 teams of 3 to 4 people will have one hour to build the best siege weapon possible for a live action Angry Birds round which will decide the winner. Trebuchet, catapult or slingshot, the weapon must be able to send an Angry Bird bean-bag into a pre-arranged structure containing evil snorting pigs. The winner will be determined by our impartial judges and comperes, Dave Neary, Alison Chaiken &amp;amp; Julien Fourgeaud.&lt;br /&gt;
&lt;br /&gt;
Equipment available to teams will include: various balsa wood cuts, glue guns and glue sticks, twine, paper clips, plain brown paper and elastic bands. There may be some other surprise materials thrown in on the day, if we're feeling generous.&lt;br /&gt;
&lt;br /&gt;
=== Introduction to MeeGo SDK ===&lt;br /&gt;
&lt;br /&gt;
'''Important: You need to [http://appdeveloper.intel.com/events register in advance] if you want to attend this session'''&lt;br /&gt;
&lt;br /&gt;
The Intel AppUpSM Application Lab: MeeGo series will be in San Francisco for the MeeGo Conference Warm Up!  Register now to Meet Bob Spencer from the MeeGo SDK team and members of the Intel AppUpSM developer program team to learn how to create and deploy MeeGo applications using the MeeGo SDK and the Intel AppUpTM SDK for MeeGo.  Discover how to create exciting user experiences with MeeGo* and the Intel AppUpSM developer program.  MeeGo promotes innovation and portability across multiple device types, such as tablets, netbooks and smartphones.  Developing for MeeGo presents a great opportunity to make money and deploy your applications quickly and easily. You don't have to be attending the MeeGo Conference to attend this event! &lt;br /&gt;
&lt;br /&gt;
Join us at the Hyatt Regency San Francisco Embarcadero Centre on Sunday, May 22, 2011 for one of our FREE training sessions to learn how to develop applications for MeeGo and about the benefits of the Intel AppUp(SM) developer program. Two sessions are available to choose from:&lt;br /&gt;
&lt;br /&gt;
Sunday, May 22: 9:00am - 12:30pm&lt;br /&gt;
Sunday, May 22: 2:00pm - 5:30pm &lt;br /&gt;
&lt;br /&gt;
General Agenda:&lt;br /&gt;
* Doors open 30 minutes before each session start for check-in (check-in closes 10 minutes after start)&lt;br /&gt;
* Session Content &lt;br /&gt;
** Overview of the Intel AppUp(SM) center &amp;amp; Intel AppUp(SM) developer program&lt;br /&gt;
** Introduction to the MeeGo SDK&lt;br /&gt;
** Introduction to the Intel AppUp(TM) SDK Suite for MeeGo&lt;br /&gt;
** Application packaging and submission &lt;br /&gt;
* Talk to Intel engineers about your specific code &amp;amp; questions in an informal Q&amp;amp;A session&lt;br /&gt;
&lt;br /&gt;
Seating is limited, so [http://appdeveloper.intel.com/events register today].&lt;br /&gt;
&lt;br /&gt;
=== Linux developer tools ===&lt;br /&gt;
&lt;br /&gt;
An overview of common Linux developer tools, including git, gdb and valgrind, by timeless.&lt;br /&gt;
&lt;br /&gt;
'''Using MXR'''&lt;br /&gt;
&lt;br /&gt;
* For Triagers -- when you get a crash trace&lt;br /&gt;
** Using identifier searches to walk through a stack trace&lt;br /&gt;
** When you get a bug report in a foreign language, using text searches to work from the foreign report to the codebase's native language&lt;br /&gt;
* For Architects -- when you want to understand the ramifications of changing an API&lt;br /&gt;
** Using text searches to find&lt;br /&gt;
** Using identifier searches to&lt;br /&gt;
* For Linguists&lt;br /&gt;
** Using filtered text searches to get more awareness of context&lt;br /&gt;
* For Themers&lt;br /&gt;
** When you see an image in the UI and need to find its name&lt;br /&gt;
* Aiding MXR&lt;br /&gt;
** Fields packagers can use to provide directory descriptions&lt;br /&gt;
&lt;br /&gt;
=== MeeGo localisation ===&lt;br /&gt;
&lt;br /&gt;
Dimitris Glezos and Margie Foster will walk you through the localisation process for MeeGo:&lt;br /&gt;
* Extracting translatable strings from an application&lt;br /&gt;
* Translating strings locally using Linguist&lt;br /&gt;
* Uploading strings to Transifex and translating them there&lt;br /&gt;
* Retrieving a translation package from Transifex and integrating it into your application&lt;br /&gt;
* Compiling and testing your translated application&lt;br /&gt;
&lt;br /&gt;
=== Qt development workshop ===&lt;br /&gt;
&lt;br /&gt;
A programming problem will be presented, and attendees will work through it in a hands-on programming lab, with amateurs and experts on hand to help out anyone who gets into trouble. If you'd like to help out as an assistant during this training session, please add your name here:&lt;br /&gt;
&lt;br /&gt;
'''Qt/QML/UI experts'''&lt;br /&gt;
&lt;br /&gt;
* Thomas Perl&lt;br /&gt;
* Sampo Savola&lt;br /&gt;
&lt;br /&gt;
== Hacker Lounge ==&lt;br /&gt;
The Hacker Lounge was a very popular activity for community members. Having it stocked with beer and snacks was perfect, and we'd love to do something similar for spring.&lt;br /&gt;
&lt;br /&gt;
== Ideas for fun activities ==&lt;br /&gt;
* [http://www.bahiker.com/northbayhikes/stinson.html Hike in Marin County on Mt. Tamalpais] just across the Golden Gate Bridge and get great views of San Francisco (when it's not foggy).   Potentially led by [[User:Alison| Alison Chaiken]].   Would require transportation to the trailhead from the hotel, perhaps a 45-minute drive.&lt;br /&gt;
* [http://www.blazingsaddles.com/store/?catid=7  Cycle across the Golden Gate Bridge], potentially guided by [[User:Alison| Alison Chaiken]]. The route is pancake flat but is on a regular city street with low-speed traffic.   [http://www.blazingsaddles.com/maps-and-rides/san-francisco-self-guided-tours.aspx  Riding across the Bridge] does require some caution on foggy days due to a wet surface.   (Foggy days can occur any time of year.)   I have inquired about group discounts.   The bike rental is walkable from the hotel.&lt;br /&gt;
* The [[MeeGo_Conference_2010/Werewolf]] game was popular, and we're scheming on some additional [[MeeGo_Conference_Spring_2011/Werewolf]] MeeGo Werewolf variations for 2011.&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	<entry>
		<id>http://wiki.meego.com/MeeGo_Conference_Spring_2011/accommodations</id>
		<title>MeeGo Conference Spring 2011/accommodations</title>
		<link rel="alternate" type="text/html" href="http://wiki.meego.com/MeeGo_Conference_Spring_2011/accommodations"/>
				<updated>2011-05-09T08:06:17Z</updated>
		
		<summary type="html">&lt;p&gt;Dneary: moved MeeGo Conference Spring 2011/accommodations to MeeGo Conference Spring 2011/Accommodation: Naming conventions (leaving redirect)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[MeeGo Conference Spring 2011/Accommodation]]&lt;/div&gt;</summary>
		<author><name>Dneary</name></author>	</entry>

	</feed>