(→Laying out a report) |
(→Laying out a report) |
||
| Line 169: | Line 169: | ||
[[Image:Report_table.png]] | [[Image:Report_table.png]] | ||
| + | |||
| + | == Next steps == | ||
| + | |||
| + | As with anything graphical, the possibilities are boundless. The next step will be to explain how to create graphics, and integrate many queries into a singnle report using subreports, and finally the deployment of reports so that we can generate them on demand via the Pentaho BI server, or schedule regular static report generation for publication. | ||
Contents |
There are two ways to create a Pentaho report: the first is a method to create simple tabular reports using the Pentaho user console "ad-hoc reporting" capability, but as the more powerful method, we will concentrate on the Pentaho Report Designer, available from Sourceforge.
There are four basic steps to creating a report in PRD: 1. Make a database connection available as a JNDI source for the Pentaho platform 2. Duplicate the JDNI configuration locally for PRD (since the designer will not have direct access to the Pentaho database) 3. Create queries which will be made available to the report designer 4. Lay out the report to make the results of those queries available to the user (as a graph, tabular data, etc)
The easiest way to configure a new JNDI connection is to start the Pentaho Administration Console (found in $PENTAHO_JOME/administration_console) with "start_pac.sh". The reason we use JNDI rather than a direct JDBC connection is that when we deploy the report to a different instance, where the username, password and database host are potentially different, we don't have to change the report, we simply use the same symbolic name for the JNDI source.
In Pentaho Admin Console (PAC), click "Administration", then "Database connections", and create a new connection. For the forum database, for example, the local parameters here were:
Then click "Test", and if everything is fine, you should see "Connection test successful".
Potential pitfalls are if you are using c3p0 libraries to do connection pooling as described in the Pentaho wiki, but have not included c3p0.jar on the classpath for PAC, or if the database user does not have the appropriate permissions or access for the database. You can troubleshoot permissions issues using the MySQL command line client.
Before you can use a "pure" JNDI connection in PRD, you need to declare it locally to make it available.
Open the file~/.pentaho/simple-jndi/default.propertiesand add this to the bottom:
Forum/type=javax.sql.DataSource Forum/driver=com.mysql.jdbc.Driver Forum/user=<db_user> Forum/password=<sure_ill_tell_you> Forum/url=jdbc:mysql://localhost:3306/forum
Hopefully the parallels between this configuration and the JNDI configuration in PAC are obvious! The prefix "Forum/" represents the symbolic name of the JNDI connection.
Once you have downloaded PRD, you can run report_designer.sh directly from the directory in which you uncompressed the package. PRD looks like a typical Java application (that is to say, on Linux it's not very pretty).
Create a new report with File->New, and you will see on the right-hand side of the screen a pane with two tabs, "Data" and "Structure". PRD is semi-WYSIWYG. Reports have a fixed structure, and each part of the report is evaluated based on a query associated with the report. The group header, for example, will appear once for every group defined in a query, what we put in the data section will be evaluated once for each row returned by the query, and the report header will be evaluated only once per report generation. Styling of elements in the report is cascading and hierarchical: if I set a font family for the report root, this will cascade throughout the report. The styling syntax is very similar to CSS.
To make a report do anything useful, we much associate a query with it. If we want to use several queries in one single document, we must declare all the queries at the top level, and then use sub-reports for each query to display.
Click on the Data tab in PRD. We can define a number of sources of data for the report by right-clicking on "Data sets". For a database source, we use the JDBC source type.
On the screen which pops up, we can add a new connection by clicking on the + icon above the connection names, and then choose "JNDI" as our connection type, and set the name of the connection for PRD with the name of the JNDI connection we configured before (see screenshot).
Test the connection with the "Test" button, then click "OK" to finalise things and make the connection available for queries.
Once the connection is correctly configured, you can create and edit queries. Click the "+" icon above the "Available queries" frame, type in a name, and you can start typing directly into the Query text entry area.
There is a graphical tool to help you join tables and perform more complicated wueries, which you can access by double-clicking the pencil icon above this area. By clicking on it, you will be brought to the SQL query designer. I have found this useful in the early stages of writing a queries, for simple joins between tables and for situations where I am not using any SQL functions, but for polishing queries, I tend to use the raw text.
For now, let's create a query to list the top 20 posters for last month. Drag the "top_posters" table onto the query canvas area on the right; all of the fields will be selected by default, and that's fine. Just to test the query, start with "where month=6 and year=2011" as the condition.
The final query is:
SELECT
month,
year,
rank,
member,
posts
FROM
forum_top_posters
WHERE
month=6
and year=2011
LIMIT 20
and when you hit "Preview", you should get a data table like this:
| month | year | rank | member | posts |
|---|---|---|---|---|
| 6 | 2011 | 1 | qgil | 185 |
| 6 | 2011 | 2 | sjgadsby | 115 |
| 6 | 2011 | 3 | shmerl | 104 |
| 6 | 2011 | 4 | texrat | 77 |
| 6 | 2011 | 5 | jaffa | 60 |
| 6 | 2011 | 6 | swift11 | 52 |
| 6 | 2011 | 7 | jonay | 42 |
| 6 | 2011 | 8 | mja | 35 |
| 6 | 2011 | 9 | vitna | 35 |
| 6 | 2011 | 10 | timoph | 35 |
| 6 | 2011 | 11 | hartti | 29 |
| 6 | 2011 | 12 | lm | 27 |
| 6 | 2011 | 13 | rubear2009 | 26 |
| 6 | 2011 | 14 | andre | 25 |
| 6 | 2011 | 15 | profebral | 23 |
| 6 | 2011 | 16 | mikecomputing | 21 |
| 6 | 2011 | 17 | ritratt | 21 |
| 6 | 2011 | 18 | helex | 19 |
| 6 | 2011 | 19 | cckwes | 19 |
| 6 | 2011 | 20 | pycage | 19 |
Hit OK, and now your query is selected for use with your report.
To make the query always refer to "last month", you need to resort to datetime functions in MySQL in the WHERE clause.
WHERE month = month(date_add(now(), INTERVAL -1 MONTH)) AND year = year(date_add(now(), INTERVAL -1 MONTH))
By default, there are 5 visible areas on the report canvas:
On the left hand side of the window, we have a number of different types of widgets (labels, text/number/date fields, included resources, images, shapes and so on.
You can drag and drop elements from the query (under the Data tab) or from the left of the window into the "Details" section now and preview the report, just to see what happens. For now, let's create a "Username" label, drag & drop the "member" field beside it, a "Posts" label, with the number of posts associated, and let's add a horizontal rule underneath. Select the horizontal rule and set its x offset to 5% and width to 90%, in the "Style" tab which is visible when the element is active. Then we can preview the report, just to make sure it's working OK, with "View->Preview".
If all is going well, you should see something like this:
Now we can start making the report a little prettier.
There are a few areas in the document structure which are not visible on the canvas by default (and so we can't drop elements onto them). The important ones for us are:
To make these visible on the canvas, enabling you to drag & drop elements onto them, select them in the "Structure" tab on the left of the window, and under "Attributes", change the attribute "hide-on-canvas" to false.
You can then cut & paste the "Username" and "Posts" labels, add a "Rank" label, and arrange them as a header in the "Details header" area. You can set all the labels in the header to bold by setting the "font"/"bold" attribute in the "Style" tab to "true" for the details header area (the value will be inherited by all labels in the area).
Now set a report title using a Message field (you can include data from a query in the text of the field using $(parameter), and addmonthname(date_add(now(), INTERVAL -1 MONTH)) as monthnameto the fields in the query, and let's set the report title to "Top forum posters for $(monthname) $(year)". PS, if anyone knows how I can avoid those repeated calls to
date_add(now(), INTERVAL -1 MONTH)in Pentaho, I would be delighted to know how.
Just to pretty things up a touch more, let's set a background colour for the header, and alternate the row colours for odd & even rows between white & pale yellow. To do the even/odd banding, we're going to use a Pentaho "Row banding" function, which you'll find in the Format->Row-Banding menu. For the moment, choose a colour for the "Visible colour" area, and click OK. Now in the Data tab, we can see our new row banding function. All we need to do is give it a name in that area, and drag & drop it to the Details area. I just called it "band". Then, select "band" in the Structure tab, and in the "Style" tab, set the field "visible" to "false".
If everything has gone to plan, when you choose "View->Preview", you should see something like the following:
As with anything graphical, the possibilities are boundless. The next step will be to explain how to create graphics, and integrate many queries into a singnle report using subreports, and finally the deployment of reports so that we can generate them on demand via the Pentaho BI server, or schedule regular static report generation for publication.