Meego Wiki
Views

Architecture/planning/evolution-data-server/eds-improvements

From MeeGo wiki
(Difference between revisions)
Jump to: navigation, search
(Optional Parsing of ebook Data in Client)
(Quick check for "data changed")
Line 130: Line 130:
* If the revision string is empty or different, then the content of the database '''might''' have changed. A more detailed check is necessary to determine if it has really changed. This will be the case for non-local storage.
* If the revision string is empty or different, then the content of the database '''might''' have changed. A more detailed check is necessary to determine if it has really changed. This will be the case for non-local storage.
* If it is non-empty and the same as before, the nothing has changed. No further check will be necessary.
* If it is non-empty and the same as before, the nothing has changed. No further check will be necessary.
 +
 +
[https://bugzilla.gnome.org/show_bug.cgi?id=652175 libebook: Upstream evolution-data-server bug]
 +
[https://bugzilla.gnome.org/show_bug.cgi?id=652177 libecal: Upstream evolution-data-server bug]
== Contacts: Store PHOTO Data as Plain Files ==
== Contacts: Store PHOTO Data as Plain Files ==

Revision as of 10:08, 9 June 2011

This page describes improvements that are desirable or even essential for good performance of Evolution Data Server (EDS) as PIM storage in MeeGo.

All of these changes should be developed for the evolution-data-server gnome-2-32 branch, as used by MeeGo 1.2, then ported to master for review by upstream maintainers. Most changes will not be accepted in upstream's gnome-2-32 branch, because is is stable, but Meego will apply them as patches to its package.

Contents

Change Tracking

EDS is based on the model/view/controller principle: an application opens a database (calendar or addressbook) and requests a view with server-side filtering, then the application is sent:

  1. all data covered by the view
  2. the "all data sent" signal
  3. changes (add/update/remove) while the view exists

Applications modify data with asynchronous requests (controller) and keep showing the old data until the change notification comes in. The exact changes made to the data are determined by the server (for example when meeting invitations are sent).

QtContacts and KCalCore do not have such a strict model. With QtContacts the app simply loads and stores contacts and is responsible for maintaining its own view. Changes are sent for changes made after the database was opened. KCalCore is based on the concept of loading a calendar (partially) into the client, modifying it there, then writing it back. Changes are also sent while the storage is open.

TODO:

Suppress Unnecessary Data Transfer

EDS currently sends all data when a view is requested, which is unnecessary when the client is only interested in change notifications. An extension to the EDS query language could optionally suppresses that sending of existing data when the view is opened, to be implemented only for the local contacts and calendar backends. This would allow easier integration with the QContact API.

The Fremantle EDS fork also had a “freezable” (static) capability for libebook, which should probably be investigated. See, for instance: http://maemo.org/api_refs/5.0/5.0-final/libebook/EBookView.html#e-book-view-set-freezable

One possibility for adding the flag is to (ab)use the existing query language with a match for special fields:

  • in the client: query = e_book_query_orv(e_book_query_field_exists("X-EDS-QUERY-FLAG-SKIP-DATA-DUMP"), <real query>, NULL);
  • in the server: "if (query is OR-Query and first clause is exists("X-EDS-QUERY-FLAG-SKIP-DATA-DUMP")) { skip_data_dump = TRUE; query = second clause; }

Upstream evolution-data-server bug

Nofify with UIDs Only

EBookView emits signals when contacts are changed, added, or removed. But those signals sends lists of entire EContacts, containing the entire contact data, though we often have all the data already and just want to know which contacts have changed.

e_book_get_book_view() takes an EBookQuery which could be extended to take a new flag to make the EBookView send only the UID in the signals' EContacts. For instance, "X-EDS-QUERY-FLAG-NOTIFY-UID-ONLY". The UID is already stored separately for use as the primary key of the local-storage backend's BDB database, so no VCard parsing should be necessary to provide it.

This would allow easier integration with the QContact API, greatly improving performance.

Note that users of this feature should probably use just the base EVCard API on the provided EContact objects, to avoid extra processing and caching in the EContact code.

Note: We previously considered implementing e_book_get_book_view()'s requested fields parameter, but that changes the whole view, though we only really want to change what is sent for changed/new contacts.

Note: Chris discovered a dedicated e_book_get_contact_uids() API in the Fremantle EDS fork. The new API call was accepted upstream (in master) and backported as a patch (used in the Meego package) for the gnome-2-32 branch, but this is not a full solution.

Upstream evolution-data-server bug

Optional Parsing of ebook Data in Client

The EDS client-side APIs currently parse and (temporarily) store the parsed details of VCards. However, this is often unnecessary because APIs such as e_book_get_book_view() cause it to return the whole VCard anyway. There will be some performance gain from delaying VCard parsing until it is really needed by calls to functions such as e_contact_new_from_vcard().

libebook provides different functionality:

  1. exchange data as vCard with EDS daemon
  2. parse items and wrap them in a C API

QtContacts replaces the second part. An obvious performance improvement is to only do the second part if the additional C APIs are really called. The following calls then transfer data without any additional parsing in libebook:

  • e_book_get_contact()/asynchronous read/view + e_vcard_to_string()
  • e_contact_new_from_vcard() + e_book_add_contact()/e_book_commit_contact()/asynchronous storing

Note that libecal (for iCalendar) can already do this, though libebook can't yet.

TODO:

  1. add delayed parsing to libebook, triggered by API call which needs parsed data

PROBLEM:

  1. If delayed parsing fails, how can error be reported?
  2. Should delayed parsing be done always (e_vcard_to_string())) or only if requested? How would that be configured by a libebook user?

An e_book_view_set_parse_vcards() patch, already used in the Meego package, partly implements this already. This patch was taken from an old fremantle EDS patch that was never actually used, and is generally considered to be a bad hack that should not be accepted upstream.

Upstream evolution-data-server bug

Asynchronous API for storing items in libecal

libecal should allow the caller to request storing of data without blocking on processing of the data. It should notify the caller when the data has been stored, or when processing has failed.

These functions may require async versions:

  • icalcomponent_new_from_string()
  • e_cal_modify_object[_with_mod]
  • e_cal_create_object()

Making libical asynchronous is almost impossible without going multithreaded. Let's focus on e_cal D-Bus communication first.

Rob started some work on this in the rbradford-wip-ecal-async-api branch. Perhaps you can reuse some of that work. It is not functional, but might be a source of inspiration or code fragments.

Upstream evolution-data-server bug

More Efficient Access to Meta Data for Change Tracking

This was discussed a while ago: [[http://www.mail-archive.com/evolution-hackers@gnome.org/msg03029.html [Evolution-hackers] concurrent modifications of items in GUI and EDS database]]

Data synchronization only needs to know which items exist (list of local IDs) and which revision of each item is currently stored. Revisions can be identified with a string which changes for each modification. Restoring an older revision should (but doesn't have to) restore the revision string.

For calendar, UID+RECURRENCE-ID and LAST-MODIFIED can be used. For contacts, UID + REV.

By comparing a stored list against the current one, changes can be detected without relying on journals in the daemon or comparison of time stamps. Journals are hard to maintain (how many of them, how long?). Comparison of time stamps has problems when changes are allowed while a sync runs and assumes that system time is linear (which is not always true).

As an additional use case, local ID + revision string could be used to prevent overwriting stale data in the server, as in HTTP PUT eTag checks.

TODO:

The Maemo Fremantle port of EDS, which is of course already open source, apparently has some solution for this, according to Mathias Hasselman, so that should be investigated first.

However, this should probably be made possible by:

  1. Extending the EDS query language with a flag that limits the delivered data to just:
    1. UID and REV for libebook (REV is in VCard 3.0)
    2. UID+RECURRENCE-ID+LAST-MODIFIED, for libecal
  2. Implementing these queries for the local contact and calendar backends. The query would be rejected, with an error, for other backends.

Atomic Updates

Later, implement atomic changes as discussed in the mail thread: [[http://www.mail-archive.com/evolution-hackers@gnome.org/msg03029.html

Quick check for "data changed"

The libebook and libical APIs should offer a way to discover if any data whatsoever has changed. This would, for instance, help SyncEvolution to avoid unnecessary detailed checking. This could just check the modification dates of local storage files, such as the BDB .db file, or .ics file, as long as we do not use those files to store permanent caches of parsed data. This would be much quicker than listing items for the whole database and comparing their revisions strings, as is currently necessary.

This would require small API additions, to check that the opaque revision strings are unchanged, such as

gchar* e_book_get_revision(EBook *book)
gboolean e_book_check_revision(EBook *book , const gchar* revision)

and

gchar* e_cal_get_revision(ECal *ecal)
gboolean e_cal_check_revision(ECal ecal, const gchar revision)

The semantic would be this:

  • If the revision string is empty or different, then the content of the database might have changed. A more detailed check is necessary to determine if it has really changed. This will be the case for non-local storage.
  • If it is non-empty and the same as before, the nothing has changed. No further check will be necessary.

libebook: Upstream evolution-data-server bug libecal: Upstream evolution-data-server bug

Contacts: Store PHOTO Data as Plain Files

Photos should be transferred over D-Bus via a filepath to a local file rather than transferring the actual photo data over D-Bus. The VCard format already allows this option (See the PHOTO property). We must be careful to keep these local files in sync and to remove them when contacts are removed, and to restrict access to the files from other processes.

There should be utility code available to libebook users to convert between the different representations.

Goals:

  1. Apps should be able to create photo files without ever passing the data to libebook.
  2. External PHOTO file and corresponding contact must be kept in sync:
    • If a contact is deleted, the photo also needs to be removed.
    • If a photo file is created and then storing the contact fails, who removes the file?
  1. Access to photo data must be limited to processes which have the right permissions.

Proposal:

  • For each ~/.local/share/evolution/foo calendar, store photos in <photo path>=~/.local/share/evolution/foo/photos, with same permissions.
  • New photo files are created with a UUID as base name and suffix according to the file content.
  • If a contact is deleted and has a PHOTO;VALUE=uri:file://<photo path>/*, delete that file.
  • For apps which want to create the photo file directly:
    • Apps get a file descriptor and the path from a new libebook API.
    • They own the file until the EDS server acknowledges the successful storing of a contact with a matching PHOTO;VALUE=uri:file property. If that fails, the app must remove the file.
  • Add new API calls to libebook which split out resp. include data inline.
    • Must track ownership of file after splitting it out and before storing - might never get stored!

Open:

  • Should there be automatic garbage collection of stale photos in case something went wrong and a photo wasn't properly deleted?
  • Trust TYPE=JPEG/PNG/... or look into file content to determine type?


Future: Content Protection

Only authorized applications (= processes) with the necessary permissions should be able to read/write data. Exact permissions and mechanisms for verifying them to be defined as part of the (revised?) MeeGo architecture.

This can be added as follows:

  1. lock down file permissions
  2. add permission checks to EDS D-Bus servers

Currently not planned. Depends on the future MeeGo security infrastructure.

Personal tools