Meego Wiki
Predictive virtual keyboard

From MeeGo wiki

Latest revision as of 18:26, 10 May 2011

Mockup

The idea

The idea is to have the virtual keyboard know what letters are more likely, and:

  • make them easier to hit, and
  • (possibly) highlight them visually, or at least dim all the other ones.

Implementation

The back end can take a set of Markov chains based on a given language and output the most likely result. Here's a sketch of a program to produce such a database.
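As a rough illustration of the back end described above (not the program linked from this page), here is a minimal Python sketch of a letter-level n-gram model. The function names `build_model` and `likely_next`, the default context length of three letters and the choice to return three candidates are all assumptions made for the example; it also assumes `typed` holds only the letters of the word currently being entered.

```python
from collections import Counter, defaultdict

def build_model(text, order=3):
    """Count which letter follows each context of 0..order preceding letters."""
    model = defaultdict(Counter)
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        for i, ch in enumerate(letters):
            # Record the letter under every context length from 0 to `order`,
            # so that shorter contexts can be used as a fallback later.
            for n in range(0, min(i, order) + 1):
                model[letters[i - n:i]][ch] += 1
    return model

def likely_next(model, typed, order=3, count=3):
    """Return up to `count` likely next letters as a string (hypothetical API)."""
    context = typed.lower()[-order:] if order else ""
    # Back off to shorter contexts until we find one seen in training.
    while context and context not in model:
        context = context[1:]
    followers = model.get(context, Counter())
    return "".join(ch for ch, _ in followers.most_common(count))

if __name__ == "__main__":
    model = build_model("the then there them")
    print(likely_next(model, "th"))
```

Keeping counts for every context length is what lets the lookup fall back gracefully when the user has typed something the training text never contained.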

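Possibly we could also reuse IM predictive text (http://meego.gitorious.org/meegotouch/meegotouch-inputmethodengine/blobs/master/words/mimenginewordsinterface.h), but there are statistical reasons why this isn't a perfect solution.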

Ideally we should make this use dasher (http://www.inference.phy.cam.ac.uk/dasher/).

(What about the front end?)

Prototype1

There is a JavaScript prototype which you may play with: http://people.collabora.co.uk/~tthurman/predictive/

Prototype2 Video

http://www.youtube.com/watch?v=8gBtVYMq_ts

Prototype3

http://marnanel.org/DasherKeyboard/

Training texts

What would be appropriate training texts for each language? Public domain would be helpful, so they will be quite old, but if they're too old (e.g. Chaucer in the original) they'll be less useful for producing the data.

lcuk: we have extensive conversation logs from the IRC channels around which might be useful?

timeless: to some extent you're going to want to filter against a spelling dictionary to avoid typos. Some additional magic should be applied to learn proper nouns. This is all doable, I even have scripts or beginnings of scripts for some of it.

lcuk: clearly, for generic default training documents, text and books from the most general field possible would be useful, perhaps a subset of Wikipedia. However, for truly personal language balance it would be folly to ignore my own past words, and I have an extensive IRC history ready for my own use. With statistical filtering as described already, it would offer completion of the words I use most :) Other data sources may also be available?
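The filtering step timeless mentions could look something like the Python sketch below. It is not one of the scripts referred to above. The function name `filter_training_words` is hypothetical, and the rule for proper nouns (keep an unknown word once it has appeared at least `min_unknown_count` times) is an assumption standing in for the "additional magic", chosen only because repeated unknown words are more likely to be names than typos.

```python
import re
from collections import Counter

def filter_training_words(text, dictionary, min_unknown_count=3):
    """Drop likely typos from training text before building statistics.

    `dictionary` is a set of known lowercase words. Unknown words are kept
    only if they occur at least `min_unknown_count` times (an assumed
    heuristic for nicknames and other proper nouns).
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        word for word in words
        if word in dictionary or counts[word] >= min_unknown_count
    ]

if __name__ == "__main__":
    log = "teh cat sat. marnanel said hi. marnanel typed. marnanel left"
    known = {"the", "cat", "sat", "said", "hi", "typed", "left"}
    print(filter_training_words(log, known))
```

On a personal IRC history this would keep frequently used nicknames while discarding one-off misspellings such as "teh".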

Similar implementations

Turns out there's something similar for Android already: http://thickbuttons.com
Taipudex (http://bu4.taipudex.com/pinyin.htm) provides dictionary based predictive autocompletion and presents alternative keys in a sorted table.

IRC logs

2010-11-22

22:04 < sivang> lcuk: interesting idea, why not just use a dictionary and speed search through it while typing, only instead of showing remaining possible words, dim the letters that no longer take part

IMHO this would be already a good start instead of trying to do markov chains and more sophisticated stuff upfront. Also, in Maemo there is already support for that but that completes the words which is annoying. Applying this to the keyboard keys highlighted could perhaps be better.
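A minimal Python sketch of the dictionary approach sivang suggests: given the prefix typed so far, find which letters can still continue some dictionary word, so the front end can dim every other key. The function name `live_letters` and the use of a sorted word list with binary search are assumptions for the example; nothing here reflects how Maemo's existing completion works.

```python
import bisect

def live_letters(sorted_words, prefix):
    """Return the set of letters that can follow `prefix` in some word.

    `sorted_words` must be a sorted list of lowercase words, so that all
    words sharing a prefix sit next to each other.
    """
    letters = set()
    # Binary search for the first word that is >= the prefix.
    index = bisect.bisect_left(sorted_words, prefix)
    while index < len(sorted_words) and sorted_words[index].startswith(prefix):
        word = sorted_words[index]
        if len(word) > len(prefix):
            letters.add(word[len(prefix)])
        index += 1
    return letters

if __name__ == "__main__":
    words = sorted(["cat", "car", "cart", "dog", "do"])
    # Keys not in this set would be dimmed after the user types "ca".
    print(live_letters(words, "ca"))
```

Unlike the Markov chain approach, this gives a yes/no answer per key rather than a ranking, which fits dimming but not the idea of enlarging only the three most likely keys.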


18:00

  • <timeless_mbp> marnanel: can you teach it to guess the first letter of subsequent words?
  • <marnanel> timeless_mbp: certainly; I didn't put that in because I thought it would get annoying, but it was a specific exclusion, so easy to take out again
  • <timeless_mbp> marnanel: it's better to have everything in and decide to turn things off
  • <timeless_mbp> THIS IS A TEST OF SOME PREDICTION ALGORITHM
  • <timeless_mbp> was what i typed
  • <timeless_mbp> n.b. not really as prefs, more you try something, see if things work, and decide not to ship certain bits if they're too awkward
  • <timeless_mbp> marnanel: oh, and um... you need to offer punctuation
  • <timeless_mbp> marnanel: for now, please add:
  • <marnanel> timeless_mbp: it's not supposed to be an entire working solution yet!
  • <timeless_mbp> [tab] [caps] [shift] <- left side
  • <timeless_mbp> ["["] ["]"] <- right side top row
  • <timeless_mbp> [;] ['] <- right side middle row
  • <timeless_mbp> [,] [.] <- right side third row
  • <timeless_mbp> if you drag/swipe left/right over the central area, it starts progressing in the appropriate direction

2010-11-20

  • <marnanel> lcuk: so someone at the conference said (not in these terms) that what we need is to take n-grams of English text and produce Markov chains such that after n-1 letters entered on the osk, the three or so most likely following letters became *slightly* larger
  • <lcuk> yes marnanel
  • <marnanel> lcuk: so I wondered whether you know whether anyone else was working on it
  • <marnanel> lcuk: otherwise I might.
  • <thiago_home> marnanel: they don't have to be larger. Just their hit areas.
  • <marnanel> thiago_home: true
  • <lcuk> marnanel, I remarked that to slightly dim the other non useful characters would be a good visual indicator
  • <marnanel> lcuk: oh yeah, I remember you saying that
  • <marnanel> lcuk: I could probably fix up UI stuff to do that. but atm I am thinking about the back-end implementation
  • <marnanel> I think this could be a thing of great niftiness
  • <lcuk> i heard the keymats for the vkb are a big ass svg file
  • ...
  • <lcuk> anyway, not sure if those keymats can have brightness modified on the fly or if the hitzones can be affected
  • <lcuk> and I am not sure how I would proceed using qml
  • <lcuk> whether you can have an element but have its hit area larger without interfering
  • <marnanel> I don't know either.
  • <lcuk> making a new transparent layer on top with bigger hitzones would suffice
  • marnanel nods
  • lcuk hmms
  • <lcuk> i am wondering how I would implement it by thinking about the liqbase keyboard
  • <lcuk> marnanel, how many potential keys would we need to show larger
  • <marnanel> lcuk: I am thinking three-ish. more than that would defeat the point
  • <lcuk> marnanel, reasonable
  • <lcuk> so holding a group of transparent redefinable (and relocatable) widgets sitting on top of the keyboard which would offer larger hitzone for the press would work?
  • <marnanel> I believe so, yes
  • <marnanel> lcuk: so if I wrote a library where you gave it zero to three letters and it returned a string of three characters which were likely to follow, that would be a start
  • <marnanel> plus the database
  • <lcuk> marnanel, one thing about keyboards - using circular distance algorithms from the centre of each letter
  • <lcuk> rather than a rectangle hitzone
  • <lcuk> would be nicer potentially
  • <lcuk> marnanel, so the library would accept a string
  • <lcuk> which is the text leading up to the cursor's current position
  • <marnanel> lcuk: circular> yes, we just use pythagoras with respect to each letter and find the shortest maybe, with weighting for the more likely letters
  • <marnanel> lcuk: yes
  • <lcuk> marnanel, lets talk again on monday after getting some thoughts sunk in
  • <marnanel> lcuk: yeah, good plan
  • <marnanel> lcuk: I'll hack around with ngrams a bit maybe
  • <lcuk> do you want to perhaps copy paste this convo into the wiki
  • <marnanel> lcuk: the meego wiki? sure
  • <lcuk> and then we can flesh it out and do some bits with it
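
The circular hit-testing idea from the 2010-11-20 log (Pythagorean distance to each key centre, weighted in favour of the likelier letters) might be sketched in Python as follows. The function name `pick_key`, the `boost` factor of 0.8 and the key coordinates are all assumptions for illustration; the log does not say how the weighting should be done.

```python
import math

def pick_key(touch, key_centres, likely="", boost=0.8):
    """Return the key nearest to `touch`, favouring likely letters.

    `key_centres` maps each key to the (x, y) centre of its cap.
    Distances to keys listed in `likely` are multiplied by `boost`
    (an assumed factor below 1), which enlarges their effective hit
    area without changing how the keys are drawn.
    """
    likely_keys = set(likely)
    tx, ty = touch
    best_key, best_score = None, math.inf
    for key, (kx, ky) in key_centres.items():
        score = math.hypot(tx - kx, ty - ky)
        if key in likely_keys:
            score *= boost
        if score < best_score:
            best_key, best_score = key, score
    return best_key

if __name__ == "__main__":
    centres = {"q": (0, 0), "w": (10, 0), "e": (20, 0)}
    # A touch slightly nearer "w" still resolves to "q" when "q" is likely.
    print(pick_key((5.5, 0), centres, likely="q"))
```

This matches thiago_home's point that only the hit areas need to grow: the keys can stay the same size on screen while the boost shifts borderline touches towards the predicted letters.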