Nountype Quirks: Day 3: Geo Day

mitcho

July 31, 2009

8:20 pm

It’s time for one more installment of Nountype Quirks, where I review and tweak Ubiquity’s built-in nountypes. For an introduction to this effort, please read Judging Noun Types and my updates from Day 1 and Day 2.

Today I spent most of the day attempting to implement (but not yet completing) major improvements to the geolocation-related nountypes. I lay out the plans for those improvements here.

Note: this blog post includes a number of graphs using HTML/CSS formatting. If you are reading this article through a feed reader or planet, I invite you to read it on my site.

noun_type_geolocation

noun_type_geolocation is the nountype used by the weather command for its location argument in input like “weather near Chicago”. The neat feature of noun_type_geolocation is that it has a smart default value which uses Firefox’s geolocation system to give you your current location by default, so I can enter “weather” and get the suggestion “weather near Broomfield, Colorado” (not completely correct, but close enough for the weather). Otherwise, however, noun_type_geolocation does not do too hot… for any input you give it, it’ll just accept it with a score of 0.3, much like noun_arb_text. We could do better.

One issue with this noun_type_geolocation is a conceptual one. Is this nountype supposed to accept only municipalities? Countries? Or should it accept landmarks or addresses as well? Part of the issue is that it’s only used by one built-in command in Ubiquity now, weather. But to be called a general “geolocation” nountype, its output should not be specific to weather’s usage, which is to throw the result at the Weather Underground API.

I propose that we change this to be something like noun_type_geo_town and also make similar nountypes like noun_type_geo_country, noun_type_geo_region, going all the way down to noun_type_address (which already exists—see below). All of the nountypes in this family could use a geocoding API such as Google’s or Yahoo’s. Their data properties could include all of this geocoded geographic data (in English) and also the latitude/longitude coordinate data.

The weather command could then accept noun_type_geo_town. Some municipalities are not in Weather Underground and, for some countries, its data is only as granular as administrative districts, so we could display the results of the geocoding API to the user but give Weather Underground the geocoded latitude/longitude data.

noun_type_async_address

noun_type_async_address attempts to do exactly what I’ve laid out above for the most granular level: that of geolocations with data all the way down to the street level. This is the nountype which is used for the built-in map command and uses the Yahoo geocoding service to accomplish this. Let’s see what kinds of results it returns:

input | suggestion | score
mitcho | mitcho | 0.5
grenada | grenada | 0.9
jono | jono | 0.9
mountain view | mountain view | 0.9

Let’s lay out some immediate quirks:

  1. All scores are either 0.5 or 0.9. In general, if the Yahoo API returns some geocoded interpretation, it gets 0.9, but otherwise it accepts everything with 0.5.
  2. The results that come back from the Yahoo service don’t add any useful information like the country or administrative region. Even the case stays lowercase.
  3. Since when is Jono a location!? I’ll get back to this later.

For starters, the Yahoo! Maps API terms of service dictate that we can’t use its geocoding service if we’re not also displaying Yahoo maps, so I rewrote it using the Google API which also had the advantage of offering JSON output.

One quirk of the Google Geocoding API, though, is that all of the resulting municipality names are only in English. Try for example queries for Wien or 東京 (Tokyo). Since we want our suggestions to only add information to our input, not replace the input entirely (and especially not in another language), we’ll then only take results which have the input as an initial substring. On the other hand, if none of the results have the input as a proper prefix of the return value, we will take the geocoding information from the first result but with the original input as the display text. Such results will have a markedly lower score.[1]

As this is the address nountype, we’ll penalize results which do not have detailed information such as street address or town-level information. All of this is very easy to judge as every result from the API has a geocoding accuracy value.
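The selection rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the result shape (`name`, `accuracy`), the 0 to 8 accuracy scale, and all of the score constants are hypothetical and are not taken from the actual nountype or from the Google API.

```javascript
// Hypothetical geocoder result: { name: "Vienna, Austria", accuracy: 4 }
// ASSUMPTION: accuracy runs from 0 (unknown) to 8 (street address).
const MAX_ACCURACY = 8;

function suggestAddresses(input, results) {
  const needle = input.toLowerCase();
  // Coarser results (country, region) are penalized more than detailed ones.
  const detail = (r) => 0.5 + 0.5 * (r.accuracy / MAX_ACCURACY);

  // Prefer results that only add information to what the user typed.
  const prefixed = results.filter((r) =>
    r.name.toLowerCase().startsWith(needle));
  if (prefixed.length > 0) {
    return prefixed.map((r) => ({
      text: r.name,
      data: r,
      score: 0.9 * detail(r), // 0.9 is an illustrative base score
    }));
  }

  // Otherwise keep the user's own text, borrow the first result's geodata,
  // and give the suggestion a markedly lower score.
  if (results.length > 0) {
    const first = results[0];
    return [{ text: input, data: first, score: 0.4 * detail(first) }];
  }
  return [];
}
```

With this sketch, an input of "Wien" against a single result named "Vienna, Austria" keeps "Wien" as the display text while still carrying the geocoded data for Vienna.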

The best laid plans of mice and men…

I spent a good few hours this afternoon and evening attempting to implement this new family of nountypes, including this new nountype_geo_address, but also nountype_geo_subregion, nountype_geo_region, and nountype_geo_country. Some of the quirks of the weather and map commands, however, have prevented me from completely replacing the legacy noun_type_address and noun_type_geolocation described above. I hope to continue this work again soon and actually make this transition, ideally before 0.5.2.

Look forward to one (or maybe two?) more episode(s) of Nountype Quirks where I hope to definitively explain, analyze, and tweak matchScore, the scoring algorithm which underlies the majority of the nountypes in Ubiquity. As always, I look forward to your comments and feedback.

Bonus: Where’s Jono?

It turns out that noun_type_async_address was recognizing “Jono” as an address because Jono is actually a location after all! Not only that, but Jono is in Japan!!

[Image: Picture 3.png]

You clearly can’t take Japan out of Jono, but it turns out you can’t take Jono out of Japan either.


  1. If this crazy algorithm raises a red flag for anyone, you’re not alone… if you think of a more elegant solution, please let me know. This will no doubt be an issue when it comes to localizing the address nountype as well. I wish we could specify an output language for the Google Geocoding API… :(  



Caching Noun Type Results in Ubiquity

admin

1:18 pm

Introduction

While the changes made to Ubiquity in the 0.5 and 0.5.1 releases have made it an increasingly sophisticated and capable tool, speed is still a serious concern. In an effort to further speed up Ubiquity, we are working on the caching mechanism for noun type results. However, there is a pretty major design decision that needs to be addressed before we can activate this caching. We would appreciate any feedback you may have on the issue.

Explaining Noun Types

Arguments given to Ubiquity are classified into noun types. This is how Ubiquity knows which commands are applicable to your input. For example, the argument Spanish matches the language noun type, and would therefore be applicable to commands like “translate to Spanish (text) (from language)” and “wikipedia in Spanish (search term)”. An argument can match multiple noun types. To test which noun types an argument should be classified into, Ubiquity asks each noun type to give suggestions based on that argument. For the argument Spanish, the language noun type will simply return ["Spanish"]. For the argument Ita, it would return ["Italian"]. For the argument sandwich, it would return an empty array.
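As a rough illustration of that behaviour, here is a minimal sketch of a language noun type. The function name, the language list, and the plain prefix matching are assumptions made for the example; the real noun type returns richer suggestion objects than bare strings.

```javascript
// Illustrative list only; the real language noun type knows many more.
const LANGUAGES = ["English", "French", "Italian", "Japanese", "Spanish"];

// Hypothetical helper: return every language whose name starts with the
// argument, ignoring case. No match means an empty array.
function suggestLanguage(argument) {
  const needle = argument.trim().toLowerCase();
  if (needle === "") return [];
  return LANGUAGES.filter((name) => name.toLowerCase().startsWith(needle));
}
```

Called with "Spanish", "Ita" and "sandwich", this sketch gives one suggestion, one suggestion and an empty array respectively, mirroring the examples in the paragraph above.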

Caching and Its Effects

When a noun type returns suggestions based on a given argument, we can cache the name of the noun type, the argument it was given, and the suggestions it gave. By doing this, the next time that we need to ask that noun type for its suggestions based on some argument, we can first look in the cache and see if we have already asked it about this argument before. If that is the case, then we can simply return the cached suggestions rather than making the noun type give them again. This can significantly increase the speed at which we match arguments to noun types. Furthermore, several of the noun types used, including the restaurant and address noun types, make network calls to web services. By caching the results of these calls for a given argument, we can reduce the number of requests made to these services and avoid flooding them with redundant traffic.
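A minimal sketch of that lookup, assuming synchronous noun types that are plain functions from an argument to an array of suggestions (the real ones can be asynchronous, and `makeCachedSuggester` is a name invented for this example):

```javascript
// nounTypes: hypothetical map from noun type name to a suggest function.
function makeCachedSuggester(nounTypes) {
  const cache = new Map();

  return function suggest(nounTypeName, argument) {
    // The cache key combines the noun type's name and the argument.
    const key = JSON.stringify([nounTypeName, argument]);
    if (!cache.has(key)) {
      // First time we see this pair: ask the noun type and remember it.
      cache.set(key, nounTypes[nounTypeName](argument));
    }
    return cache.get(key);
  };
}
```

Note that this sketch never forgets anything, which is exactly the flushing problem discussed next.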

The Question: When to flush the cache?

We are currently trying to figure out the best way to handle flushing. How long do cached suggestions remain accurate before we should clear the cache and retrieve the suggestions again? It turns out that the amount of time that a noun type’s suggestions are accurate is highly dependent on the particular noun type in question. The suggestions from the tab noun type, which checks to see if the input matches a currently open tab, may only be accurate for a few minutes or less. The suggestions from the address noun type, which checks a maps service to see if the input matches a known location, may be accurate for at least a few weeks. The suggestions from the percentage noun type, which suggests a percentage value based on the input, would remain accurate so long as the percentage noun type code isn’t modified (which could be months or longer).

It’s clear that there is no single flushing time that is appropriate for all noun types. So how do we handle deciding when to flush the suggestions from each noun type? We have proposed two possible solutions:

  1. Have each noun type contain a cacheTime property which tells Ubiquity how long its suggestions are accurate for. This is a good way to get a general approximation for when to flush each noun type, but it’s far from perfect. For the case of the tab noun type, there is a lot of variance in the length of time that a tab may be open, so it seems wrong to declare some single value for how often those suggestions should be refreshed.
  2. Use an event-based model, in which we would add event listeners to events that signify a need to flush a particular noun type’s cache. For the case of the tab noun type, we can clear that cache when tabs are opened and closed. This works really well for the case of tabs, but my concern is that the event-based model may not be robust enough to encompass all of our noun types. For example, the restaurant noun type queries Yelp to figure out if your input is a known restaurant. How will we know when the Yelp database has been updated, thus making old Yelp suggestions no longer accurate? I’m not sure how we would attach an event listener in situations like these.

So perhaps the correct solution is a combination of noun-type-specific cacheTime properties and event-based flushing? Or are there other possibilities we have not yet thought of?
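To make the combined option concrete, here is a rough sketch of a cache that honours a per-noun-type cacheTime and also exposes a flush hook for event listeners. Only the cacheTime property name comes from the proposal above; the class, its methods, the millisecond unit, and the injectable clock are assumptions for illustration.

```javascript
class NounTypeCache {
  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // noun type name -> Map(argument -> entry)
  }

  // nounType: hypothetical { name, cacheTime (ms), suggest(argument) }
  get(nounType, argument) {
    const perType = this.entries.get(nounType.name);
    const hit = perType && perType.get(argument);
    if (hit && this.now() - hit.storedAt < nounType.cacheTime) {
      return hit.suggestions; // still fresh according to cacheTime
    }
    const suggestions = nounType.suggest(argument);
    if (!this.entries.has(nounType.name)) {
      this.entries.set(nounType.name, new Map());
    }
    this.entries.get(nounType.name)
      .set(argument, { suggestions, storedAt: this.now() });
    return suggestions;
  }

  // Event-based flushing: call this from whatever listener signals staleness.
  flush(nounTypeName) {
    this.entries.delete(nounTypeName);
  }
}
```

In this sketch a tab noun type could pair a short cacheTime with a call to `flush("tab")` from tab open/close listeners, while a Yelp-backed noun type, which has no event to listen for, would rely on cacheTime alone.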

Thoughts?

We welcome any thoughts you may have on this issue and the possible solutions we have outlined above.



Nountype Quirks: Day 2

mitcho

July 30, 2009

2:44 pm

Today I’m continuing the process of reviewing and tweaking all of the nountypes built into Ubiquity. For a more respectable introduction to this endeavor, please read my blog post from a couple days ago, Judging Noun Types, and my status update from yesterday, Nountype Quirks: Day 1.

Note: this blog post includes a number of graphs using HTML/CSS formatting. If you are reading this article through a feed reader or planet, I invite you to read it on my site.

noun_type_twitter_user

Let’s begin again by considering the suggestions and scores that a variety of different inputs to this nountype return and see what quirks we find.

To test this nountype, I made sure I had logged into Twitter once with the login mitchoyoshitaka.

input | suggestion | score
mitcho | mitchoyoshitaka | 0.85
 | mitcho | 0.5
mitchoyoshi | mitchoyoshitaka | 0.94
 | mitcho | 0.5
test | test | 0.5
テスト | none |
hello world | none |
@test | none |

As nountypes go, this is looking pretty good. For usernames which look like logins we’ve saved before, we’re using matchScore to get decent differential scores.[1] It’s even ruling out impossible twitter username strings, according to Twitter’s own restriction:

[Image: twitter-usernames.png]

One possible improvement we could make is to let @ strings be accepted. I went ahead and made this improvement. The initial @ will be stripped off and then will be checked as normal, but the final score will receive a slight boost using an nth root formula. The twitter command was also updated to deal with inputs with and without the initial @.

input | suggestion | score
mitcho | mitchoyoshitaka | 0.85
 | mitcho | 0.5
@mitcho | @mitchoyoshitaka | 0.88
 | @mitcho | 0.57
test | test | 0.5
@test | @test | 0.57
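The @ handling can be sketched like this. The exponent of 0.8 is a guess: it happens to reproduce the boosted scores in the table above (0.85 → 0.88 and 0.5 → 0.57), but the post does not state the actual root used. `suggestPlain` stands in for the existing username lookup and is hypothetical.

```javascript
// ASSUMPTION: raising a score in (0, 1) to the power 0.8 (a 1.25th root)
// gives the "slight boost"; the real exponent is not given in the post.
const AT_BOOST_EXPONENT = 0.8;

function boostForAt(score) {
  return Math.pow(score, AT_BOOST_EXPONENT);
}

// suggestPlain: hypothetical function mapping a bare username to
// [{ text, score }, ...] using the normal (non-@) logic.
function suggestTwitterUser(input, suggestPlain) {
  const hadAt = input.startsWith("@");
  const name = hadAt ? input.slice(1) : input; // strip the initial @
  return suggestPlain(name).map(({ text, score }) =>
    hadAt
      ? { text: "@" + text, score: boostForAt(score) }
      : { text, score });
}
```

Because the boost is a root rather than an added constant, scores stay below 1 and a score of exactly 1 is left unchanged.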

Although the noun_type_twitter_user nountype is currently most used by the built-in twitter command to specify the user’s username, in theory it could also be used for example in a command which pulls up another user’s tweets. With that in mind, perhaps in the future we could check the browser history and/or bookmarks for entries of the form http://twitter.com/... and suggest those as well (trac #846).

noun_type_number

input | suggestion | score
text | none |
0.5 | 0.5 | 1
0.5.1 | none |

This nountype has an incredibly simple job and does it with ease. I’m going to leave it alone.

noun_type_date and noun_type_time

noun_type_date and noun_type_time both use the magical Date.parse method to parse date- and time-like strings. Let’s first take a look at some of its suggestions:

input | date suggestion | time suggestion | score
June 8th 5pm | 2009-06-08 | 05:00 PM | 1
5pm | 2009-07-30 | 05:00 PM | 1
5 | 2009-07-05 | 12:00 AM | 1
June 8th | 2009-06-08 | 12:00 AM | 1
today | 2009-07-30 | 12:00 AM | 1
now | 2009-07-30 | 02:40 PM | 1
5pm is a good time | none | none |

The quirks in these outputs can be summed up into these two factors:

  1. There is no differential scoring at all.
  2. Both nountypes parse the input with Date.parse and then just spit out the date or time components of the result. Thus time-only inputs get the default date and date-only inputs get the default time with equal scores.

I just rewrote both nountypes and also added a new noun_type_date_time. Here are some of the features of the new implementation:

  1. If the input only contains digits and spaces, it is marked down.
  2. With the exception of the outputs ‘today’ and ‘now’, if the resulting Date object’s date is today, its date suggestion is scored lower; likewise, if its time is the default value, “12:00 AM”, its time suggestion is scored lower.
  3. Inputs (with the exception of ‘today’ and ‘now’) which are shorter than the output string get a slight penalty. This factor reflects the intuition that a longer output than input means some generic information was added and thus there is less confidence in the output.

Here’s what some of the inputs give now:

input | suggestion | score
June 8th 5pm | date: 2009-06-08 | 0.7
 | time: 05:00 PM | 0.7
 | date_time: 2009-06-08 05:00 PM | 0.86
5pm | date: 2009-07-30 | 0.27
 | time: 05:00 PM | 0.81
 | date_time: 2009-07-30 05:00 PM | 0.49
5 | date: 2009-07-05 | 0.53
 | time: 12:00 AM | 0.19
 | date_time: 2009-07-05 12:00 AM | 0.34
June 8th | date: 2009-06-08 | 0.95
 | time: 12:00 AM | 0.35
 | date_time: 2009-06-08 12:00 AM | 0.58
today | date: 2009-07-30 | 1
 | time: 12:00 AM | 0.45
 | date_time: 2009-06-08 12:00 AM | 0.7
now | date: 2009-07-30 | 0.7
 | time: 12:00 AM | 1
 | date_time: 2009-06-08 04:34 PM | 1

In addition, looking to the future we’d like to make nountypes localizable as well, and these two nountypes in particular will surely require some good thinking and planning to make localizable.

noun_type_email and noun_type_contact

noun_type_email and noun_type_contact are two closely related nountypes. noun_type_email simply validates email address-looking strings, while noun_type_contact will return the noun_type_email suggestions and additionally return contacts from GMail if available.

The first thing to note is that I’ve often found the GMail contact lookup to be finicky in my own use. Reading through the code, I discovered the solution: GMail must either be open in a tab or you must use the “stay signed in” option and close the GMail tab.[2] With this mystery solved, and some code cleanup done to this contact fetching, let’s take a look at some example suggestions: (suggestions overlapping with noun_type_email are not listed here)

input | suggestion | score
aza@m | aza@mozilla.com | 0.42
jono | jdicarlo@mozilla.com | 0.28
jdicarlo | jdicarlo@mozilla.com | 0.19

In general, we see that these scores all look pretty poor. In particular, though, note that the “jono” input yielded a higher score for the same suggestion than “jdicarlo”, even though “jdicarlo” is longer and thus, intuitively, has more informational content and should maybe do better. Digging into the code I realized why this is. It was computing the scores by comparing “jono” and “jdicarlo” not simply to “Jono DiCarlo” and “jdicarlo@mozilla.com” respectively, but to the combined string “Jono DiCarlo <jdicarlo@mozilla.com>”. I changed this so that the email address and name are each analyzed individually and, due to the way nountype detection works in Parser 2, no duplicates are returned. Here are the updated results:

input | suggestion | score
jono | jdicarlo@mozilla.com | 0.83
jdicarlo | jdicarlo@mozilla.com | 0.85

That’s much better!

Now let’s consider the suggestions from noun_type_email. Here is what they originally looked like:

input | suggestion | score
bpung | none |
bpung@m | bpung@m | 1
bpung@mozilla.com | bpung@mozilla.com | 1

noun_type_email is based on a very robust regular expression for RFC 2822. Unfortunately this means that it completely rules out strings such as “bpung” which could be a proper prefix of an email address—something that I’ve advocated for avoiding before (see footnote 2 of Judging Noun Types). Moreover, due to a quirk of how nountypes based on regular expressions are scored, all results are given the score of 1.

I just committed a change so that this behavior is improved. The new version accepts strings which match the username part of the email address spec sans @ and domain, but with a great score penalty.[3] Moreover, domains which do not have a final label (the top level domain) with more than one letter (unless it’s an IP address) or do not have any periods (.) in the domain will be penalized as well. Here’s what the same inputs produce now:

input | suggestion | score
bpung | bpung | 0.3
bpung@m | bpung@m | 0.8
bpung@mozilla.com | bpung@mozilla.com | 1
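The shape of this scoring can be sketched as follows. The patterns here are deliberately simplified stand-ins, not the RFC 2822 expression the nountype actually uses, and the single 0.8 domain penalty is an assumption chosen to match the “bpung@m” row above.

```javascript
// ASSUMPTION: simplified patterns for illustration, NOT the RFC 2822 regex.
const LOCAL_PART = /^[A-Za-z0-9._%+-]+$/;
const DOMAIN = /^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*$/;
const IPV4 = /^\d{1,3}(\.\d{1,3}){3}$/;

function suggestEmail(input) {
  const at = input.indexOf("@");
  if (at === -1) {
    // Username only: accept as a possible prefix, with a heavy penalty.
    return LOCAL_PART.test(input) ? [{ text: input, score: 0.3 }] : [];
  }
  const local = input.slice(0, at);
  const domain = input.slice(at + 1);
  if (!LOCAL_PART.test(local) || !DOMAIN.test(domain)) return [];

  let score = 1;
  if (!IPV4.test(domain)) {
    const labels = domain.split(".");
    const tld = labels[labels.length - 1];
    // Penalize domains with no period or with a one-letter final label.
    if (labels.length < 2 || tld.length < 2) score *= 0.8;
  }
  return [{ text: input, score }];
}
```

Under these assumptions the three inputs in the table come out at 0.3, 0.8 and 1, while something like “hello world” is still rejected outright.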

Same time, same channel

I hope this post sheds light on the many changes I made together as well as the underlying thought process. If you don’t agree with any particular fix or analysis, please comment! I’ll be back again tomorrow with another installment of Nountype Quirks. Stay tuned!


  1. Again, matchScore will be the subject of another blog post in the near future. 

  2. Moreover, due to the way noun_type_contact caches the contact list internally, as long as GMail’s contacts are available once, you should be able to continue accessing those contacts’ suggestions after logging out of GMail. There are also great performance benefits to this caching. The downside is that we currently have no way to know when to clear the cache, so even if you update your contacts in GMail, those new contacts won’t appear in Ubiquity until you restart Firefox. 

  3. Perhaps this is a horrible idea, because if executed or previewed, any verb which uses these nountypes would have to deal with arguments which are not valid email addresses. In my mind, though, as long as it doesn’t actually cause any error, this should be okay. Keep in mind that, given the very low scores given to these suggestions, parses using it would most likely only show up if the verb which requires these nountypes was explicitly given and there are other arguments as well, for example in input like “email hello to bpung”. In such a situation, we would rather this suggestion not disappear until we type “@m”. If executed, the built-in email verb, for instance, will deal with this gracefully by simply putting the incomplete email address in the To field. 



Nountype Quirks: Day 1

mitcho

July 29, 2009

3:00 pm

Today I began the process of going through all of the nountypes built-in to Ubiquity using the principles and criteria I laid out yesterday—a task I’ve had in planning for a while now. As I explained yesterday, improved suggestions and scoring from the built-in nountypes could directly translate to better and smarter suggestions, resulting in a better experience for all users. Here I’ll document some of the nountype quirks I’ve discovered so far and what remedy has been implemented or is planned.

Note: this blog post includes a number of graphs using HTML/CSS formatting. If you are reading this article through a feed reader or planet, I invite you to read it on my site.

noun_type_percentage

Here’s what a few different inputs originally returned:

input | suggestion | score
20 | 20% | 1
20% | 20% | 1
0.2 | 20% | 1
0.2% | 20% | 1
20.0 | 2000% | 1
2 hens in the garden | 2% | 1

Let me highlight a couple obvious quirks:

  1. In certain cases, where the numerical expression includes a decimal and is less than one, it is interpreted as a proportional, rather than percent, value, e.g. “0.2” → “20%”. “0.2%” is not even an option. This is the case even when explicitly adding a % sign.
  2. All suggestions, including those where the numeral was extracted from a long string of text (e.g. “2 hens in the garden”), get the same score of 1.

I just committed a fix so noun_type_percentage now…

  1. Counts the number of characters in the input which match [\d.%] and caps the score by (number of acceptable characters)/(length of input).
  2. Strings which do not include “%” get a 10% penalty.
  3. In the case of decimals less than 1 without a % sign, the proportion interpretation is also suggested (e.g. “0.2” → “20%”) in addition to the original suggestion (“0.2%”), but with a slight penalty.

Here is what they now return:

input | suggestion | score
20 | 20% | 0.9
20% | 20% | 1
0.2 | 0.2% | 0.9
 | 20% | 0.81
0.2% | 0.2% | 1
20.0 | 20% | 0.9
2 hens in the garden | 2% | 0.05
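Here is a rough sketch of the three rules, written so that it reproduces the scores in the table above. The function name and number-extraction regex are assumptions, and the 10% size of the “slight penalty” on the proportion reading is inferred from the 0.81 in the table rather than stated in the post.

```javascript
function suggestPercentage(input) {
  const match = input.match(/\d*\.?\d+/); // first number-like run
  if (!match) return [];
  const number = parseFloat(match[0]);

  // Rule 1: cap the score by the share of characters matching [\d.%].
  const acceptable = (input.match(/[\d.%]/g) || []).length;
  const cap = acceptable / input.length;

  // Rule 2: inputs without a % sign get a 10% penalty.
  const hasPercent = input.includes("%");
  const score = Math.min(hasPercent ? 1 : 0.9, cap);

  const suggestions = [{ text: number + "%", score }];

  // Rule 3: a bare decimal below 1 may also be meant as a proportion.
  if (!hasPercent && number < 1 && match[0].includes(".")) {
    // toPrecision avoids float noise such as 0.2 * 100 = 20.000000000000004.
    const asPercent = Number((number * 100).toPrecision(12));
    // ASSUMPTION: the "slight penalty" is another 10%.
    suggestions.push({ text: asPercent + "%", score: score * 0.9 });
  }
  return suggestions;
}
```

The cap is what sinks “2 hens in the garden”: only 1 of its 20 characters is acceptable, so its score cannot exceed 0.05.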

noun_type_tag

Here’s what a few different inputs originally returned. Keep in mind that currently in this test profile, the preexisting tags are “animal”, “help”, “test”, and “ubiquity”.

input | suggestion | score
animal | animal | 0.3
mineral | mineral | 0.3
anim | animal | 0.7
 | anim | 0.3
help, test, ubiq | help,test,ubiquity | 0.7
 | help,test,ubiq | 0.3
google, yahoo, ubiq | google,yahoo,ubiquity | 0.7
 | google,yahoo,ubiq | 0.3
google, , yahoo | google,yahoo | 0.3

Here are a few of noun_type_tag’s quirks:

  1. There are only two scores ever given out: 0.3 and 0.7.
  2. Only the last tag in the list and whether it exists or not is taken into account.
  3. When the last tag is incomplete, the completion is suggested with a higher score, but if the last tag is exactly equal to an existing tag, it gets the lower score.

Ideally, we want noun_type_tag to look at each of the tags given to it, with higher scores for when there are more preexisting tags and fewer new ones. Keep in mind, though, that we only have to suggest the completion of the very last tag as that may be one where the user hasn’t completed typing yet… for earlier tags, we can assume (safely or not) that the user placed the comma where they meant to. We can’t teach Ubiquity to read minds, after all.[1]

With this in mind, I just made a change to noun_type_tag which aims to follow these principles. The basic idea is that we start with a base score of 0.3 but then raise it via nth root for every tag in the sequence which is preexisting. Here’s what the same inputs return now. Recall that the preexisting tags are “animal”, “help”, “test”, and “ubiquity”.

input | suggestion | score
animal | animal | 0.55
mineral | mineral | 0.3
anim | animal | 0.55
 | anim | 0.3
help, test, ubiq | help,test,ubiquity | 0.86
 | help,test,ubiq | 0.74
google, yahoo, ubiq | google,yahoo,ubiquity | 0.55
 | google,yahoo,ubiq | 0.3
google, , yahoo | google,yahoo | 0.3
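One way to read “raise it via nth root for every preexisting tag” is a square root per preexisting tag, which reproduces the scores in the table above (0.3 → 0.55 → 0.74 → 0.86). That reading is an inference from the numbers, not something the post spells out, and the function names below are made up for the sketch.

```javascript
const BASE_SCORE = 0.3;

// One square root per preexisting tag: 0.3^(1/2), 0.3^(1/4), 0.3^(1/8), ...
function scoreTags(tags, existing) {
  const known = tags.filter((tag) => existing.includes(tag)).length;
  return Math.pow(BASE_SCORE, 1 / Math.pow(2, known));
}

function suggestTags(input, existing) {
  // Split on commas and drop empty entries such as the one in "google, , yahoo".
  const tags = input.split(",").map((t) => t.trim()).filter((t) => t !== "");
  if (tags.length === 0) return [];

  const last = tags[tags.length - 1];
  const head = tags.slice(0, -1);

  // Always offer the input as typed; only the last tag gets completions.
  const candidates = [tags];
  for (const tag of existing) {
    if (tag !== last && tag.startsWith(last)) candidates.push([...head, tag]);
  }

  return candidates
    .map((c) => ({ text: c.join(","), score: scoreTags(c, existing) }))
    .sort((a, b) => b.score - a.score);
}
```

Unlike the original behaviour, an exact match such as “animal” now scores as well as a completed “anim”, and every preexisting tag in the list contributes, not just the last one.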

noun_type_awesomebar

input | suggestion | score
moz | http://www.mozilla.com/ | 0.8
 | https://wiki.mozilla.org/Labs/Ubiquity/Parser_2_API_Conversion_Tutorial | 0.8
 | http://en-us.start3.mozilla.com/firefox?client=firefox-a&rls=org.mozilla:en-US:official | 0.8
 | http://en-us.www.mozilla.com/en-US/firefox/about/ | 0.8

There are a couple quirks here:

  1. All suggestions are returned with the same scores.
  2. The nountype returns the URL of the entry as the HTML-formatted result and the title as the text-formatted result, which clearly does not make sense. However, it’s not clear to me whether the title, URL, or some combination of both is what we should be returning as the suggestion text presented to the user.[2]

I just rewrote noun_type_awesomebar to actually do some differential scoring. This new version also presents the URL or title depending on whichever had a better match using the matchScore function.[3]

input | suggestion | score
moz | www.mozilla.com | 0.7
 | https://wiki.mozilla.org/Labs/Ubiquity/Parser_2_API_Conversion_Tutorial | 0.63
 | http://en-us.start3.mozilla.com/firefox?client=firefox-a&rls=org.mozilla:en-US:official | 0.61
 | http://en-us.www.mozilla.com/en-US/firefox/about/ | 0.6

noun_type_url

The purpose of noun_type_url’s suggest function is two-fold: first, to accept strings which may look like a URL and, second, to suggest URLs from the history just like noun_type_awesomebar, but only based on URL matches and not title matches.[4] Here are a few sample inputs:

input | suggestion | score
moz | http://www.mozilla.com/ | 0.9
 | http://moz | 0.5
 | https://wiki.mozilla.org/Labs/Ubiquity/Parser_2_API_Conversion_Tutorial | 0.9
 | http://en-us.start3.mozilla.com/firefox?client=firefox-a&rls=org.mozilla:en-US:official | 0.9
 | http://en-us.www.mozilla.com/en-US/firefox/about/ | 0.9
test | http://test | 0.5
http:// | http:// | 0.5
http: | http: | 0.5
http | http | 0.5
_test | http://_test | 0.5
hello world! | http://hello world! | 0.5

Oh, where to begin!? Here are some initial quirks… it’s possible that you could think of more!

  1. There is no differential scoring… only 0.9 for suggestions from history and 0.5 for URL-like strings.
  2. A number of invalid domain names are being accepted and turned into suggestions (“hello world!”, “_test”, etc.).
  3. It’s trying to be smart by suggesting “http://” as a default URI scheme but doing so even for prefixes (initial substrings) of the word “http” itself.

With these thoughts in mind, I just took a first stab at improving this situation. Here are some features of the new implementation:

  1. History entries are scored in the same way as in noun_type_awesomebar, using matchScore.
  2. URLs without an explicit URI scheme (like “http://”) get a 10% penalty.
  3. “http://” is only suggested if none of a long list of common URI schemes is detected.
  4. It repairs schemes which are missing a slash or two, suggesting for example “http:hello.com” → “http://hello.com”.
  5. It actually uses Firefox’s own IDNService to check if the domain name is a valid internationalized domain name. If it’s an IDN as opposed to LDH (“letters, digits, and hyphens”), it gets a 10% penalty. If it’s not even a valid IDN, it is ruled out (see last two example inputs below).
  6. There are also penalties for only being a domain name with no path and for the domain not having any periods (.) in it.
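Items 2 through 4 of this list can be sketched together as below. The scheme list is a short illustrative stand-in for the “long list” mentioned above, and the function name and return shape are hypothetical; history matching, the IDN check, and the domain penalties are left out.

```javascript
// ASSUMPTION: a tiny sample of schemes; the real list is much longer.
const KNOWN_SCHEMES = ["http", "https", "shttp", "ftp", "file"];

function normalizeUrlInput(input) {
  // A scheme name, a colon, then zero to two slashes, then the rest.
  const m = input.match(/^([a-z][a-z0-9+.-]*):\/{0,2}(.*)$/i);
  if (m && KNOWN_SCHEMES.includes(m[1].toLowerCase())) {
    // Repair missing slashes: "http:hello.com" -> "http://hello.com".
    return { url: m[1] + "://" + m[2], score: 1 };
  }
  // No recognized scheme: default to http:// with a 10% penalty.
  return { url: "http://" + input, score: 0.9 };
}
```

Checking the scheme against a known list keeps inputs like “localhost:8080” from being misread as having a scheme called “localhost”.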

Here is what our suggestions now look like:

input | suggestion | score
moz | http://www.mozilla.com/ | 0.6
 | http://moz | 0.65
 | https://wiki.mozilla.org/Labs/Ubiquity/Parser_2_API_Conversion_Tutorial | 0.63
 | http://en-us.start3.mozilla.com/firefox?client=firefox-a&rls=org.mozilla:en-US:official | 0.61
 | http://en-us.www.mozilla.com/en-US/firefox/about/ | 0.6
test | http://test | 0.65
http:// | http:// | 1
 | shttp:// | 0.75
http: | http:// | 0.9
 | shttp:// | 0.7
http | http:// | 0.72
 | https:// | 0.71
 | shttp:// | 0.68
 | http://http | 0.65
_test | none |
hello world! | none |

See you tomorrow~

Alright, enough nountype wrangling for one day. I’ll be back again tomorrow for another installment.


  1. If we could make assumptions about what tags look like, for example that they are always pretty short, or use certain character classes, we could use such factors as well to judge non-preexisting tags for “tagginess” but unfortunately it’s possible (though unlikely) that a user would prefer really long tag strings and of course Firefox allows tags in any unicode code range. The only strings we can immediately rule out as impossible are ones which are purely whitespace. 

  2. It’s actually unclear whether the method we’re using (nsIAutoCompleteSearch) is actually searching titles or not… it currently looks like it’s only looking at the URLs. Perhaps the title query is what we’re supposed to enter in the mystery parameter

  3. I hope to discuss the matchScore function in a separate blog post later. 

  4. While writing up this section I ran into a bug whereby when both noun_type_awesomebar and noun_type_url are active, only one of their async callbacks from Utils.history.search are returned. Thus, if lucky, only one of the nountypes will return the history results and if unlucky the parse query will not complete. Filed as trac #845



Weave 0.5 Released

mconnor

1:04 pm

Weave Sync is a prototype that encrypts and securely synchronizes the Firefox experience across multiple browsers, so that your desktop, laptop and mobile phone can all work together. It is part of the Weave project, which aims to integrate services more closely with the browser.

Major Features

What is Weave Sync all about? In short, Weave Sync lets you securely take your Firefox experience with you to all your Firefox browsers — including our mobile browser, codenamed Fennec. It currently supports continuous synchronization of your bookmarks, browsing history, saved passwords and tabs. For example:

  • Get the same results on the Smart Location Bar on each of your Firefox browsers, so you can get to your favorite sites with just a few keystrokes
  • Continue what you were doing: have the ability to open any tab you have open on any of your Firefox browsers
  • Keep the same list of bookmarks on all of your Firefox browsers
  • Easily sign in to all your favorite sites using your saved passwords (this is especially handy on mobile phones, where it’s hard to type in complex passwords)
  • Do it all securely: Weave Sync encrypts user data before uploading it to Mozilla’s servers, so that only you can access your data

What’s new in 0.5?

If you have not looked at Weave recently, now is a great time to jump in and try it out!  In this release we’ve made a bunch of improvements in terms of reliability and performance.  A few of the major changes are:

  • Major performance improvements during upload and download
  • Sync waits until you’re not actively using the browser
  • Improved support for bookmark tags and smart folders
  • Support for changing passwords and passphrases
  • Support for Fennec on Windows Mobile and Firefox on x86 OpenSolaris
  • Better error handling and reporting

Getting Involved with Testing and Development

– Mike Connor, on behalf of the Weave development team


Flexible Membranes and Catch-alls in JavaScript

Atul

10:19 am

One of the recurring issues that the Mozilla platform team has to contend with is how to allow trusted, privileged JavaScript code to interact with untrusted JavaScript code. Google’s Caja team actually has to deal with a very similar problem, albeit at a different layer in the technology stack.

This issue is quite subtle, and fully explaining it is beyond the scope of this blog post. If you know JavaScript, I recommend checking out the Caja Specification, which nicely lays out the problems inherent in running code with different trust levels in the same environment.

Firefox has to deal with this issue because much of it is actually written in JavaScript. Developers call the JS that powers Firefox chrome JavaScript: it has the ability to write to the filesystem, launch other programs on your computer, and pretty much anything that Firefox itself can do. The code that runs in web pages, on the other hand, is called content JavaScript. Chrome and content JS can interact with each other securely thanks to XPConnect wrappers: little layers of code that “wrap” objects and mediate access between them and the outside world. The self-proclaimed WrapMaster and implementer of most of these wrappers is Blake Kaplan, known in some circles as “Mr. B-Kap” (mrbkap).

Google Caja’s team also needs the same kind of functionality, but at a different level: they need to make it possible for web pages themselves to run code that they don’t trust, which is useful when creating plug-in frameworks for web applications. The Caja team calls wrappers membranes, a word which I find more intuitive than “wrappers” because it’s not an overloaded term in computer science and because its biological definition closely matches that of its CS counterpart.

As I wrote in Jetpack: Summer 2009 State of Security, Part 1, the boundary between trusted and untrusted code has been of some concern to the Jetpack project. Unfortunately, all the XPConnect wrappers currently in Firefox have very specific purposes: for instance, most of them are made expressly to prevent omnipotent chrome code from being exploited by impotent content code. Jetpack’s needs are unique in that a Jetpack feature should be neither as omnipotent as Firefox, nor as impotent as a web page: ideally, we should follow the principle of least privilege and give it the minimum set of capabilities it needs to do its task, and no more.

After talking with the Firefox JS and Google Caja teams, we decided that wrappers were the right kind of solution to Jetpack’s security challenges. The problem was, though, that all of Firefox’s wrappers are in C++, which made them hard to experiment with. Jetpack is, after all, a Labs project, and as such, we needed a sort of “flexible membrane” whose security characteristics we could easily change as the platform evolved. So we decided to expose some functionality to chrome JavaScript that’s traditionally only available to C/C++ code.

One nice aspect of the flexible membranes we’ve created is that they’re useful for more than just prototyping membranes: they effectively allow chrome JS to create objects with characteristics that the JavaScript language doesn’t traditionally make room for, like catch-alls for object properties. Python programmers know of these by names like __getattr__ and __setattr__, and many other dynamic languages have them, but JavaScript doesn’t—yet something like them is needed to implement basic Web APIs like HTML5 localStorage. In other words, these flexible membranes should make it easy for us to develop nicer APIs for Jetpack.
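To give a feel for what a property catch-all looks like, here is a rough sketch. It does not use the Jetpack binary component, whose API isn't shown in this post; instead it assumes the standard Proxy object that later versions of JavaScript (ES2015 onward) provide. The names makeCatchAll, onGet and onSet are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: returns an object whose property reads and writes
// all go through one pair of catch-all handlers, roughly like Python's
// __getattr__ and __setattr__.
function makeCatchAll(onGet, onSet) {
  return new Proxy({}, {
    get(target, name) {
      return onGet(String(name));
    },
    set(target, name, value) {
      onSet(String(name), value);
      return true; // report the assignment as successful
    },
  });
}

// Example: a tiny localStorage-like object that keeps every value as a string.
const backing = new Map();
const storage = makeCatchAll(
  (name) => (backing.has(name) ? backing.get(name) : null),
  (name, value) => { backing.set(name, String(value)); }
);

storage.visits = 3;          // routed through the set handler
console.log(storage.visits); // routed through the get handler
```

A membrane would build on the same idea: the handlers decide, property by property, what the code on the other side is allowed to see or change.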

If you’re interested in digging into these flexible membranes, check out our Binary Components documentation on the wiki. And feel free to take the pre-compiled component from our HG repository and use it in your own Firefox extensions.


Tweeting your Blog with HookPress

Abi Raja

9:27 am

I always tweet about my blogposts. What if that happened automatically whenever I published a new post?

Get the script here, paste it into a new Scriptlet, change the parameters at the top, install HookPress, give it your script’s URL, and pick the post_date_gmt, post_modified_gmt, post_title, and post_url fields. It’s actually pretty straightforward and quick.

There’s one catch, however: your post’s URL probably won’t be right (unless you edit the code) because HookPress doesn’t have the ability to POST post_url yet. But Mitcho, Ubiquity Chief Linguist and HookPress instigator, is working on it as we speak, so expect that to be fixed pretty soon.

Update: This issue has been fixed! Everything should work perfectly now.
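The linked script itself isn't reproduced in this post, so what follows is only a hypothetical sketch of the formatting step such a script needs: turning the fields HookPress sends into tweet text. It assumes the fields arrive as a form-encoded POST body and that the limit is 140 characters; the function name buildTweet and the “New blog post:” prefix are made up for illustration.

```javascript
// Hypothetical: build tweet text from a form-encoded webhook body.
// Assumes the body carries the post_title and post_url fields.
function buildTweet(body, maxLength = 140) {
  const fields = new URLSearchParams(body);
  const title = fields.get("post_title") || "";
  const url = fields.get("post_url") || "";
  const prefix = "New blog post: ";
  // Room left for the title after the prefix, one space, and the URL.
  const room = Math.max(maxLength - prefix.length - url.length - 1, 0);
  const shortTitle = title.length > room ? title.slice(0, room) : title;
  return `${prefix}${shortTitle} ${url}`.trim();
}

console.log(buildTweet("post_title=Hello+World&post_url=http%3A%2F%2Fexample.com%2Fhello"));
```

Actually sending the text to Twitter is left out, since that depends on credentials and on whatever the real script does.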

Webhooks are awesome and they could be so much awesomer! But that’s it for now. I’ll be writing more about my thoughts on webhooks in the next few weeks.


Experimental Learning vs. Experimental Learning

Abimanyu Raja

3:26 am

You can read the post here on the new blog.

Seriously, you should just subscribe to my new blog and avoid getting stupid link-only posts from this feed.


Judging Noun Types

mitcho

July 28, 2009

10:39 pm

Introduction

Different arguments are classified into different kinds of nouns in Ubiquity using noun types.1 For example, a string like “Spanish” could be construed as a language, while “14.3” should not be. These kinds of relations are then used by the parser to introduce, for example, language-related verbs (like translate) using the former argument, and number-related verbs (like zoom or calculate) based on the latter. Ubiquity nountypes aren’t exclusive—a single string can count as valid for a number of different nountypes and in particular the “arbitrary text” nountype (noun_arb_text) will always accept any string given.

In addition to the various built-in nountypes, Ubiquity lets command authors write their own nountypes as well.

The functions of a noun type

Nountypes have two functions: the first is accepting input and offering suggestions, and the second is scoring.

Accepting and suggesting

Nountypes don’t just have to accept the exact string they were given: they can also return suggestions which are based on that input. For example, noun_type_language can take the input “span” and return “Spanish.” A nountype can return multiple suggestions which may or may not include the trivial suggestion, i.e. the original input as is. If there is no way that the input could possibly be part of an accepted value, it should return no suggestions, i.e. [].2
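Here is a minimal sketch of that accept-and-suggest behavior. It is not Ubiquity’s actual nountype API: the function name suggestLanguage and the short language list are assumptions made for illustration, and it only matches prefixes, which is simpler than what a real nountype might do.

```javascript
// Hypothetical list of accepted values for a language-like nountype.
const LANGUAGES = ["Danish", "Dutch", "French", "Italian", "Spanish"];

// Returns every language whose name starts with the input (case-insensitive).
// Input that cannot be the start of any accepted value yields [].
function suggestLanguage(input) {
  const needle = input.trim().toLowerCase();
  if (needle === "") return [];
  return LANGUAGES.filter((name) => name.toLowerCase().startsWith(needle));
}

console.log(suggestLanguage("span")); // one suggestion: Spanish
console.log(suggestLanguage("d"));    // two suggestions: Danish and Dutch
console.log(suggestLanguage("14.3")); // no suggestions
```

Note that “d” returns more than one suggestion and “14.3” returns none, which is how a parser could tell that a language-related verb is a poor fit for a numeric argument.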

Scoring

Ubiquity 0.5 with Parser 2 introduced the notion of a nountype suggestion score. For example, two different nountypes can accept the same input, but with different scores. Scores range from 0 to 1 where 1 is a perfect or exact suggestion and 0.1 or so is a very very improbable suggestion.3 These scores are used in the scoring of parses. Because verbs specify certain nountypes for each of their arguments, the scores that individual nountypes return for each argument are a crucial component of the scoring algorithm and can even determine whether a parse is returned or not.

With this in mind, you may be tempted to make your nountype return a score of 1 on any input so that your verb ranks highly in the suggestions. While this would work, it would only make your verb annoying and a poor Ubiquity citizen. Appropriate scores must be given to noun suggestions, with higher values reflecting confidence and lower values reflecting imprecision. But how exactly do you figure out what’s an appropriate value?

Judging nountypes with the Nountype Tuner

The Nountype Tuner is a new tool I’ve been building to help both Ubiquity core developers and command authors check their nountypes against others and “tune” their behavior and scores. The Nountype Tuner will take your input, throw it against all of the nountypes referenced in your active verbs, and display the suggestions returned with their scores. You can think of it as the Playpen’s little sister.

tuner.png

The Nountype Tuner can be found at chrome://ubiquity/content/tuner.html, though I am pretty sure it is broken in Ubiquity 0.5 and 0.5.1. It has been fixed now and I will make sure it’s in good shape for 0.5.2.

The heart and soul of the Nountype Tuner is this scale:

tuner-top.png

This scale tells you, in plain English, what different scores represent and correspond to, in two sets of vocabulary: “in terms of a guess” and “in terms of a match.” While still subjective, this scale helps developers judge different input/output pairs and their scores. For example, “lian” → “http://lian” is given 0.5, so it’s an okay guess or a possible match… does that seem right to you? Or “lian” → “Italian” being between “okay” and “good.” Appropriate? We can look at such statements, decide how we feel about them, and tweak if necessary.

Good nountype scores have roots

roots.jpg

CC-BY Aaron Escobar

…not that kind of root, but more like this kind of root… let me explain…

When comparing the scores that individual nountypes return for different inputs, we must compare those scores within the same nountype’s family of suggestions to see if higher scores truly correspond to higher confidence. For example, the language nountype should give the suggestion “French” for both the inputs “f” and “fren,” but the scores of these suggestions should be different: the score of “f” → “French” should be much lower than the score for “fren” → “French,” reflecting the additional informational value. Take the successive prefixes of a single suggestion that all return that same suggestion; we refer to the relation between their scores as the score curve, and in general it should be non-decreasing.4

One could say the most trivial score function then is the linear one. For a series of converging prefixes of the same suggestion (“Dutch”), under a linear approach we could naively let the score be (length of the input)/(length of the suggestion), as below:

The linear model

input    d      du     dut    dutc   dutch
output   Dutch  Dutch  Dutch  Dutch  Dutch
score    0.2    0.4    0.6    0.8    1

This linear model is represented below by the black line.

nth-roots.png

The problem with the linear model is that it treats every keystroke as equally informative, when in fact earlier transitions (additional keystrokes) add more information than later ones. Once we’ve entered “dutc,” after all, we would like to be pretty darn sure that we mean “Dutch,” so the score difference between “dutc” and “dutch” should be less than the score difference between, say, “d” and “du.” We want a score curve that looks more like the solid or dotted red lines above.

For this reason, I strongly advocate incorporating an nth root in the score computation. Nth-root score functions over [0,1] are increasing, and earlier transitions affect the score more than later ones, which is exactly what we’d like to see. (The solid red line above is x^(1/2) and the dotted one is x^(1/3).)5
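The two models can be compared in a few lines of code. This is only a sketch: the function names linearScore and rootScore are assumptions for illustration, not functions from Ubiquity.

```javascript
// Linear model: the fraction of the suggestion that has been typed so far.
function linearScore(input, suggestion) {
  return input.length / suggestion.length;
}

// Nth-root model: the same fraction raised to the power 1/n
// (n = 2 gives the square root, n = 3 the cube root).
function rootScore(input, suggestion, n = 2) {
  return Math.pow(linearScore(input, suggestion), 1 / n);
}

// Print both curves for successive prefixes of "Dutch".
for (const prefix of ["d", "du", "dut", "dutc", "dutch"]) {
  const linear = linearScore(prefix, "Dutch").toFixed(2);
  const rooted = rootScore(prefix, "Dutch").toFixed(2);
  console.log(`${prefix.padEnd(5)}  linear=${linear}  sqrt=${rooted}`);
}
```

With the square root, going from “d” to “du” raises the score by about 0.19, while going from “dutc” to “dutch” raises it by only about 0.11; under the linear model both steps are worth exactly 0.2.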

Conclusion

Properly tuning both the built-in nountypes and custom nountypes is crucial to producing more accurate and relevant parse suggestions. I’ll be using the principles and criteria laid out above, combined with the new Nountype Tuner, to tune the built-in nountypes (trac #746) in the coming days in preparation for our 0.5.2 release. I invite you to use the Nountype Tuner in 0.5.2 to tune your custom nountypes as well.


  1. Or, as I often write them, “nountypes.” 

  2. Note that I didn’t just say “if the input is not an accepted value…” That’s because, based on the left-to-right nature of text input, an argument may later become a valid input of a certain nountype with a few more keystrokes. For example, if we had a URL nountype which accepted “http://mitcho.com” but not “http://mitcho”, any command which took this nountype would not show up in the suggestions while we were typing out “http://mitcho”… but would suddenly appear when we completed the “.com”. The best practice here is to suggest a valid value for the initial “http://mitcho”, like “http://mitcho.com”.
    (In reality, I should have said “initial-to-later nature” to be fair to right-to-left languages, but you get the idea. Speaking of which, serious consideration of Ubiquity in right-to-left languages is long overdue.) 

  3. In reality, due to the way parses are scored and the fact that noun_arb_text accepts anything with score 0.3, a suggestion with score below 0.3 is probably not worth even giving out. Notable exceptions are for custom noun types which are used in commands which take multiple arguments… in these cases, even scores below 0.3 could add up and overtake a noun_arb_text parse, but it’s rare. 

  4. The idea that successively longer inputs should yield successively higher scores only makes sense (1) when they are converging on the same suggestion output and (2) when these are truly suggestions, not just acceptances. For nountypes which accept the input verbatim, suggestion scores need not increase… for example “1” is just as good a “number” as “1234” is, so both of their respective suggestions, “1” and “1234” could be given the same score. 

  5. Unfortunately the Nountype Tuner currently only compares the suggestions of one input across a number of nountypes, not a number of inputs across the same nountype. In the future I’d like to make the Nountype Tuner be able to produce these sorts of score curves as well. 



Experimental Learning vs. Experimental Learning

Abi Raja

July 27, 2009

9:55 pm

The motto of my high school was “Experiment. Explore. Excel”. And that isn’t where this fetish ends; the word “experiment” was everywhere. The school itself was an experiment1 about experiments (I was a member of the first graduating cohort). But therein lies the biggest deficiency of the whole system.

So, we did experiments every day in the chemistry laboratory, in physics class, everywhere. But what were these experiments about? What did we really do? What we did was to read the handout provided by the teacher and follow the instructions word for word (at this point, what I am going to say must already be obvious to you, dear (well-read and intelligent) reader). We even had classes with names like “Hands-on Chemistry I” and “Hands-on Physics II” 2 3 where we spent entire semesters reading 10 sheets of paper and an hour each week stirring mixtures in glass containers (and the question repeatedly pops into my head: did I evolve a huge 0.16-inch-thick cerebral cortex to stir stuff!?). After that, we’d write a lab report, a report that could have been written a week earlier because the results of the actual experiment were completely inconsequential. On occasion, the outcome of the experiment would differ from what was expected. At these times, you’d stare at the dried powder, ask yourself “Is it really yellow or is it almost white?”, conclude that it was indeed yellowish and then inform the teacher. She’d give you a long look like you were showing her methamphetamine4 and proceed to dispose of the evil substance. Never once exploring why it was yellow instead of white! Never exploring which faulty procedure could have led to this state. Never once running a single test to determine, at the very least, its basicity.

Long story short, we repeated experiments that had been done a million times before by a million different people.

The problem, of course, is the word “experiment”. What does it really mean? Here’s what Merriam-Webster lists as one of the definitions ―

ex · per · i · ment ― an operation or procedure carried out under controlled conditions in order to discover an unknown effect or law, to test or establish a hypothesis, or to illustrate a known law.

There are actually two very different definitions within this single one ―

  1. to illustrate a known (effect or) law.
  2. to discover an unknown effect or law.

In schools, these two meanings get conflated and we end up seeing only experiments of the first kind, which are not as enlightening as the second kind and which do not teach students everything that we expect them to learn from doing experiments, like creativity and problem-solving skills.

No doubt there is some value in redoing “experiments”, but that’s only true when students are younger and when there’s no short supply of truly counter-intuitive phenomena that you have to see to believe. Even in the boring and complicated experiments that we conduct as we get older, students often do some real experimentation (particularly in physics) by modifying the parameters and playing with things. But at some point, experiments fail even at illustrating a known effect. That a chemical which is white and another chemical which is also white react together to form something that is greenish does not excite me any longer. To show that s = ut + ½at² is actually (approximately) true, taking into consideration drag and experimental errors, is practically meaningless to a teenager. A logical derivation from more basic principles/formulae is almost always more convincing and insightful.

The natural question here is how do you bring experimental learning of the second kind into schools, where students are able to discover an unknown effect or law5?

One of the best TEDTalks I have seen is about this exact question. Sugata Mitra, a Professor of Educational Technology, started the “Hole in the Wall” experiment, first in a Delhi slum and then in more rural parts of India. In this experiment, he places a computer in a kiosk within a wall and, apart from filming it, that’s all he really does. The interesting bit in the talk below is at 9:26 (YouTube lets you jump to the middle of videos) where a kid who has never before seen a computer (and who thinks it’s an interactive television) learns, by himself, to browse the web in under 8 minutes. It’s fascinating to watch him first figure out that touching the small ball next to the screen moves something on the “television”. Soon, he learns to control the movements of the cursor and then accidentally discovers that clicking produces interesting effects. By the end of the day, 70 kids have figured out the computer. The knowledge propagates not just through each child discovering it, but also through children teaching each other about it. It’s truly experimental learning.

Mitra has a really nice and fitting name for his learning methodology too – Minimally Invasive Education.

This is an extreme example of what experimental learning should be like. But it could happen in classrooms too, under the supervision of teachers. We could structure our classes (tolerance for failure would have to be high) so that people have more time to play. (Some schools get this but they also get too playful so to speak. The last time LEGO was fun for me was when I was 8. Play (clearly, I picked the wrong word here) would have to be very different for children of different age groups. By the time you are 14, drop-out-of-school-and-do-a-startup-worthy web services and desktop software are “play”.) Admittedly, it’s tough to let students experiment in many fields. Should schools allow students to mix radioactive and potentially biohazardous substances that aren’t supposed to be mixed together? Maybe not. Math, computer science and most of physics are much easier to experiment in. All you need is either a pen and a piece of paper or a computer.6

The other thing that schools could do is to get out of the fucking way and give more free time to students. For most of my high school life, I hardly had any time to do anything outside school. There was just so much schoolwork and homework. But in my final semester, the school pushed all our classes to Monday and Tuesday and made us do either a 3-day internship or a research project with a university professor. Since I picked the latter, I only had meetings once or twice a week. I practically had 5-day weekends and, who’d have guessed it, that’s when I did some of the most meaningful things I’ve ever done. I worked on Devo and Ubiquity and a bunch of other personal projects. I started reading novels and nonfiction and blogs again. I had a much better intellectual life than I ever had in school.

To experiment is to discover

I’m not really sure what a school can do to facilitate real experimental learning. I believe that the best role that school can play is the most minimal one – bring together people with similar interests and just let them do whatever they think is fun.

There just might be a more active role for schools but alas, we have gotten hold of the wrong meaning of “experiment”. It’s sad to see millions of dollars in education research wasted and schools around the world fall into the same trap. We continue down this path without realizing how wrong we all are because, to truly experiment is to discover, not repeat.

  1. Apparently, the Singapore government felt that its education system wasn’t producing enough talented people, so like any sufficiently authoritarian (in a loose sense of that word, of course) government blind to its own faults, it figured that the schools were the ones that were bad. And then, they started my school.
  2. As you might guess, creativity is not a job requirement for chief-module-namer here.
  3. “Hands-on learning”, of course, is just another term for “experimental learning”; possibly more accurate, but in practice it’s all lumped together to mean the same thing.
  4. I know, methamphetamine isn’t yellowish-white.
  5. They might, in fact, discover something that someone has already discovered before but there’s great joy and value in the act of discovering.
  6. Arts, on the other hand, are all about play and I suppose that’s why these people are called “creatives”. While we have established that other professions like software developer require a lot of creativity too, this might also be true: chemical engineers don’t need creativity to do their work, and a lot of other people don’t either. One thing is certain – every profession is getting increasingly creative.
