dancing bunnies!

on me, myself, and dancing bunnies

i18n: What's in a name?

I'm not even going to pretend to discuss character sets, Unicode, UTF-8 and all that jazz... Michael Kaplan does so much better than I ever could, and from within the mother ship as well. Still, I have some thoughts on other internationalization issues: how headache-inducing things can get when you want to get it ALL right. If you think this is all way too far-fetched, then you probably:

A) Are a US citizen who hasn't travelled much, and/or
B) Haven't spent years creating CRM software tailored for internationally operating law firms.

Drat. The latter option does somewhat reveal whence I derive my authority, doesn't it? Oh well... nothing for it but to forge ahead now. Well, I could of course remove the flippant A/B thing above and rewrite it into something not mentioning it, but what fun would that be? Sounds like work to me. As opposed to talking about it the way I am now, you mean? Seriously, in the time I am writing this I could have rewritten those two measly lines five times over. Yes, well, but I didn't feel like it, so there.

Anyway, on with the story.

Random aside #1: I personally think issues like these are the reason WinFS was cancelled, since it tried generalized contacts -- exactly where one starts to run into all of these problems.

Random aside #2: For the longest time, I thought "i18n" was an in-joke when people (such as the aforementioned Michael Kaplan) used it online. I shrugged it off and mentally filed it as a synonym, only to discover months later that it simply means i-18-letters-I-am-too-lazy-and/or-dyslexic-to-type-correctly-on-a-continuing-basis-n. What a let-down.

Anyway, on with the story.

What's in a name?

Let's start in the wild, shall we?

Active Directory breaks up names this way[1]:

  • First name
  • Initials
  • Last name
  • Display name

Personally, I think this is a very, very wise approach. First name so you can sort on it, last name so you can sort on it, initials so applications *cough* Outlook *cough* can retrieve them for their own purposes, and after that, bugger all. Free-form, go forth and type it how you want it. If it doesn't display right, it's on you.

Microsoft Office Outlook 2007 Beta 2 Technical Refresh (I swear Microsoft marketers are paid by the letter) does it this way for contacts:

  • Title
  • First name
  • Middle name
  • Last name
  • Suffix

Much to like here: Title and Suffix, while offering defaults, are free-form and thus allow for any number of doctorates and honorary titles. It does not even suffer from a very common handicap of US-created software: ridiculously limited middle name length. My middle names (yes, almost the entire world allows for multiple middle names) are Christiaan Hendrikus, which is big fun on forms and in the staggeringly large proportion of software that allows for only one middle initial. Think that is freakish? Think again. A rather famous Dutch politician[2] has four middle names: Antonius Franciscus Maria Oliva. Think that is freakish? Think again. A rather famous member of the Dutch Royal family[3] has eight: Leopold Frederik Everhard Julius Coert Karel Godfried Pieter. Okay, now that is freakish. Anyway, Outlook is a positive surprise here, with room for no fewer than 127 characters for middle name(s).

By the way, let me point out that Outlook helps you even further.

It has a plain-looking name field, like so:

It then attempts to split this name into the 5 behind-the-scenes fields. And does very nicely, at least for US titles:

is parsed as

Nice touch. You know what I'd love to see? The spec for internationalization on this. Ooooh Molly, that puppy must be a 200 page doozy. As in... how would this work in Mandarin?
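To make the splitting idea concrete, here is a minimal Python sketch of a naive splitter. It is not Outlook's actual algorithm, which this post does not describe; the `KNOWN_TITLES` and `KNOWN_SUFFIXES` tables, the `ParsedName` type, and the `parse_name` function are all made up for illustration, and the tables are deliberately tiny and US-centric.

```python
from dataclasses import dataclass

# Hypothetical lookup tables; a real product would need far larger, localized ones.
KNOWN_TITLES = {"mr.", "mrs.", "ms.", "dr.", "prof."}
KNOWN_SUFFIXES = {"jr.", "sr.", "i", "ii", "iii"}


@dataclass
class ParsedName:
    title: str = ""
    first: str = ""
    middle: str = ""
    last: str = ""
    suffix: str = ""


def parse_name(full_name: str) -> ParsedName:
    """Naively split a Western-order name into five fields."""
    tokens = full_name.replace(",", " ").split()
    parsed = ParsedName()
    # A leading token counts as a title only if it is in the table.
    if tokens and tokens[0].lower() in KNOWN_TITLES:
        parsed.title = tokens.pop(0)
    # A trailing token counts as a suffix only if something else remains.
    if len(tokens) > 1 and tokens[-1].lower() in KNOWN_SUFFIXES:
        parsed.suffix = tokens.pop()
    if tokens:
        parsed.first = tokens.pop(0)
    if tokens:
        parsed.last = tokens.pop()
    # Whatever is left over is treated as the middle name(s).
    parsed.middle = " ".join(tokens)
    return parsed
```

Feed it a single name such as "Shakira" and everything lands in the first-name field. Feed it a name written family-name-first and it quietly gets it wrong, which is the kind of case an internationalized spec would have to spell out.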

Anyway, where were we?

Both models do very well for what they are designed for (account information and basic contact lists, respectively), but break down as soon as you get into serious CRM territory[4]. Think mail merge, people...

  • They miss some of the basics, such as "address as"[5].
  • They do nothing to prevent typos in titles, such as "profesor". Think that does not happen? Think no business is lost because of addressing a PhD as a PoD? Think again.
  • They do not allow for different expressions of titles in different situations, such as "To: Prof. J. M. Shmartypants" as opposed to "Dear Professor Shmartyshorts".
  • They do not allow for differences in how names are reduced to initials in certain languages and cultures. Some want "Theo" to become "T.", others want "Theo" to become "Th.".
  • They do not allow for full entry or correct sorting of those who like their middle names better than their first names, but still want their first name used as an initial (as far as I can tell, a specifically American affliction). Simply adding an alias field does not cut it either, because there are those who have an alias for their first name (I, for one, go by Stuart to prevent me from hearing my name butchered dozens of times a day).
  • They do not allow for translations of titles. For example, a US user might want to add and see a judge as the "Honorable T. Bobby Blackrobe IX", but if Bobby speaks Dutch exclusively, he would like to see "De Weledelgestrenge Heer T. Bobby Blackrobe IX".
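Two of the gaps above (initials that differ by convention, and titles that differ by context) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the convention names, the `CLUSTERS` and `TITLE_FORMS` tables, and the `initial_for` and `title_for` helpers are hypothetical, and real conventions would come from a properly researched, localized table.

```python
# Hypothetical conventions: which leading letter clusters stay intact
# when a given name is shortened to an initial.
CLUSTERS = {
    "single-letter": (),
    "cluster-preserving": ("Chr", "Th", "Ph"),
}

# Hypothetical table: one title, different renderings per context.
TITLE_FORMS = {
    "professor": {"envelope": "Prof.", "salutation": "Professor"},
}


def initial_for(name: str, convention: str) -> str:
    """Shorten a given name to an initial under the chosen convention."""
    if not name:
        return ""
    for cluster in CLUSTERS[convention]:
        if name.startswith(cluster):
            return cluster + "."
    return name[0] + "."


def title_for(title_key: str, context: str) -> str:
    """Pick the rendering of a title for an envelope or a salutation."""
    return TITLE_FORMS[title_key][context]


print(initial_for("Theo", "single-letter"))       # T.
print(initial_for("Theo", "cluster-preserving"))  # Th.
print("To:", title_for("professor", "envelope"))
print("Dear", title_for("professor", "salutation"))
```

Because titles are looked up by key instead of typed by hand, a "profesor" typo has nowhere to creep in. The price is that someone has to maintain the table.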

So let us attempt to summarize what one would need, ideally:

  • First, middle, last? Hah. You need first name, alias, middle names (separated, room for at least 8), last name. You need to be able to specify which names are abbreviated and which names are spelled in full. And oh, be careful which parts you deem to be mandatory: what if you want to mail Prince something? If that is dating me too much, umm... Shakira?
  • You need a localized table of titles, and the ability to add them in a certain order. Why the order? To accommodate people like "Prof. Dr. Dr. Joe Smartypants". No, I am not kidding (well, except for the name). Pathological cases like that do exist. This table, of course, also needs to specify whether the title comes before or after the name, and the order of preference between them (if someone is a Medical Doctor and a Judge, do you address this person as Your Honor, or Doctor, or Honorable Doctor, or Honorable Pancake, or...)?
  • Suffix can be free-form, unless you mistrust your data entry sufficiently to mandate a localized table of Jr., Sr., I, II, III et cetera.
  • You need to know both the country of residence and preferred language of address. Why? Because of people living in a country with official languages other than their primary one. You would want an address label to reflect the language of the country of residence (or, disputably, the language of the country where the letter is mailed from -- more on that in later posts), but have the letter inside reflect the language (and form of address) of the primary language and culture of the addressee. And when you have all of that figured out, two words: "window envelopes".
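As a rough illustration of the wish list above, here is a minimal Python sketch of what such a record could look like. It is one possible shape under stated assumptions, not the data structure hinted at in this post: the `Title`, `NamePart`, and `PersonName` types, the `formal_name` and `preferred_title` helpers, and the precedence numbers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Title:
    key: str           # language-neutral identifier, e.g. "judge"
    before_name: bool  # True: goes before the name; False: after it
    precedence: int    # lower number wins when only one title may be used
    labels: dict       # language code -> localized label


@dataclass
class NamePart:
    text: str
    abbreviate: bool = False  # render as an initial instead of in full


@dataclass
class PersonName:
    first: Optional[NamePart] = None
    alias: str = ""
    middles: list = field(default_factory=list)  # any number of NamePart
    last: Optional[NamePart] = None              # optional: think Shakira
    titles: list = field(default_factory=list)   # Title, in entry order
    suffix: str = ""                             # free-form
    country_of_residence: str = ""
    preferred_language: str = ""


def render(part: NamePart) -> str:
    return part.text[:1] + "." if part.abbreviate else part.text


def formal_name(person: PersonName, language: str) -> str:
    """Join titles, name parts, and suffix, localizing titles if possible."""
    def label(t):
        return t.labels.get(language, t.key)
    words = [label(t) for t in person.titles if t.before_name]
    parts = (person.first, *person.middles, person.last)
    words += [render(p) for p in parts if p is not None]
    if person.suffix:
        words.append(person.suffix)
    words += [label(t) for t in person.titles if not t.before_name]
    return " ".join(words)


def preferred_title(person: PersonName) -> Optional[Title]:
    """Pick the single title to use when only one is appropriate."""
    return min(person.titles, key=lambda t: t.precedence, default=None)
```

With a judge title labelled "Honorable" for `en` and "De Weledelgestrenge Heer" for `nl`, the same record renders either form of the Blackrobe example depending on the language passed to `formal_name`. Alias handling, locale-specific initials, and the label-versus-letter language split are left out to keep the sketch short.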

Okay, that is quite enough for one post. I've outlined some of the problems; I will try to outline some of the possible solutions later. Meanwhile, I double-dog dare any current or prospective PM to write a spec that addresses all the issues I mentioned above. For the record, I have the data structure in my head, and I tell you, it is a beast. Yes, that is a hint to the WinFS team, if such a thing still exists within Microsoft. Wink wink, nudge nudge.

I just remembered that I haven't even gotten started on addresses yet. Oh dear.

 

[1] Yes, I know these are just the defaults and that you can add additional fields to the schema. Which not a soul does.

[2] A free virtual cookie if you recognize him! Hurry, while supplies last! And no googling, darn you!

[3] A free virtual cookie if you recognize... blah blah blah, you get the point...

[4] I haven't touched any serious CRM software in ages (other than the one I worked on myself), so I would love for someone to tell me how current commercial CRM solutions handle this stuff so I can add that information here.

[5] There's much more to it than that, of course. Localization of forms of address is quite a bit more advanced than simple translation of "Professor". In most cultures, addressing anyone other than family or loved ones with a translation of "Dear" is Not A Good Idea(TM).

Posted on Thursday, October 12, 2006 8:02 PM
