July 29, 2003

OT: worthless word for the day

one site i get a kick out of is wwftd. it aims "at obscure, abstruse and/or recondite word, including such falling into the following categories, if deemed to be appropriately ludicrous: medical terms, foreign monetary units, foreign units of measure, legal terms, or professional jargon of any type." if you're looking for something to spice up your next user's manual or help file, this might be the place for you. imagine your users' delight at seeing a menu bar with expiscate (search) or a popup ad offering them a free lant (to mingle ale with stale urine to make it strong).......or maybe not ;-)

July 26, 2003

OT: mad about map projections

the map geek part of my psyche keeps bubbling up. here's an interesting site concerned with map projections. it purports to try to present a "complete a collection as possible, historical published map projections"--don't know about complete but it sure has a bunch of them. the site is certainly mad about PDFs, 99% of the links are PDF docs--the whole thing's optimized for print. i plan on papering the walls of my home office with these...

July 22, 2003

sign this

while i wouldn't normally like to encourage flash coders ;-) some flash folks in the middle east have put together a petition asking Macromedia to support rtl (right to left) languages in flash. it does look like they have some legitimate gripes.

July 17, 2003

more humbug: the reception to I18N CF5 fillets the page

after we all had a good laugh trying to use machine translations, Uwe Degenhardt and i have rounded up some human beings to translate content (a whole paragraph) on cftools.sdsolutions.de. this is another case showing how idiotic machine translators can be. let's take spanish for instance. the original english was:

Welcome to the I18N CF5 Tools page. You can pick up a few free I18N tools for CF5 here. If you're using CFMX, we recommend the MX-specific I18N tools in the Macromedia ColdFusion Exchange. You'll find the available tools listed below.

one machine translator, which i won't name but it's babelfish ;-) translated that to:

La recepción al I18N CF5 filetea la página. Usted puede tomar algunas herramientas libres de I18N para CF5 aquí. Si usted está utilizando CFMX, recomendamos las herramientas MX-especi'ficas de I18N en el intercambio de Macromedia ColdFusion. Usted encontrará las herramientas disponibles enumeradas abajo.

while a real live homo sapiens translated that english as:

Bienvenidos al sitio I18N CF5 Tool. Aquí puede hacerse de algunas herramientas gratis para CF 5. Si se sirve de CFMX, recomendamos las herramientas específicas MX I18N de Macromedia descargar la zona ColdFusion-Exchange. Las herramientas disponibles se encuentran abajo.

i don't speak any spanish but even i can see these two translations are quite different. who would you bet on? in any case, a good test of these things is to round-trip the translation, i.e. do you get back what you put in? in this case, not quite:

The reception to I18N CF5 fillets the page. You can take some free tools from I18N for CF5 here. If you are using CFMX, we recommended the MX-specific tools of I18N in the Macromedia interchange ColdFusion. You will find the tools down available enumerated.

bah humbug.
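for anyone who wants a number instead of a gut feeling, here's a rough python sketch of the round-trip test. it only compares the original english with the round-tripped english quoted in this post; it doesn't call any translator. the function names and the word-level comparison are my own assumptions, and a similarity ratio is a crude stand-in for a human reading both texts.

```python
from difflib import SequenceMatcher

# both strings are copied from the post: the original english and
# what came back after english -> spanish -> english
original = (
    "Welcome to the I18N CF5 Tools page. You can pick up a few free I18N "
    "tools for CF5 here. If you're using CFMX, we recommend the MX-specific "
    "I18N tools in the Macromedia ColdFusion Exchange. You'll find the "
    "available tools listed below."
)
round_trip = (
    "The reception to I18N CF5 fillets the page. You can take some free "
    "tools from I18N for CF5 here. If you are using CFMX, we recommended "
    "the MX-specific tools of I18N in the Macromedia interchange ColdFusion. "
    "You will find the tools down available enumerated."
)


def normalize(text):
    """lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def round_trip_score(before, after):
    """word-level similarity from 0.0 to 1.0; 1.0 means identical word sequences."""
    before_words = normalize(before).split()
    after_words = normalize(after).split()
    return SequenceMatcher(None, before_words, after_words).ratio()


score = round_trip_score(original, round_trip)
print(f"round-trip similarity: {score:.2f}")
```

a score of 1.0 would mean you got back exactly what you put in; anything lower means the text drifted somewhere along the way.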

w3c i18n activity site

back to our regularly scheduled programming. the w3c (world wide web consortium) produces a pretty useful i18n resource site. there is a weekly (in theory anyway) question that the GEO task force answers. i urge cf i18n folks to join the w3c international mailing list to keep abreast of things (it's pretty low volume).

geoLocator's grass skirt slips a bit lower

eric mauviere has tweaked the geoLocatorMap a bit more. it now handles mouseovers to show country name (in english) and "hits" (in this case the number of hits into the geoLocator testbed). you can see it here: geoLocatorMap. a few more tweaks are probably in order so if you have any suggestions please send them along to me.

July 16, 2003

the hoi polloi geoLocator in a grass skirt

eric mauviere has added a nifty world map flash front end for geoLocator. it's dead simple to use: just pass in flashvars with the ISO 2 char codes, neatly provided by geoLocator, for the countries you want highlighted and bob's your uncle. flash remoting was out since we're supporting CF5. you can see it in action at cftools.sdsolutions.de.
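as a rough illustration of the "pass in flashvars" idea, here's a small python sketch that turns a list of ISO 2 char country codes into a flashvars string. the parameter name "countries" and the comma-separated layout are assumptions for the example only; the real map may expect something different.

```python
from urllib.parse import urlencode


def build_flashvars(country_codes, param_name="countries"):
    """join ISO 2 char country codes into a single flashvars string.

    param_name is hypothetical; use whatever variable the movie actually reads.
    """
    cleaned = []
    for code in country_codes:
        code = code.strip().upper()
        # reject anything that isn't two plain ascii letters
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"not a 2 char ISO code: {code!r}")
        if code not in cleaned:  # drop duplicates, keep first-seen order
            cleaned.append(code)
    # urlencode percent-escapes the commas so the value survives embedding in html
    return urlencode({param_name: ",".join(cleaned)})


print(build_flashvars(["th", "DE", "fr", "de"]))  # countries=TH%2CDE%2CFR
```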

July 13, 2003

warren zevon's got sand.

Life'll Kill Ya. you have to admire the sand that warren zevon has. dying from lung cancer, dr. hunter s. thompson as his personal physician (not sure a perfectly healthy person would survive his gonzo bedside manner ;-), he manages to pull out yet another album. besides being one of my favorite musicians, he's certainly admirable for his courage and grace.

July 11, 2003

the hoi polloi geoLocator

as promised, Uwe Degenhardt has generously offered a site to distribute the CF5 version of the geoLocator (and maybe a few other cf5 i18n tools). head over to cftools.sdsolutions.de and pick up your copy. since this is geared for a cf5 environment, you'd better aggressively cflock these functions.

encoding jig, java style

i suppose i shouldn't complain so much about cf i18n folks not quite getting on board the unicode 747. even in the java world, a world created from the ground up in unicode, many people are still having problems. a quick cruise thru the java i18n forums shows even more encoding misery than i normally see in the cf forums. that got me to thinking about tattooing "JUU" (just-use-unicode) on my forehead, but my wife suggested that wouldn't go over big w/her and the kids ;-) so as not to be a totally valueless blog posting: if you ever need to declare the encoding of an external CSS style sheet, according to the CSS2 Specification, you can use the "@charset at-rule". in your unicode CSS you would have something like @charset "UTF-8"; as the very first line in the file. note that this won't work for embedded styles as these share the encoding of the webpage. browsers are supposed to determine an external style sheet's encoding in the following order:
  1. HTTP "charset" parameter in a "Content-Type" field (which mx ignores btw)
  2. the @charset at-rule
  3. mechanisms of the language of the referencing document (e.g., in HTML, the "charset" attribute of the LINK element)
so now you know.
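here's a minimal python sketch of that priority order, just to make it concrete. the function name, its arguments and the utf-8 fallback are my own assumptions (the list above doesn't say what happens when all three are missing), and real browsers do a good deal more sniffing than this.

```python
import re

# a leading @charset "name"; rule, which has to be the very first thing in the sheet
CHARSET_RULE = re.compile(rb'^@charset "([^"]+)";')


def sheet_encoding(css_bytes, http_charset=None, link_charset=None, fallback="utf-8"):
    """pick a style sheet's encoding using the three-step order listed above.

    the fallback value is an assumption for this sketch only.
    """
    # 1. HTTP "charset" parameter in the Content-Type field wins
    if http_charset:
        return http_charset
    # 2. then the @charset at-rule at the start of the sheet
    match = CHARSET_RULE.match(css_bytes)
    if match:
        return match.group(1).decode("ascii", "replace")
    # 3. then the referencing document, e.g. the charset attribute of LINK
    if link_charset:
        return link_charset
    return fallback


css = b'@charset "UTF-8";\nbody { color: black; }'
print(sheet_encoding(css))                             # UTF-8
print(sheet_encoding(css, http_charset="ISO-8859-1"))  # ISO-8859-1
```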

July 09, 2003

dancing the encoding jig

after once again doing the "encoding jig" in the forums, i'm at a bit of a loss as to why some i18n folks will not use unicode and still insist on using codepage encodings for their text data, especially for languages/locales that have more than one encoding "standard". you're asking for trouble as it's often the case that these multiple encodings do not exactly match. you will even see this with the english language (well, with symbols like the euro, smart quotes, etc. anyway). while legacy data might need converting and might be a semi-valid reason for not moving to unicode (and unless you're talking GB of data, any DBA worth his paycheck shouldn't flinch at this task, so even this isn't really a valid reason for not converting), legacy cf code to my mind isn't. you still need to put some effort into getting legacy codepage based code to work with MX--an effort that really doesn't buy you anything long term, it just moves that code's dead end a bit further down the road. perhaps more importantly, it also misses the opportunity to modernize your code with the new MX functionality (CFC, webservices, etc.), things that will actually help your i18n code overall. besides all the foregoing guff, these are some valid reasons for using unicode:
  • it's a standard (as they say here in thailand, "unicode same same ISO 10646")
  • it's internet ready (xml, perl, java, javascript, etc. all support unicode)
  • it's multilingual
  • it travels well (text in any language can be easily exchanged globally)
  • unicode offers monolithic text processing (and that of course saves you money in development and support costs, time to market, etc.)
  • wide industry support (macromedia, ibm, microsoft, hp, sun, oracle, etc.) makes it vendor neutral
  • continuously evolving (it's now at version 4.0)
  • possible to convert from legacy code pages
  • it's more or less apolitical
  • and let's not forget it's the default encoding for MX
you can find more details and examples of unicode's benefits from tex texin's site.
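to see the "multiple encodings don't exactly match" problem with the euro and smart quotes, here's a small python sketch. the three codepages are just ones i picked for illustration (windows-1252, iso-8859-1 and iso-8859-15); the same thing happens with plenty of others.

```python
EURO = "\u20ac"         # the euro sign
SMART_QUOTE = "\u201c"  # a left "smart" double quote

# the same character ends up as different bytes depending on the codepage
print(EURO.encode("cp1252"))       # b'\x80'
print(EURO.encode("iso-8859-15"))  # b'\xa4'
print(EURO.encode("utf-8"))        # b'\xe2\x82\xac'


def can_encode(text, encoding):
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# iso-8859-1 has no euro and no smart quotes at all
print(can_encode(EURO, "latin-1"))             # False
print(can_encode(SMART_QUOTE, "iso-8859-15"))  # False
print(can_encode(SMART_QUOTE, "cp1252"))       # True

# and the same byte reads as different characters under different codepages
ambiguous = b"\xa4"
as_latin1 = ambiguous.decode("latin-1")      # currency sign U+00A4
as_latin9 = ambiguous.decode("iso-8859-15")  # euro sign U+20AC
print(ascii(as_latin1), ascii(as_latin9))

# utf-8 round-trips both characters without any guessing
text = EURO + SMART_QUOTE
print(text.encode("utf-8").decode("utf-8") == text)  # True
```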

July 08, 2003

geoLocator jar updated

nigel wetters has updated the InetAddressLocator jar (the real brains behind the geoLocator CFC) with new IP/locale data. the CFC code wasn't changed so all you need to do is replace the jar file distributed with the CFC with the one here.

July 07, 2003

blog gone south

no idea why but my latest blog archives have "gone south" (sort of old american slang meaning "to disappear; fail by or as if by vanishing; abscond with money or loot, to cheat at cards, or simply to sharply diminish"). then the RSS went and had a conniption fit (even older american slang meaning "fainting fit, anxiety attack or tantrum") so that all the blog body text took a hike. this is all sort of voodoo-like in that i hadn't changed anything. just to compound the problem i decided to change the layout, etc. to make it more readable. what was broken is still broken, but at least it looks different. i guess this means i will have to move my blog to something i understand a bit more (i.e. a cf-based blog). the question is which one? ray camden's blog (though i'd have to re-write that to remove the createobject dependency)? or the italian xml blog (which i'd have to re-write to use java reflection to remove the dependency on cffile)? anyone care to share their thoughts on this?

July 06, 2003

boatload of dictionaries

if you ever had the need to look up the word "ice cream" in tsalagi (cherokee), the i love languages site's dictionaries resource is the first place you ought to look. it's got dictionaries for sign language, lingua franca (extinct pidgin-style european language), english-finnish fortran 90, indonesian, maori, horse anatomy, romani, russian/swahili, tree dictionary--a whole boatload of them. many are very specific technical dictionaries. by the way "ice cream" in cherokee is "a-(ga)-da-tlv-da-u-ne-s-da-la".

July 03, 2003

general literary gorgeousness

just in case you slept through it, the whole world's been marching towards globalization for some time now. here's mark twain's take on one tiny bit of it, an american learning german ;-) modern german speakers might get a kick out of the 19th century german.

July 02, 2003

j2se 1.4.2 heads up

heads up: sun's released version 1.4.2 and changed the installation procedure under windows in a way that might affect G11N installs. the default installer will only install european language support if the windows host system only has support for those languages installed--past versions did a "tower of babel" install. this matters for most non-officially supported cf locales (thai, arabic, etc.) and for most shared hosts (i think this is where the impact might be the greatest). you can do a custom install and add other language support.

geoLocator for the hoi polloi

Uwe Degenhardt and i have ported the MX geoLocator CFC to CF5 as a sort of CFLIB style include. you can test it here. the code will be made available someplace, soon. btw hoi polloi is greek for "the common people" according to this reference and not "high society" as it's often misused.

July 01, 2003

i live in the condo of babel

a very good background resource for languages is the ethnologue: languages of the world. if you need to know what languages (and i mean all) are spoken in a given country or where a given language is spoken (and i mean everywhere) this should be your first stop. while you might be somewhat surprised to know that there are 75 languages spoken in thailand, care to take a guess at how many are spoken in france, germany or the usa? and, of course, it has language information geographically.