September 30, 2003

more ISO follies

while i won't go as far as some folks by calling ISO a training camp for sitcom plot writers, they have again seemingly shot somebody's big toe off as a form of comic relief. the folks looking after ISO 3166 (country codes) have decided to re-assign "cs" (formerly czechoslovakia) to serbia and montenegro. ouch. all those folks with .CS in their email or web addresses, printed on business cards, stationery, brochures, etc. can all just take a hike. the ISO 3166 rules call for a wait period of five years (as if everybody using .CS is even aware of that rule, not to even start thinking about cleaning up all the existing data using .CS), so the jig is up. the IAB's not too pleased with this. they want a 200 year wait period before a code is re-used. the irish national body, being more reasonable folks, only wants a 100 year wait period. either's fine with me. and no, just in case you're wondering, the "&" symbol isn't legal, ie "s & m" can't be used for serbia and montenegro.

ISO reasonableness: how ISO learned to throw a spitter

it never happened says ISO. good news no matter how you slice and dice it.

September 27, 2003

locale holiday info?

as you can probably tell from my blog recently, i've been beavering away with ibm's icu4j lib's calendar classes. one very interesting related class is Holidays. it contains data (names, dates, date rules, locales, etc.) and methods (isOn, isBetween, firstAfter, etc.) for, well, holidays. it's actually very useful for easily building calendars (holidays are something i noticed that most cf calendar tags, etc. do a swell job failing at). i've got a testbed set up. there are a few bugs yet (double applying timezone offsets & the en_US holidays seem flaky) which will get fixed, but what's really missing is holiday info for more locales. so i'd like to ask if folks would be kind enough to send me holiday info (names, dates, date rules, etc.) for any of the locales not used in the testbed. it would be great if i could collect holiday info from recognized national sources (gathering that now but any pointers would be most welcome--i'm quite sure i will miss some). you can add the info here as comments or email me paul<at> this will eventually be built into a CFC & of course distributed for free w/acknowledgements. thanks.
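for anyone wondering what a "date rule" looks like in practice, here's a rough sketch in plain java. it is not the icu4j Holidays API: the class and method names (HolidayRules, fixed, nthWeekday, isOn) are made up for illustration, and it only covers the two simplest kinds of rule, a fixed date and "the nth weekday of a month".

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.Month;
import java.time.temporal.TemporalAdjusters;

// hypothetical helper, not part of icu4j
final class HolidayRules {

    // fixed-date rule, e.g. new year's day is always january 1
    static LocalDate fixed(int year, Month month, int day) {
        return LocalDate.of(year, month, day);
    }

    // "nth weekday of the month" rule, e.g. the 4th thursday of november
    static LocalDate nthWeekday(int year, Month month, int n, DayOfWeek dayOfWeek) {
        return LocalDate.of(year, month, 1)
                .with(TemporalAdjusters.dayOfWeekInMonth(n, dayOfWeek));
    }

    // true if the given date is the holiday's date
    static boolean isOn(LocalDate holiday, LocalDate date) {
        return holiday.equals(date);
    }

    public static void main(String[] args) {
        LocalDate newYear = fixed(2004, Month.JANUARY, 1);
        LocalDate thanksgiving = nthWeekday(2003, Month.NOVEMBER, 4, DayOfWeek.THURSDAY);
        System.out.println(newYear);
        System.out.println(thanksgiving);
        System.out.println(isOn(thanksgiving, LocalDate.of(2003, 11, 27)));
    }
}
```

holidays tied to lunar or lunisolar calendars need more than this, which is exactly why per-locale info from national sources is so useful.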

September 25, 2003

unicode superstar makes the NY Times

the NY Times has a nice article about unicode and one of its leading lights, dublin-based michael everson. in that article he raises an interesting point concerning unicode: fundamental human rights. if you can't use computers because OS's don't support your alphabet, i'd agree your rights are being violated. the article further points out something which i wasn't really aware of: commercial support for the continued work of the unicode consortium is starting to decline as the major computer markets' languages are encoded. i guess klingon's unicode hopes are really slim and next to none.

September 24, 2003

we used to call that circular logic

here's a new twist on the VeriSign "issue". hell's bells, golf cap too tight?

hey don't look at me

several days ago, blogspot ate my older archives. today, my oldest archives came back to life and now all but the oldest archives seem to have gone south. oh well, what can you expect for free. i guess it's the last impetus i need to force my lazy rear-end into gear & move this puppy to a cf-based blog.

September 23, 2003

calendars: straight up easy buddhist calendar

after filling my hoary head with the hebrew calendaring system (and yes, my head still hurts a bit), i thought i'd take on the more straightforward buddhist calendar system. like the hebrew calendar, it's based on ibm's icu4j lib. the buddhist calendar numbers years since the birth of the Buddha. in predominantly buddhist countries like thailand (where i live these days) it's the civil calendar (ie the official one in general use by most folks and of course the government). it's often used for religious purposes elsewhere. the buddhist calendar is identical to the gregorian calendar in all respects except for the year and era (BC, AD, etc.). years are numbered since the birth of the Buddha in 543 BC (gregorian calendar), so that 1 AD (gregorian calendar) is equivalent to 544 BE (buddhist era) and 2003 AD is 2546 BE. easy enough, right? like the hebrew calendar, i've built a CFC and posted it to the devnet gallery, where it will bubble up eventually.
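the year arithmetic really is that simple. here's a minimal java sketch of it (the class and method names are mine, purely for illustration, and this is not how the CFC or icu4j does it internally). it only handles AD years; BC years and the era switch are left out.

```java
// hypothetical helper illustrating the 543-year offset described above
final class BuddhistEra {

    // 1 AD == 544 BE, so the offset between the two year counts is 543
    static final int OFFSET = 543;

    // gregorian AD year -> buddhist era year
    static int toBuddhistEra(int gregorianAdYear) {
        return gregorianAdYear + OFFSET;
    }

    // buddhist era year -> gregorian AD year (only meaningful for BE years above 543)
    static int toGregorianAd(int buddhistEraYear) {
        return buddhistEraYear - OFFSET;
    }

    public static void main(String[] args) {
        System.out.println(toBuddhistEra(1));
        System.out.println(toBuddhistEra(2003));
        System.out.println(toGregorianAd(2546));
    }
}
```

months and days pass through untouched, which is why this calendar is so much gentler on the head than the hebrew one.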

September 22, 2003

...and chicks for free

xml cover page weighs in on the ISO royalty argument with a detailed article. the more i think and read about this, the more bizarre it sounds. the i18n community's been urged to adopt the ISO codes as standards--swell, we all like standards. but let's take a quick peek at how these ISO standards came about. dozens of individuals in the i18n community contributed ideas and information to the development of these standards. many of these same folks fought the good fight for their adoption (and i rather doubt any of those folks will see a thin dime of royalties if it indeed comes to pass). and to top it off, some of the ISO material duplicates pre-existing information. now that the scale's been tipped in ISO's favor, it seems it's time to pay up (or from the goodfella school of business, "screw you, pay me"). this may sound like a tempest in a teapot to non-i18n cf folks, but as the xml cover page article above points out, what if the US post office suddenly announced "hey all you meatballs using 2-letter abbreviations for state names, thanks for adopting our standards, you now owe us money for using them, screw you, pay up". mark davis, unicode president and IBM's i18n frontman (and unicode/i18n guru), thinks this proposal will 'die a well-deserved death.' but he goes on to point out that this was a serious ISO proposal and needs a serious response from the i18n community. so gather your pitchforks and torches folks, the monster's escaped the lab again.

September 21, 2003

making money from thin air

looks like ISO is leaping on "the money for nothing" gravy train. the ISO commercial policies steering group (CPSG) is proposing a royalty on the commercial use of ISO language, country and currency codes. the reason? the necessity for a number of ISO standards to be published as databases.......gosh darned databases again. tim berners-lee has fired off a letter to ISO's president dr. smoot. while this shakes itself out, better not go around mumbling "en" or "US" in a commercial sort of way. swell.

September 19, 2003

calendars: way more than you want to know

most cf developers can conceivably go thru their entire professional lives without ever thinking too much about calendars (dates are another matter of course). most folks are familiar, in an unconscious sort of way, with the gregorian calendar. it's the standard for cf and java, except in a few locales like here in Thailand. the world, however, is a big place and there exist quite a few other calendar systems. a couple of days ago someone on cf-talk was asking about hebrew dates. my first thought about that was "is the beer cold?" next thought was about ibm's ICU4J lib which contains a good number of the more popular, non-gregorian calendar systems in use today. since i stuffed my greying head full of its calendar details, some of it was bound to leak out: the hebrew calendar is "lunisolar" (where i actually thought it was lunar, silly me). that gives it what some folks would call "a number of interesting properties" distinct from the gregorian calendar. months start on the day of each new moon (the ICU4J lib actually makes an approximation of this). the solar year (which, as everyone knows, is about 365.24 days) is not an even multiple of the lunar month (approximately 29.53 days), so an extra "leap month" needs to be inserted in 7 out of every 19 years (yes, i'd call that interesting). and just to make sure everybody's paying attention, the start of a year can be delayed by up to three days in order to prevent certain holidays from falling on the Sabbath (as well as to prevent illegal year lengths). as the final cherry on the ice cream, the lengths of certain months can vary depending on the number of days in the year. hurts my head too, so i figure let the brainiacs at IBM worry about this and just use the java lib's calendar classes. the problem was i normally used these ICU4J calendars in that locale (ie hebrew language w/hebrew calendar, arabic w/islamic calendar, etc.). but what was required for this problem was english transliterations of hebrew dates.
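a quick aside on that "7 out of every 19 years" business, since the arithmetic is kind of neat: 19 solar years is roughly 19 x 365.24 = 6939.56 days, and 235 lunar months (19 x 12 regular months plus 7 leap months) is roughly 235 x 29.53 = 6939.55 days, so the two line up almost exactly. the sketch below uses the commonly quoted arithmetic rule for picking the leap years. i'm assuming it matches what ICU4J does, but i haven't checked its source, and the class name is made up.

```java
// hypothetical helper showing the 19-year leap cycle of the hebrew calendar
final class HebrewLeapYear {

    // commonly quoted rule: a hebrew year is a leap year when (7 * year + 1) mod 19 < 7
    static boolean isLeapYear(int hebrewYear) {
        return Math.floorMod(7 * hebrewYear + 1, 19) < 7;
    }

    // leap years carry 13 months, regular years 12
    static int monthsInYear(int hebrewYear) {
        return isLeapYear(hebrewYear) ? 13 : 12;
    }

    // total months across any run of consecutive years, starting at firstYear
    static int monthsInSpan(int firstYear, int years) {
        int total = 0;
        for (int y = firstYear; y < firstYear + years; y++) {
            total += monthsInYear(y);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(isLeapYear(5763));       // hebrew year that overlaps 2002-2003
        System.out.println(monthsInSpan(5758, 19)); // months in one full 19-year cycle
    }
}
```

the year-start delays and variable month lengths are a whole other kettle of fish and are not covered here.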
i started thinking about jumping thru all sorts of transliteration hoops when i noticed that the lib's calendar and dateFormat classes all took locales as options. that made things much easier, just had to pass in an english language locale and bob's your uncle (snippet from a CFC i built):

    // fire up the ICU4J & java objects we need
    aCalendar = createObject("java","");
    aLocale = createObject("java","java.util.Locale");
    aDateFormat = createObject("java","");
    // american locale
    usLocale = aLocale.init("en","US");
    // point the hebrew calendar at that locale
    hCalendar = aCalendar.init(usLocale);
    // CFC private method to figure epoch date
    thisJavaDate = getJavaTime(thisDate,tzOffset);
    // this will return date string based on hebrew calendar formatted for that locale
    hDate = aDateFormat.getDateInstance(hCalendar,tDateFormat,usLocale).format(thisJavaDate);
    // and cool & out
    return hDate;

you can see a testbed for the whole CFC here. i'll post the CFC to the gallery. since i use this sort of thing quite a bit, i guess i'll tidy up and make CFCs for the other calendars that ICU4J supports.
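the point worth underlining is that the calendar system and the display locale are two separate knobs. here's a sketch of the same idea in plain java with no ICU4J. the stock JDK has no hebrew calendar, so this swaps in java.time's ThaiBuddhistChronology instead (needs java 8 or later), and the class name is just for illustration. the exact text you get back depends on the JDK's locale data.

```java
import java.time.LocalDate;
import java.time.chrono.ThaiBuddhistChronology;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// hypothetical demo: one calendar system, several display locales
final class CalendarLocaleDemo {

    // format a date in the thai buddhist calendar, using the given locale for the text
    static String format(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofPattern("d MMMM yyyy", locale)
                .withChronology(ThaiBuddhistChronology.INSTANCE)
                .format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2003, 9, 23);
        // same calendar, english text
        System.out.println(format(date, Locale.US));
        // same calendar, thai text
        System.out.println(format(date, Locale.forLanguageTag("th-TH")));
    }
}
```

both lines should carry the buddhist year 2546; only the language of the month name changes with the locale.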

September 17, 2003

tengwar marches on

omniglot's guide to writing systems has several good pages dealing with tengwar. how the script originated, how it works, how to write it, etc. it also has interesting links for scottish gaelic, welsh, and even polish. better beware JD ;-) actually omniglot's got quite a few language references that are excellent i18n resources.

geoLocator updated (again)

nigel's updated the IP database so we have released a "new" version of the CFC. you can pick it up here. for some reason the devnet gallery's not updating this any longer, i guess the almost monthly IP database updates are freaking something out. let me know if you run into any problems.

September 11, 2003


tony weeg on cf-talk was working on a UDF to determine if a given date was in DST (daylight saving time). he arrived at a good solution for his problem geared towards the east coast of the US. being an i18n sort of guy i suggested something a bit more global based on java.util.TimeZone and wrapped up in a CFC. it's actually sort of useful in that you can "cast" to a timezone other than the one used by your cf server (a la the i18n functions CFC). you can see it in action here. two things to note:
- this CFC is geared towards redsky (actually jre1.4.1 or above). my host is still running MX 6.0 (jre1.3) so the timezone info, etc. is a bit off (i had to tweak the CFC to get it to work w/mx 6.0).
- it uses the gregorian calendar for its date calculations. i'll have another timezone CFC based on IBM's icu4j lib calendars ready soon.
the CFC should be on the devnet gallery in a "bit". in the meantime you can download it here.
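for the curious, the java side of the idea boils down to something like the sketch below. this is not the CFC's actual code: the class and method names are made up, and the answers depend on the timezone data that ships with your JRE. one gotcha to keep in mind is that TimeZone.getTimeZone quietly hands back GMT for an ID it doesn't recognize.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// hypothetical helper: is a given calendar date in DST for a given timezone?
final class DstCheck {

    // month is 1-based here (1 = january); the check is made at noon local time
    static boolean isDst(String zoneId, int year, int month, int day) {
        TimeZone tz = TimeZone.getTimeZone(zoneId);
        Calendar cal = new GregorianCalendar(tz);
        cal.clear();
        cal.set(year, month - 1, day, 12, 0, 0); // Calendar months are 0-based
        return tz.inDaylightTime(cal.getTime());
    }

    public static void main(String[] args) {
        System.out.println(isDst("America/New_York", 2003, 7, 1));
        System.out.println(isDst("America/New_York", 2003, 1, 15));
        System.out.println(isDst("Asia/Bangkok", 2003, 7, 1));
    }
}
```

because the timezone is passed in rather than taken from the server default, the same call works for any zone the JRE knows about, which is the "cast" trick mentioned above.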

September 08, 2003

warren zevon has passed away

sad to say warren zevon passed away sunday. he was 56. he was something else, right to the end.