June 23, 2005

utf-7

as you might already know, utf-7 is not a supported java (and hence cf) charset. it does, however, exist in the wild, mainly as part of bounced email systems and sometimes in webmail like hotmail (well, mainly hotmail; i've never seen it anywhere else, to tell you the truth) as well as MS Exchange. folks have been complaining off and on about this for years, many mistakenly blaming macromedia for a sun java bug. votes have piled up in sun's java bugparade but, alas and alack, nothing's been done about it. until now.

there's a very persistent thread (it's been running since feb-2004) in the cf support forums concerning this issue. a few days ago somebody (gdbezona) posted a link to an open-source utf-7 charset, JCharset. if you drop that jar (jcharset.jar) into the cfinstall/runtime/jre/lib dir and stop/restart the cf server service, cf will pick up that utf-7 charset fine. we've exercised this jar pretty thoroughly over the last two days and it has yet to blow up in our faces. it works with cfpop/cfmail/cffile and shows up in the server's available charsets via our charset CFC.
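if you'd rather check from plain java whether the JVM actually sees the new charset, something like the sketch below should do it. it only uses the standard java.nio.charset.Charset API; the class name CharsetCheck and the helper isAvailable are made up for this example and aren't part of JCharset or cf. whether "UTF-7" shows up depends entirely on the jar being visible to the JVM you run this against.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// hypothetical helper class, for illustration only
public class CharsetCheck {

    // true if the running JVM can resolve the given charset name;
    // a syntactically illegal name is treated as "not available"
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // result depends on whether a utf-7 provider is installed
        System.out.println("UTF-7 available: " + isAvailable("UTF-7"));

        // dump the canonical names of every charset this JVM knows about
        for (String name : Charset.availableCharsets().keySet()) {
            System.out.println(name);
        }
    }
}
```

run it with the same JRE that cf uses (the one under the cf install's runtime dir), otherwise you're testing a different JVM's charset list.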

if you're experiencing this issue, you might want to give this thing a whirl.

June 04, 2005

eat your heart out core java

the unicode consortium has announced the release of version 1.3 of the Common Locale Data Repository (CLDR). it pumps up the locale data from 230+ to 296 locales (96 languages and 130 territories). this release's highlights include:
  • a complete set of POSIX-format data generated, along with a tool to generate different platform versions.
  • the addition of new data to support localization of timezones
  • the addition of data for UN M.49 regions, including continents and regions
  • the canonicalization (data in many forms converted to a "standard" form) of the data files, including the consolidation of inherited data
  • the restriction of currency codes to ISO 4217 codes (historical ones as well)
  • number and date tests to verify LDML implementations
  • metadata for LDML
  • mappings from language to script and territory
  • various other fixes and additions of data, and extensions to the specification

for more details see the press blurb and the version information page.

as a reminder, icu4j makes use of the CLDR for its locale data. hubba hubba.