July 09, 2003

dancing the encoding jig

after once again doing the "encoding jig" in the forums, i'm at a bit of a loss as to why some i18n folks will not use unicode and still insist on using codepage encodings for their text data, especially for languages/locales that have more than one encoding "standard". you're asking for trouble, as it's often the case that these multiple encodings do not exactly match. you will even see this with the english language (well, with symbols like the euro, smart quotes, etc. anyway).

legacy data might need converting, and that might be a semi-valid reason for not moving to unicode (though unless you're talking GB of data, any DBA worth their paycheck shouldn't flinch at this task, so even this isn't really a valid reason for not converting). legacy cf code, to my mind, isn't a reason at all. you still need to put some effort into getting legacy codepage based code to work with MX, and that effort really doesn't buy you anything long term, it just moves that code's dead end a bit further down the road. perhaps more importantly, it also misses the opportunity to modernize your code with the new MX functionality (CFC, webservices, etc.), things that will actually help your i18n code overall.

besides all the foregoing guff, these are some valid reasons for using unicode.
  • it's a standard (as they say here in thailand, "unicode same same ISO 10646")
  • it's internet ready (xml, perl, java, javascript, etc. all support unicode)
  • it's multilingual
  • it travels well (text in any language can be easily exchanged globally)
  • unicode offers monolithic text processing (and that of course saves you money in development and support costs, time to market, etc.)
  • wide industry support (macromedia, ibm, microsoft, hp, sun, oracle, etc.) makes it vendor neutral
  • continuously evolving (it's now at version 4.0)
  • possible to convert from legacy code pages
  • it's more or less apolitical
  • and let's not forget it's the default encoding for MX
you can find more details and examples of unicode's benefits from tex texin's site.
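
to make the "encodings don't exactly match" point concrete, here's a minimal sketch in python (not cf, just because its codecs are easy to poke at). it assumes two common western european code pages, windows-1252 and iso-8859-15, as the stand-ins; the variable names and the sample string are made up for illustration. it shows the euro living at a different byte in each, smart quotes not fitting in iso-8859-1 at all, and the one-time conversion of legacy bytes to utf-8.

```python
# the euro sign sits at a different byte in each of two western european code pages
euro_cp1252 = "€".encode("cp1252")        # windows-1252
euro_latin9 = "€".encode("iso-8859-15")   # latin-9

# decode each byte with the *other* code page: no error is raised, the text is just wrong
misread_as_latin9 = euro_cp1252.decode("iso-8859-15")
misread_as_cp1252 = euro_latin9.decode("cp1252")

# smart quotes exist in windows-1252 but have no slot at all in iso-8859-1
try:
    "“quoted”".encode("iso-8859-1")
    quotes_fit_latin1 = True
except UnicodeEncodeError:
    quotes_fit_latin1 = False

# converting legacy bytes: decode with the code page they were really written in,
# then store everything as one unicode encoding (utf-8 here)
legacy_bytes = "€5 “quoted”".encode("cp1252")   # hypothetical stand-in for legacy data
utf8_bytes = legacy_bytes.decode("cp1252").encode("utf-8")

print(euro_cp1252, euro_latin9)
print(repr(misread_as_latin9), repr(misread_as_cp1252))
print(quotes_fit_latin1)
print(utf8_bytes)
```

the nasty part is that the misreads are silent: nothing blows up, you just get a control character or a generic currency sign where the euro should be. the conversion itself is the easy bit, as long as you know which code page the legacy data was really written in.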

