Malayalam Unicode and MS Word

My last post was about the difficulties faced by people and companies involved in localization and translation. In this post, I present some useful information I have gained through my involvement in Malayalam localization and translation over the past 13 years.

Rule number 1: Never trust DOCX when you are processing Malayalam Unicode. If you use the obsolete, joiner-dependent standard for Malayalam Unicode, DOCX will spoil the show. Until you save and close the file, everything will seem to be OK. Once you reopen the saved file, you will see clubbed words and stripped CHILLAKSHARA.

You cannot restore the content to its original form. The only solution is to retype it.
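Before saving, it can help to know which chillu encoding a piece of text actually uses. The following is a minimal Python sketch, not part of any Word tooling: it assumes you already have the text as a string, and the function name `chillu_encoding` is made up for illustration. It counts joiner-dependent chillu sequences (consonant + virama U+0D4D + ZWJ U+200D) and atomic chillu characters (U+0D7A to U+0D7F).

```python
import re

VIRAMA = "\u0D4D"  # Malayalam sign virama (chandrakkala)
ZWJ = "\u200D"     # zero width joiner

# Base consonants that form chillus in the joiner-dependent encoding:
# NNA, NA, RA, LA, LLA, KA
LEGACY_BASES = "\u0D23\u0D28\u0D30\u0D32\u0D33\u0D15"

LEGACY_CHILLU = re.compile("[" + LEGACY_BASES + "]" + VIRAMA + ZWJ)
ATOMIC_CHILLU = re.compile("[\u0D7A-\u0D7F]")


def chillu_encoding(text):
    """Count joiner-dependent and atomic chillus in text (hypothetical helper)."""
    return {
        "legacy": len(LEGACY_CHILLU.findall(text)),
        "atomic": len(ATOMIC_CHILLU.findall(text)),
    }
```

If the "legacy" count is above zero, the text depends on joiners and is exposed to the stripping described in rule 1, so keep a plain-text backup before it goes anywhere near DOCX.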

Even if you use the latest standard for inputting Malayalam Unicode (the one that relies on atomic CHILLAKSHARA), you will end up in trouble. Once you close your DOCX and reopen it, you will see many clubbed words.
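One rough way to confirm the damage without reading every line is to compare the text before saving with the text after reopening. The Python sketch below assumes you have both versions as plain strings (for example, copied into UTF-8 text files); `roundtrip_report` is a hypothetical helper name. It reports how many joiners disappeared and how far the word count dropped, on the assumption that clubbed words show up as fewer whitespace-separated words.

```python
ZWJ = "\u200D"   # zero width joiner
ZWNJ = "\u200C"  # zero width non-joiner


def roundtrip_report(before, after):
    """Compare text before saving and after reopening (hypothetical helper).

    Positive numbers mean something was lost in the round trip.
    """
    return {
        "zwj_lost": before.count(ZWJ) - after.count(ZWJ),
        "zwnj_lost": before.count(ZWNJ) - after.count(ZWNJ),
        # str.split() with no argument splits on runs of whitespace
        "words_lost": len(before.split()) - len(after.split()),
    }
```

A report of all zeros does not prove the file is intact, but any positive value is a clear sign that the format has altered the text.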

Rule number 2: Avoid RTF when you deal with Malayalam Unicode. You will face the same problems with RTF.

Rule number 3: Always use DOC. You may be using Windows 7 or 8 with the latest Office suite, but do not venture into the newer Word formats. The stable Word format is DOC. I suggest using Microsoft Word 2003 for stable output.