
3.7. Character sets and Unicode

Character sets are a little bit nasty. The reason for this is that we are working with three (variable) charsets in the Python bindings:
What’s more, a user can send information using either a string or a unicode type in Python.
This is the way that charsets are used:
The nice thing about this is that when you parse command-line arguments or print to the terminal, you never have to do any charset conversions. The drawback is that if you know that you are receiving, say, UTF-8 from some other library (e.g. an XML reader), then you can do one of two things:
  1. Make sure that the current locale uses UTF-8 (use the locale command from the bash shell to check your locale)
  2. Convert the UTF-8 data from the other library to unicode strings and use the PT_UNICODE data types (and possibly the MAPI_UNICODE flag, which only affects strings passed in the argument list of a method call):
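message = folder.CreateMessage(0)
# UTF-8 encoded byte string, e.g. as returned by an XML library
s = 'some string from XML lib'

# Decode to a unicode string and store it in the PT_UNICODE variant of the property
message.SetProps([SPropValue(PR_SUBJECT_W, s.decode('utf-8'))])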
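
The two options above can also be checked from Python itself. The following is a minimal sketch that uses only the standard library and makes no MAPI calls; the helper names is_utf8 and to_unicode are hypothetical and not part of the Zarafa bindings. It reports whether the preferred locale encoding is UTF-8 (option 1) and decodes UTF-8 bytes to a unicode string before they would be handed to a PT_UNICODE property (option 2).

```python
import codecs
import locale


def is_utf8(encoding_name):
    """Return True if encoding_name is an alias of UTF-8 (e.g. 'UTF8')."""
    # Hypothetical helper, not part of the Zarafa Python bindings.
    try:
        return codecs.lookup(encoding_name).name == 'utf-8'
    except LookupError:
        # Unknown or empty encoding names are treated as "not UTF-8".
        return False


def to_unicode(data, encoding='utf-8'):
    """Decode a byte string; pass unicode strings through unchanged."""
    # Hypothetical helper, not part of the Zarafa Python bindings.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Option 1: check whether the current locale uses UTF-8.
locale_is_utf8 = is_utf8(locale.getpreferredencoding())

# Option 2: decode explicitly before using a PT_UNICODE property.
subject = to_unicode(b'caf\xc3\xa9 from XML lib')
```

If locale_is_utf8 is False, decoding explicitly as in option 2 is probably the safer route, since it does not depend on the environment the program runs in.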

Note

In upcoming versions of Zarafa, unicode will receive server-wide support, but the current version of Zarafa internally works with the windows-1252 charset (closely related to ISO-8859-1, also known as Latin-1, and to ISO-8859-15). This means that although using unicode strings is supported, any character outside the windows-1252 charset will be converted to a question mark (?).
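
To get a feeling for which characters are affected, the loss can be approximated locally with Python's own cp1252 codec. This is only a sketch: the helper names lossy_characters and simulate_storage are hypothetical, and the server's actual conversion may differ in detail from Python's codec.

```python
def lossy_characters(text, charset='cp1252'):
    """Return the characters of text that charset cannot represent."""
    # Hypothetical helper; approximates the server with Python's codec.
    lost = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


def simulate_storage(text, charset='cp1252'):
    """Round-trip text through charset, replacing unsupported characters."""
    # 'replace' substitutes a question mark for each unencodable character.
    return text.encode(charset, 'replace').decode(charset)


# The euro sign exists in windows-1252, the CJK character does not.
sample = u'\u20ac 5 \u4e2d'
lost = lossy_characters(sample)
stored = simulate_storage(sample)
```

A check like this can be run on a subject or body before calling SetProps, so that the program can warn about data that would not survive storage.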

Note

The Python interface will not change for the upcoming unicode version. Python programs written for Zarafa 6.30 or 6.40 will work unchanged on 7.00.