For a basic explanation of encodings, see The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – Joel on Software; for its impact on the .NET world, see Strings in C# and .NET and Unicode and .NET.
Late edit: Eric Sink shows us how Microsoft has become trapped with UTF-16.
They committed to UCS-2 too early. When computers had to handle more characters than UCS-2 could represent (notably with the arrival of the large East Asian character sets), their installed base was too big to switch to UTF-8, so they chose UTF-16, simply because, if you ignore the supplementary characters, UCS-2 and UTF-16 match. UCS-4 was too big, so a variable-length encoding was mandatory, but UTF-16 is the wrong compromise: you get the complexity of a variable-length encoding without the compactness of UTF-8. Even worse, it tempts you to ignore surrogate pairs and treat UTF-16 as a fixed-length encoding. Lots of .NET programmers think that System.String is just an array of 16-bit values.
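That assumption breaks as soon as a string contains a character outside the Basic Multilingual Plane. Here is a minimal sketch; it uses Java, whose String has the same UTF-16 code-unit model as .NET's System.String, so the same pitfall applies in C#:

```java
public class SurrogateDemo {
    public static void main(String[] args) {
        // U+1F600 (an emoji) lies outside the BMP, so UTF-16 stores it
        // as a surrogate pair: TWO 16-bit code units for ONE character.
        String s = "a\uD83D\uDE00"; // "a" followed by U+1F600

        // length() counts 16-bit code units, not characters:
        System.out.println(s.length());                       // prints 3
        // codePointCount counts actual Unicode code points:
        System.out.println(s.codePointCount(0, s.length())); // prints 2
    }
}
```

In C# the equivalent trap is `str.Length` versus iterating with `System.Globalization.StringInfo`: any code that indexes a string as if one element were one character silently corrupts supplementary characters.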