Strange characters – despite everything being UTF-8

This typically happens when you copy and paste text from MS Word into the WordPress content editor. WordPress applies something called “Smart Quotes” via a function named wptexturize(). The ideal solution would be to go back through your content and retype every single and double quote from the keyboard. However, if you’re working with massive copy/pastes, …

Byte and char conversion in Java

A char in Java is a UTF-16 code unit, which is treated as an unsigned 16-bit number. So if the byte b holds -56 (0xC8) and you perform c = (char)b, the value you get is 2^16 - 56, that is 65536 - 56 = 65480. More precisely, the byte is first converted to a signed int with the value 0xFFFFFFC8 using sign extension in a widening conversion. This in turn is …
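The conversion described in this excerpt can be sketched in a few lines of Java. The class and method names below are hypothetical, chosen only for illustration, and the byte value 0xC8 (-56) is assumed from the 0xFFFFFFC8 figure quoted above. The second method shows the common masking idiom for treating the byte as unsigned instead, which is not part of the excerpt.

```java
class ByteCharDemo {
    // Plain cast: the byte is sign-extended to an int (0xFFFFFFC8 for -56),
    // then narrowed to char by keeping the low 16 bits (0xFFC8).
    static char signExtended(byte b) {
        return (char) b;
    }

    // Masking with 0xFF first discards the sign extension,
    // so the byte is interpreted as a value in the range 0..255.
    static char unsigned(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xC8; // -56 as a signed byte

        System.out.println((int) signExtended(b)); // 65480, i.e. 65536 - 56
        System.out.println((int) unsigned(b));     // 200
    }
}
```

Whether the plain cast or the masked version is appropriate depends on whether the bytes are meant as signed numbers or as raw 8-bit values.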

Is ‘# -*- coding: utf-8 -*-’ also a comment in Python?

Yes, it is also a comment, and its contents carry special meaning when it appears in the first or second line of the file. From the Encoding declarations documentation: If a comment in the first or second line of the Python script matches the regular expression coding[=:]\s*([-\w.]+), this comment is processed as an encoding declaration; the …