UnicodeDecodeError, invalid continuation byte

In binary, 0xE9 is 1110 1001. If you read about UTF-8 on Wikipedia, you’ll see that a byte of the form 1110 xxxx starts a three-byte sequence, so it must be followed by two continuation bytes of the form 10xx xxxx. But that’s just the mechanical cause of the exception. In this case, you have a string that is almost certainly encoded in …
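The mechanical cause described above can be reproduced in a few lines. This is a minimal sketch, not the code from the original answer: the sample bytes are hypothetical, and Latin-1 is only an assumed example of a single-byte encoding in which 0xE9 stands for é.

```python
# Hypothetical sample: "café au lait" stored in a single-byte encoding
# (Latin-1 assumed here), where é is the lone byte 0xE9.
data = b"caf\xe9 au lait"

reason = None
bad_byte = None
try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xE9 announces a three-byte sequence, but the next byte (0x20, a space)
    # is not of the form 10xx xxxx, so the decoder gives up here.
    reason = exc.reason
    bad_byte = data[exc.start]

# Decoding with the encoding the bytes were actually written in succeeds.
as_latin1 = data.decode("latin-1")

# For comparison: 0xE9 followed by two real continuation bytes is valid UTF-8,
# and é itself takes two bytes in UTF-8.
valid_sequence = b"\xe9\x80\x80".decode("utf-8")
e_acute_utf8 = "\u00e9".encode("utf-8")

print(reason, hex(bad_byte), as_latin1)
```

If the real data uses a different legacy encoding, only the argument to the second `decode` call changes.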

What Unicode characters represent “time”?

Several code points exist for clocks, watches, and other devices that indicate time. You can copy and paste the characters from this page into most editors. At unicode-table.com you might find more useful code points.
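One way to inspect such code points is Python's standard `unicodedata` module. The sketch below lists a handful of time-related characters that I am confident exist in Unicode. The selection is my own and may not match the list in the original answer.

```python
import unicodedata

# A hand-picked selection of time-related code points (an assumption of this
# sketch, not the original answer's list).
candidates = [0x231A, 0x231B, 0x23F0, 0x23F1, 0x23F2, 0x23F3]
# U+1F550..U+1F567 are the clock faces (one o'clock through twelve-thirty).
candidates += list(range(0x1F550, 0x1F568))

time_chars = {}
for cp in candidates:
    ch = chr(cp)
    # Fall back to a placeholder if the local Unicode database lacks the name.
    time_chars[cp] = unicodedata.name(ch, "<unnamed>")
    print(f"U+{cp:04X} {ch} {time_chars[cp]}")
```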

How well is Unicode supported in C++11?

How well does the C++ standard library support Unicode? Terribly. A quick scan through the library facilities that might provide Unicode support gives me this list: the Strings library, the Localization library, the Input/output library, and the Regular expressions library. I think all but the first one provide terrible support. I’ll get back to it in more detail after a …

What is a unicode string?

Update: Python 3. In Python 3, Unicode strings are the default. The type str is a sequence of Unicode code points, and the type bytes is used for representing sequences of 8-bit integers (often interpreted as ASCII characters). Here is the code from the question, updated for Python 3: Working with files: Historical answer: Python 2. In Python 2, …
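The `str`/`bytes` distinction can be illustrated with a short sketch. This is not the code from the question (which is not reproduced in this excerpt). The sample text and the file name `sample.txt` are hypothetical.

```python
import os
import tempfile

# A str holds code points; escapes are used so the sample is unambiguous.
text = "na\u00efve caf\u00e9"

# Encoding turns code points into bytes; decoding reverses it.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

char_count = len(text)      # counts code points
byte_count = len(encoded)   # counts bytes; the two accented letters take 2 each

# Working with files: pass an explicit encoding so reads and writes agree.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        read_back = f.read()
    with open(path, "rb") as f:
        raw = f.read()  # binary mode returns bytes, not str

print(type(text).__name__, type(encoded).__name__, char_count, byte_count)
```

Naming the encoding explicitly in `open` avoids depending on the platform default, which differs between systems.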

Using unicode character u201c

The reason is that in Python 3.x you can’t just mix Unicode strings with byte strings. You have probably read manuals for Python 2.x, where such things are possible as long as the byte string contains convertible characters. It works fine for me, so the only explanation is that you’re using the wrong encoding for the source file or …
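The point about not mixing the two types can be seen in a small Python 3 sketch. The values are hypothetical and stand in for the code from the question, which is not reproduced in this excerpt.

```python
quote = "\u201c"        # LEFT DOUBLE QUOTATION MARK as a str
payload = b"abc"        # hypothetical byte string

mixing_failed = False
try:
    quote + payload     # str + bytes is rejected in Python 3
except TypeError:
    mixing_failed = True

# Decode the bytes first (ASCII assumed for this sample), then concatenate.
combined = quote + payload.decode("ascii")

# Going the other way: encode the str when bytes are required.
quote_utf8 = quote.encode("utf-8")

print(mixing_failed, combined, quote_utf8)
```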