What is the difference between UTF-8 and Unicode?

To expand on the answers others have given: We’ve got lots of languages with lots of characters that computers should ideally display. Unicode assigns each character a unique number, or code point. Computers deal with such numbers as bytes… skipping a bit of history here and ignoring memory addressing issues, 8-bit computers would treat an …
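To make the distinction concrete, here is a minimal Python 3 sketch. It shows the code point that Unicode assigns to a character next to the bytes UTF-8 uses to store it. The helper name describe and the sample characters are illustrative choices, not part of the original answer.

```python
# Unicode assigns the number (code point); UTF-8 is one way to store it as bytes.
samples = ["A", "é", "€", "😀"]  # illustrative characters


def describe(ch):
    """Return the code point label and the UTF-8 bytes (as hex) for one character."""
    code_point = ord(ch)              # the Unicode number
    utf8_bytes = ch.encode("utf-8")   # the UTF-8 byte sequence for that number
    hex_bytes = " ".join(f"{b:02x}" for b in utf8_bytes)
    return f"U+{code_point:04X}", hex_bytes


for ch in samples:
    label, encoded = describe(ch)
    print(ch, label, encoded)
```

The same code point always maps to the same character, while the number of bytes varies from one (for ASCII) to four under UTF-8.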

The origin on why ‘%20’ is used as a space in URLs

It’s called percent encoding. Some characters can’t appear in a URI as themselves (for example #, which marks the start of the URL fragment), so they are represented with characters that can (# becomes %23). Here’s an excerpt from that same article: When a character from the reserved set (a “reserved character”) has special meaning (a “reserved purpose”) in a certain context, …
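As a small illustration, Python's standard urllib.parse module can apply and reverse percent encoding. The file name used here is a made-up example, and safe="" is passed so that no reserved character is left unescaped.

```python
from urllib.parse import quote, unquote

# A space is byte 0x20 and '#' is byte 0x23, hence %20 and %23.
encoded_space = quote(" ", safe="")
encoded_hash = quote("#", safe="")

# Hypothetical file name containing both characters.
original = "my file #1.txt"
encoded = quote(original, safe="")
decoded = unquote(encoded)

print(encoded_space, encoded_hash)
print(encoded)
print(decoded)
```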

Is ‘# -*- coding: utf-8 -*-’ also a comment in Python?

Yes, it is also a comment. And the contents of that comment carry special meaning if located at the top of the file, in the first two lines. From the Encoding declarations documentation: If a comment in the first or second line of the Python script matches the regular expression coding[=:]\s*([-\w.]+), this comment is processed as an encoding declaration; the …
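The quoted regular expression can be tried directly. The following is only a rough sketch of the rule as described above, not how the interpreter itself is implemented; the function name declared_encoding is hypothetical.

```python
import re

# The pattern quoted from the documentation excerpt above.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(source_lines):
    """Return the encoding named in a comment on line 1 or 2, or None.

    Simplified sketch: only the first two lines are inspected, and only
    lines that are comments are considered.
    """
    for line in source_lines[:2]:
        if line.lstrip().startswith("#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1)
    return None


print(declared_encoding(["#!/usr/bin/env python", "# -*- coding: utf-8 -*-"]))
print(declared_encoding(["import os", "x = 1"]))
```

Because the pattern only looks for "coding" followed by ":" or "=", editor-style comments such as "# vim: set fileencoding=latin-1 :" match as well.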

python encoding utf-8

You don’t need to encode data that is already encoded. When you try to do that, Python will first try to decode it to unicode before it can encode it back to UTF-8. That is what is failing here. Just write your data directly to the file; there is no need to encode already-encoded data. If you build up unicode values instead, you would …
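The answer above describes Python 2, where calling encode on a byte string triggers an implicit decode first. As a rough Python 3 sketch of the same advice (an assumption about how the idea carries over, not the original poster's code): write bytes as they are, or write text and let the file object encode it exactly once. The helper names and sample string are illustrative.

```python
import os
import tempfile

text = "naïve café"                       # text (str): not yet encoded
already_encoded = text.encode("utf-8")    # bytes: already UTF-8


def write_bytes(path, data):
    """Write already-encoded data unchanged, using binary mode."""
    with open(path, "wb") as f:
        f.write(data)


def write_text(path, value):
    """Write text and let the file object do the single encoding step."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(value)


with tempfile.TemporaryDirectory() as tmp:
    bytes_path = os.path.join(tmp, "from_bytes.txt")
    text_path = os.path.join(tmp, "from_text.txt")
    write_bytes(bytes_path, already_encoded)
    write_text(text_path, text)
    with open(bytes_path, "rb") as f:
        bytes_result = f.read()
    with open(text_path, "rb") as f:
        text_result = f.read()

print(bytes_result == text_result)
```

Either route produces the same bytes on disk; the point is that the data is encoded only once.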

Python – Reading and writing csv files with utf-8 encoding

You report three separate problems. This is a bit of a shot in the dark, because there’s not enough information to be sure, but you should try the following: input encoding: As suggested in comments, try “utf-8-sig”. This will remove the Byte Order Mark (BOM) from your input. double quotes: Among the csv parameters, you specify quoting=csv.QUOTE_NONE. This tells the csv library …
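The effect of the “utf-8-sig” suggestion can be seen in a short sketch. The sample CSV bytes and the helper name read_rows are made up for illustration; an in-memory buffer stands in for the input file.

```python
import csv
import io

# Hypothetical CSV data that starts with a UTF-8 byte order mark (EF BB BF).
raw = b"\xef\xbb\xbfname,city\r\nAda,London\r\n"


def read_rows(data, encoding):
    """Decode the bytes with the given encoding and parse them as CSV."""
    with io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="") as text:
        return list(csv.reader(text))


rows_plain = read_rows(raw, "utf-8")      # BOM stays attached to the first field
rows_sig = read_rows(raw, "utf-8-sig")    # BOM is removed while decoding

print(repr(rows_plain[0][0]))
print(repr(rows_sig[0][0]))
```

With plain “utf-8” the first header comes back with a leading U+FEFF character, which is a common reason a column name fails to match; “utf-8-sig” strips it.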