UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte

It doesn't help that you have sys.setdefaultencoding('utf-8'), which confuses things further. It's a nasty hack and you need to remove it from your code. See https://stackoverflow.com/a/34378962/1554386 for more information. The error is happening because line is a byte string and you're calling encode() on it. encode() only makes sense if the string is a Unicode string, so Python tries to convert it to Unicode first using …
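
The decode failure named in the title can be reproduced in isolation. This is a minimal sketch in Python 3 syntax, with the decode step that Python 2 performs implicitly written out by hand. The sample bytes are made up for illustration and are not taken from the original question.

```python
# Hypothetical sample: three ASCII bytes followed by 0x80. In UTF-8, 0x80 is a
# continuation byte and can never be the first byte of a character.
raw = b"abc\x80"

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The exception records where decoding stopped and why.
print(error.start, error.reason)
```

A byte like this usually means the data is not UTF-8 at all, so the fix is to find out the real encoding rather than to call encode() again.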

Using unicode character u201c

The reason is that in Python 3.x you can't just mix Unicode strings with byte strings. You have probably read manuals dealing with Python 2.x, where such things are possible as long as the byte string contains convertible characters. It works fine for me, so the only reason is that you're using the wrong encoding for the source file or …
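
The str/bytes separation in Python 3 can be sketched with the character from the title. The variable names are illustrative only and do not come from the original question.

```python
quote = "\u201c"                  # LEFT DOUBLE QUOTATION MARK as text (str)
encoded = quote.encode("utf-8")   # the same character as UTF-8 bytes

try:
    mixed = quote + encoded       # str + bytes is rejected in Python 3
except TypeError:
    mixed = None

# Decode the bytes explicitly before combining them with text.
combined = quote + encoded.decode("utf-8")
```

Python 2 would have attempted the conversion silently, which is why older tutorials show this kind of mixing without an explicit decode.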

python encoding utf-8

You don't need to encode data that is already encoded. When you try to do that, Python will first try to decode it to unicode before it can encode it back to UTF-8. That is what is failing here. Just write your data directly to the file; there is no need to encode already-encoded data. If you build up unicode values instead, you would …
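
The "encode only once" rule can be sketched as follows. The example uses Python 3 syntax, whereas the answer describes Python 2 behaviour, and the file name and sample text are hypothetical. The second option is one common way to handle text values, not necessarily what the truncated answer goes on to recommend.

```python
import os
import tempfile

# Bytes that are already UTF-8 encoded, e.g. as received from another source.
already_encoded = "caf\u00e9".encode("utf-8")

# Hypothetical output file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

# Option 1: the data is already bytes, so write it in binary mode as-is.
with open(path, "wb") as f:
    f.write(already_encoded)

# Option 2: keep the data as text and let the file object encode it once.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")

# Both options leave the same bytes on disk.
with open(path, "rb") as f:
    on_disk = f.read()
```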

Python – Reading and writing csv files with utf-8 encoding

You report three separate problems. This is a bit of a shot in the dark, because there's not enough information to be sure, but you should try the following. Input encoding: as suggested in the comments, try "utf-8-sig". This will remove the byte order mark (BOM) from your input. Double quotes: among the csv parameters, you specify quoting=csv.QUOTE_NONE. This tells the csv library …
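
The effect of "utf-8-sig" on the input can be sketched with an in-memory file. The CSV content and the read_rows helper are hypothetical and only illustrate the BOM handling. The quoting issue is not covered here.

```python
import csv
import io

# Hypothetical CSV bytes that start with a UTF-8 byte order mark.
raw = b"\xef\xbb\xbfname,city\r\nZo\xc3\xab,K\xc3\xb6ln\r\n"

def read_rows(data, encoding):
    """Decode the bytes with the given encoding and parse them as CSV."""
    # newline="" lets the csv module handle line endings itself.
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return list(csv.reader(text))

rows_plain = read_rows(raw, "utf-8")     # BOM stays attached to the first field
rows_sig = read_rows(raw, "utf-8-sig")   # BOM is stripped while decoding
```

With plain "utf-8" the first header comes back as "\ufeffname", which is enough to break lookups by column name. With "utf-8-sig" it is simply "name".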
