Python CSV error: line contains NULL byte

I’m working with some CSV files, with the following code:

import csv
import sys

reader = csv.reader(open(filepath, "rU"))
try:
    for row in reader:
        print 'Row read successfully!', row
except csv.Error as e:
    sys.exit('file %s, line %d: %s' % (filepath, reader.line_num, e))

And one file is throwing this error:

file my.csv, line 1: line contains NULL byte

What can I do? Google seems to suggest that it may be an Excel file that’s been saved as a .csv improperly. Is there any way I can get round this problem in Python?
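
For the case where the NUL bytes are only stray padding in a file that is otherwise plain text, one possible workaround is to drop them before the lines reach csv.reader. This is a minimal sketch written for Python 3, not the code above; the helper name rows_without_nuls and the sample data are made up for illustration. It will not help if the file is not really CSV at all.

```python
import csv
import io


def rows_without_nuls(lines):
    """Yield CSV rows from an iterable of text lines, dropping NUL characters first.

    Hypothetical helper: `lines` can be an open text file or any iterable of str.
    """
    # Remove every NUL character lazily, one line at a time.
    cleaned = (line.replace('\x00', '') for line in lines)
    for row in csv.reader(cleaned):
        yield row


# Made-up sample data with one stray NUL in the first line.
sample = io.StringIO('a,b\x00,c\r\n1,2,3\r\n')
print(list(rows_without_nuls(sample)))
```

With a real file, the same helper could be fed an open file object instead of the in-memory sample. Note that this silently discards bytes, so it is only reasonable once you know the NULs carry no information.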

== UPDATE ==

Following @JohnMachin’s comment below, I tried adding these lines to my script:

print repr(open(filepath, 'rb').read(200)) # dump 1st 200 bytes of file
data = open(filepath, 'rb').read()
print data.find('\x00')
print data.count('\x00')

And this is the output I got:

'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00\x00\x00\x00\x00\x00\x00\x00\ .... <snip>
8
13834

So the file does indeed contain NUL bytes.
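
One further observation: the first eight bytes in that dump, \xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1, match the signature of the OLE2 Compound File format that legacy Excel .xls workbooks use, which would also explain why the first NUL sits at offset 8. The sketch below (Python 3) checks a file header for that signature and for a UTF-16 byte order mark, another common source of NUL bytes in "CSV" files. The helper name describe_header and its return strings are assumptions made for illustration, and the check is a heuristic, not a full file-type detector.

```python
# Signature of the OLE2 Compound File format used by legacy Excel .xls files.
OLE2_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'

# Byte order marks for UTF-16 little-endian and big-endian text.
UTF16_BOMS = (b'\xff\xfe', b'\xfe\xff')


def describe_header(head):
    """Guess what kind of file the leading bytes belong to (hypothetical helper)."""
    if head.startswith(OLE2_MAGIC):
        return 'OLE2 compound file (probably a real .xls workbook)'
    if head.startswith(UTF16_BOMS):
        return 'UTF-16 text (NULs are part of the encoding)'
    if b'\x00' in head:
        return 'unknown data containing NUL bytes'
    return 'no NUL bytes in header'


# The header bytes from the dump above: the signature followed by NULs.
head = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1' + b'\x00' * 8
print(describe_header(head))
```

On a real file, head would come from something like open(filepath, 'rb').read(16). If the signature matches, the file is a binary workbook with a .csv extension, so it would need to be re-exported as CSV from Excel or read with a spreadsheet library rather than the csv module.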
