memory error in python

This one here:

s = raw_input()
a=len(s)
for i in xrange(0, a):
    for j in xrange(0, a):
        if j >= i:
            if len(s[i:j+1]) > 0:
                sub_strings.append(s[i:j+1])

seems to be very inefficient and expensive for large strings.

Better do

for i in xrange(0, a):
    for j in xrange(i, a): # ensures that j >= i, no test required
        part = buffer(s, i, j+1-i) # don't duplicate data
        if len(part) > 0:
            sub_Strings.append(part)

A buffer object keeps a reference to the original string and start and length attributes. This way, no unnecessary duplication of data occurs.

A string of length l has l*l/2 sub strings of average length l/2, so the memory consumption would roughly be l*l*l/4. With a buffer, it is much smaller.

Note that buffer() only exists in 2.x. 3.x has memoryview(), which is utilized slightly different.

Even better would be to compute the indexes and cut out the substring on demand.

Leave a Comment