You can use the usecols parameter with the positions of the columns you want to keep:
    import pandas as pd
    from io import StringIO  # pandas.compat.StringIO was removed in newer pandas versions

    temp = u"""TIME                        XGSM
    2004 006 01 00 01 37 600    1
    2004 006 01 00 02 32 800    5
    2004 006 01 00 03 28 000    8
    2004 006 01 00 04 23 200    11
    2004 006 01 00 05 18 400    17"""
    # after testing, replace StringIO(temp) with the filename
    df = pd.read_csv(StringIO(temp),
                     sep=r"\s+",          # split on any run of whitespace
                     skiprows=1,          # skip the original header line
                     usecols=[0, 7],      # keep the first and the last field
                     names=['TIME', 'XGSM'])
    print(df)

       TIME  XGSM
    0  2004     1
    1  2004     5
    2  2004     8
    3  2004    11
    4  2004    17
Edit:
You can use a regex separator that matches two or more whitespace characters, and then add engine='python' to avoid this warning:
ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
    import pandas as pd
    from io import StringIO  # pandas.compat.StringIO was removed in newer pandas versions

    temp = u"""TIME                        XGSM
    2004 006 01 00 01 37 600    1
    2004 006 01 00 02 32 800    5
    2004 006 01 00 03 28 000    8
    2004 006 01 00 04 23 200    11
    2004 006 01 00 05 18 400    17"""
    # after testing, replace StringIO(temp) with the filename
    # the regex separator splits only on runs of two or more whitespace characters
    df = pd.read_csv(StringIO(temp), sep=r'\s{2,}', engine='python')
    print(df)

                           TIME  XGSM
    0  2004 006 01 00 01 37 600     1
    1  2004 006 01 00 02 32 800     5
    2  2004 006 01 00 03 28 000     8
    3  2004 006 01 00 04 23 200    11
    4  2004 006 01 00 05 18 400    17
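To see why the two separators behave differently, here is a minimal sketch that uses only the standard re module on one sample line. It assumes the time fields are separated by single spaces and the last column by at least two spaces; the variable names are made up for illustration.

```python
import re

# One data row; assumes 2+ spaces before the last column
line = "2004 006 01 00 01 37 600    1"

# Any run of whitespace is a delimiter: every number becomes its own field
fields_any = re.split(r"\s+", line)

# Only runs of 2+ whitespace characters are delimiters:
# the single-spaced time stamp stays together as one field
fields_wide = re.split(r"\s{2,}", line)

print(fields_any)   # ['2004', '006', '01', '00', '01', '37', '600', '1']
print(fields_wide)  # ['2004 006 01 00 01 37 600', '1']
```

If the real file separates the last column with only a single space, the \s{2,} separator will not split it, and selecting fields by position with usecols is the safer choice.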