pandas
Pandas: change data type of Series to String
You can convert all elements of id to str using apply. Edit by OP: I think the issue was related to the Python version (2.7); this worked:
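A minimal sketch of the apply approach, assuming a hypothetical DataFrame df with an integer id column (the sample values are invented for illustration, and this is not necessarily the exact snippet the OP used):

```python
import pandas as pd

# Hypothetical data: an integer "id" column.
df = pd.DataFrame({"id": [123, 512, 99]})

# Apply the built-in str to every element of the column.
df["id"] = df["id"].apply(str)

# df["id"].astype(str) is a common alternative with the same result here.
print(df["id"].tolist())
```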
Python: create a pandas data frame from a list
DataFrame.from_records treats a string as a list of characters, so it needs as many columns as the string has characters. You can simply use the DataFrame constructor instead.
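As a rough illustration of the constructor route, assuming a hypothetical list of strings called names and a made-up column label:

```python
import pandas as pd

# Hypothetical input list; each string becomes one row, not one row of characters.
names = ["alice", "bob", "carol"]

# The plain constructor puts each list element in its own row of a single column.
df = pd.DataFrame(names, columns=["name"])
print(df)
```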
Group by index + column in pandas
From version 0.20.1 it is simpler: strings passed to DataFrame.groupby() as the by parameter may now reference either column names or index level names.
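A small sketch of grouping by an index level and a column at once, assuming a hypothetical frame whose index is named idx and which has a column named key:

```python
import pandas as pd

# Hypothetical frame: a named index level ("idx") plus ordinary columns.
df = pd.DataFrame(
    {"key": ["a", "b", "a", "a"], "value": [1, 2, 3, 4]},
    index=pd.Index(["x", "x", "y", "y"], name="idx"),
)

# "idx" refers to the index level and "key" to the column; both are passed as plain strings.
result = df.groupby(["idx", "key"])["value"].sum()
print(result)
```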
xlrd.biffh.XLRDError: Excel xlsx file; not supported
As noted in the release email, linked to from the release tweet, in the large orange warning on the front page of the documentation, and (less orange, but still present) in the readme on the repository and the release on PyPI: xlrd has explicitly removed support for anything other than xls files. In your case, the solution is to: …
How to count the NaN values in a column in pandas DataFrame
You can use the isna() method (or its alias isnull(), which is also compatible with older pandas versions < 0.21.0) and then sum() to count the NaN values. This works for a single column and for several columns alike.
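A short sketch of both cases, assuming a hypothetical frame with columns a and b:

```python
import numpy as np
import pandas as pd

# Hypothetical data with some missing values.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, np.nan], "b": [np.nan, 2.0, 3.0, 4.0]}
)

# One column: isna() gives booleans, and sum() counts the True values.
nan_in_a = df["a"].isna().sum()

# Several columns: the same call returns one count per column.
nan_per_column = df[["a", "b"]].isna().sum()
print(nan_in_a, nan_per_column.to_dict())
```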
How to load a tsv file into a Pandas DataFrame?
The read_csv function does what you want when given a tab separator (sep='\t'). If the file has a header row, you can pass header=0. Note: prior to version 0.17.0, pd.DataFrame.from_csv was used (it is now deprecated, and the .from_csv documentation link redirects to the page for pd.read_csv).
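A self-contained sketch of reading tab-separated data with read_csv. An in-memory string stands in for the file here, and its contents are invented; with a real file you would pass its path instead:

```python
import io

import pandas as pd

# Hypothetical TSV content; io.StringIO stands in for a file on disk.
tsv_text = "name\tscore\nalice\t90\nbob\t85\n"

# sep="\t" selects tab as the delimiter; header=0 uses the first line as column names.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=0)
print(df)
```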
Python: pandas merge multiple dataframes
Below is the cleanest, most comprehensible way of merging multiple dataframes if complex queries aren’t involved. Simply merge with DATE as the index, using the OUTER method (to get all the data). First, load all the files you have as data frames into a list. Then merge them using the merge or reduce function. Note: you can add as …
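One way this could look, as a sketch only: it assumes three hypothetical frames that share an ordinary DATE column used as the merge key, and it folds pd.merge over the list with functools.reduce:

```python
from functools import reduce

import pandas as pd

# Hypothetical frames that share a DATE column.
df1 = pd.DataFrame({"DATE": ["2020-01-01", "2020-01-02"], "a": [1, 2]})
df2 = pd.DataFrame({"DATE": ["2020-01-02", "2020-01-03"], "b": [3, 4]})
df3 = pd.DataFrame({"DATE": ["2020-01-01", "2020-01-03"], "c": [5, 6]})
frames = [df1, df2, df3]

# Merge the frames pairwise, left to right; how="outer" keeps every DATE seen in any frame.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="DATE", how="outer"),
    frames,
)
print(merged)
```

Dates missing from a given frame show up as NaN in that frame's columns.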
Combine two columns of text in pandas dataframe
If both columns are strings, you can concatenate them directly. If one (or both) of the columns is not string-typed, convert it (them) first. Beware of NaNs when doing this! If you need to join multiple string columns, you can use agg with a separator such as “-”.
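A brief sketch of the three cases, assuming a hypothetical frame with string columns first and second and an integer column num:

```python
import pandas as pd

# Hypothetical data: two string columns and one integer column.
df = pd.DataFrame({"first": ["a", "b"], "second": ["x", "y"], "num": [1, 2]})

# Both columns are strings: concatenate directly.
df["both"] = df["first"] + df["second"]

# One column is not a string: convert it first.
df["with_num"] = df["first"] + df["num"].astype(str)

# Several string columns: join each row with "-" as the separator.
df["joined"] = df[["first", "second"]].agg("-".join, axis=1)
print(df)
```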
AttributeError: Can only use .dt accessor with datetimelike values
Your problem here is that to_datetime silently failed, so the dtype remained str/object. If you set the parameter errors='coerce', any string that fails to convert is set to NaT. You then need to find out what is wrong with those specific row values. See the docs.
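A sketch of how the offending values might be located, assuming a hypothetical Series of date strings in which one entry is malformed:

```python
import pandas as pd

# Hypothetical raw strings; the middle one cannot be parsed as a date.
raw = pd.Series(["2021-01-15", "not a date", "2021-03-02"])

# errors="coerce" turns unparseable strings into NaT instead of failing.
parsed = pd.to_datetime(raw, errors="coerce")

# The original values that failed to convert.
bad_values = raw[parsed.isna()]

# The result is datetime-typed, so the .dt accessor is available.
months = parsed.dt.month
print(bad_values.tolist())
```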