Data type conversion error: ValueError: Cannot convert non-finite values (NA or inf) to integer
If your DataFrame is large, you probably won’t spot the missing values by eye. You can use the fillna function to replace them before converting to integer.
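As a rough sketch of that approach (the column name `count` and the fill value 0 are assumptions made up for illustration, not part of the original answer), you can count the offending values first and then fill them before casting:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value and one infinity in a float column.
df = pd.DataFrame({"count": [1.0, np.nan, 3.0, np.inf]})

# Casting directly fails with the ValueError from the title.
error_message = None
try:
    df["count"].astype(int)
except ValueError as exc:
    error_message = str(exc)

# Count the problem values instead of scanning the frame by eye.
n_missing = int(df["count"].isna().sum())
n_infinite = int(np.isinf(df["count"]).sum())

# Assumption: 0 is an acceptable stand-in for both NaN and +/-inf.
cleaned = (
    df["count"]
    .replace([np.inf, -np.inf], np.nan)  # fillna does not touch inf
    .fillna(0)
    .astype(int)
)
```

Note that fillna only handles NaN, so infinities are turned into NaN first. Whether 0 is a sensible replacement depends on your data.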
I ran into a similar problem. It turned out the CSV I had downloaded had no permissions at all. The error message from pandas did not point this out, which made it hard to debug. Check that your file has read permissions.
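One way to check this before handing the path to pandas is the standard library's `os.access`. This is only a sketch: the helper name `is_readable` is made up here, and the temporary file stands in for your downloaded CSV.

```python
import os
import tempfile

def is_readable(path):
    """Return True if `path` is an existing file the current user may read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)

# Throwaway file standing in for the downloaded CSV.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b\n1,2\n")
    csv_path = tmp.name

readable_before = is_readable(csv_path)  # file exists and is readable
os.remove(csv_path)
readable_after = is_readable(csv_path)   # file is gone, so not readable
```

If the check returns False for a file that does exist, fixing the permissions (for example with chmod on Linux or macOS) is the likely next step.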
Just following on from Matt and Dirk: if you want to recreate your existing data frame without changing the global option, you can recreate it with an apply statement. This will convert all variables to class “character”; if you want to convert only factors, see Marek’s solution below. As @hadley points out, the following is more concise. …
You need to select that column. You can check the docs and also 10 minutes to pandas, which shows the semantics. EDIT: If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast the dtype to int; this converts True and False to 1 and 0 respectively.
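A minimal sketch of both steps, assuming a made-up column `score` and a made-up threshold of 50 (the original question's names are not shown here):

```python
import pandas as pd

# Hypothetical data; the column name and threshold are illustrative only.
df = pd.DataFrame({"score": [10, 55, 70]})

# Selecting a single column returns a Series.
selected = df["score"]

# The comparison yields a boolean Series ...
mask = df["score"] > 50

# ... and casting to int maps False -> 0 and True -> 1.
df["passed"] = mask.astype(int)
```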
Here’s an example using apply on the DataFrame, called with axis=1. Note the difference: instead of trying to pass two values to the function f, rewrite the function to accept a pandas Series object, and then index the Series to get the values needed. Depending on your use case, it is sometimes …
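Something along these lines shows the idea. The column names `width` and `height` and the function body are assumptions for illustration, since the original example is not reproduced here:

```python
import pandas as pd

# Hypothetical two-column frame.
df = pd.DataFrame({"width": [2, 3], "height": [5, 7]})

def f(row):
    # With axis=1, `row` is a pandas Series holding one row,
    # indexed by column name.
    return row["width"] * row["height"]

# apply calls f once per row and collects the results into a Series.
df["area"] = df.apply(f, axis=1)
```

For simple arithmetic like this, the vectorized form `df["width"] * df["height"]` gives the same result and is usually faster. Row-wise apply is mainly useful when the logic does not vectorize easily.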
You need astype, which also handles converting to categorical. Another solution is Categorical.
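A short sketch of both routes, using a made-up `size` column (the sample data from the original answer is not shown here):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"size": ["small", "large", "small"]})

# Route 1: astype with the "category" dtype.
as_category = df["size"].astype("category")

# Route 2: pd.Categorical, which also lets you fix the category
# order explicitly (the order chosen here is an assumption).
ordered = pd.Categorical(
    df["size"], categories=["small", "large"], ordered=True
)
df["size_cat"] = ordered
```

The second form is handy when the categories have a natural order that differs from alphabetical sorting.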
To delete multiple columns at the same time in pandas, pass the column names as a list to drop. The option inplace=True is needed if you want the change applied to the same DataFrame; otherwise, remove it. Source: Python Pandas – Deleting multiple series from a data frame in one command
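For illustration, with made-up column names `a` to `d`, the two variants could look like this:

```python
import pandas as pd

# Hypothetical frame with four columns.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})

# Without inplace, drop returns a new DataFrame and leaves df untouched.
trimmed = df.drop(columns=["b", "c"])

# With inplace=True, df itself is modified and the call returns None.
df.drop(columns=["d"], inplace=True)
```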
You can use pd.Series.isin. For “IN” use something.isin(somewhere); for “NOT IN” use ~something.isin(somewhere).
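A small sketch with made-up data (the column name `country` and the list `wanted` are illustrative assumptions):

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"country": ["US", "UK", "Germany", "China"]})
wanted = ["UK", "China"]

# "IN": keep rows whose value appears in the list.
in_rows = df[df["country"].isin(wanted)]

# "NOT IN": ~ negates the boolean mask.
not_in_rows = df[~df["country"].isin(wanted)]
```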
You can either drop the columns you do not need or select the ones you need.
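Both routes end up in the same place, as this sketch with made-up column names suggests:

```python
import pandas as pd

# Hypothetical frame; "tmp" plays the unwanted column.
df = pd.DataFrame({"id": [1, 2], "name": ["x", "y"], "tmp": [0, 0]})

# Select the columns you need (note the double brackets: a list of names).
kept = df[["id", "name"]]

# Or drop the ones you do not need.
dropped = df.drop(columns=["tmp"])

# Both give the same DataFrame.
same_result = kept.equals(dropped)
```

Selecting tends to be clearer when you keep only a few columns, and dropping when you remove only a few.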
You can use the package sklearn and its associated preprocessing utilities to normalize the data. For more information look at the scikit-learn documentation on preprocessing data: scaling features to a range.
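As one possible sketch, `MinMaxScaler` from `sklearn.preprocessing` rescales each column to the range 0 to 1. The data below is made up, and min-max scaling is only one of several normalizations that the module offers:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical numeric data.
df = pd.DataFrame({"x": [0.0, 5.0, 10.0], "y": [100.0, 150.0, 200.0]})

# Default feature_range is (0, 1); each column is scaled independently.
scaler = MinMaxScaler()

# fit_transform returns a NumPy array, so wrap it back into a DataFrame
# to keep the original column names and index.
scaled = pd.DataFrame(
    scaler.fit_transform(df), columns=df.columns, index=df.index
)
```

After this, each column's minimum maps to 0 and its maximum to 1.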