Sven has shown how to use the class gaussian_kde from SciPy, but you will notice that it doesn’t look quite like what you generated with R. This is because gaussian_kde tries to infer the bandwidth automatically. You can adjust the bandwidth by overriding the covariance_factor function of the gaussian_kde class. First, here is what you get without changing that function:
However, if I use the following code:
    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.stats import gaussian_kde

    data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
    density = gaussian_kde(data)
    xs = np.linspace(0,8,200)
    density.covariance_factor = lambda : .25
    density._compute_covariance()
    plt.plot(xs,density(xs))
    plt.show()
I get
which is pretty close to what you are getting from R. What have I done? gaussian_kde uses a changeable function, covariance_factor, to calculate its bandwidth. Before changing the function, the value returned by covariance_factor for this data was about .5. Lowering it lowered the bandwidth. I had to call _compute_covariance after changing that function so that all of the factors would be calculated correctly. It isn’t an exact correspondence with the bw parameter from R, but hopefully it helps you get in the right direction.
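If your SciPy version supports it, the same factor can probably be set through the public bw_method argument instead of overriding covariance_factor and calling the private _compute_covariance. The sketch below is not part of the code above; it assumes a SciPy recent enough to accept bw_method and to expose the factor and covariance attributes. The last lines rest on my understanding that the kernel's standard deviation is the factor times the sample standard deviation, which is the quantity I would compare against R's bw.

```python
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
xs = np.linspace(0, 8, 200)

# Default bandwidth: Scott's rule, n ** (-1/5) for 1-D data (about .5 here).
default_kde = gaussian_kde(data)
default_factor = default_kde.factor

# A scalar bw_method is used directly as the covariance factor.
narrow_kde = gaussian_kde(data, bw_method=0.25)
ys = narrow_kde(xs)  # density values on the grid, ready for plt.plot(xs, ys)

# Assumption: kernel standard deviation = factor * sample std (ddof=1).
kernel_std = narrow_kde.factor * np.std(data, ddof=1)

print(default_factor, narrow_kde.factor, kernel_std)
```

To aim for a particular R bandwidth, you could try bw_method = bw / np.std(data, ddof=1), but treat that as a starting point rather than a guaranteed match.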