-
Notifications
You must be signed in to change notification settings - Fork 15
Open
Description
In Python + scikit-learn
from sklearn import feature_selection
import numpy as np
X = np.arange(10)
X = X / np.pi
X = X.reshape((10,1))
y = np.array([0.1, 0.3, 0.2, 0.4, 0.5, 0.7, 0.6, 0.8, 1.0, 0.9])
# By defaut: n_neighbors = 3
feature_selection.mutual_info_regression(X, y) # = 0.69563492
In Julia + CausalityTools
import CausalityTools: mutualinfo, KSG1, KSG2
X = (0:9) ./ π
y = [0.1, 0.3, 0.2, 0.4, 0.5, 0.7, 0.6, 0.8, 1.0, 0.9]
mutualinfo(KSG1(k=3), X, y) # = -0.43223601423458935
mutualinfo(KSG2(k=3), X, y) # = 0.4746008686099013
The results are clearly different. In addition, in scikit-learn, if mi<0, 0 will be reported.
For running time, I have
In Python
X = np.random.rand(171522,1)
y = np.random.rand(171522,)
tic=time.perf_counter(); feature_selection.mutual_info_regression(X,y); toc=time.perf_counter()
array([0.00072401])
toc - tic # = 0.96151989325881 seconds
In Julia
mutualinfo(KSG1(k=3), rand(171522), rand(171522))
@time mutualinfo(KSG1(k=3), rand(171522), rand(171522)) # 2.245426 seconds (857.67 k allocations: 83.230 MiB)
mutualinfo(KSG2(k=3), rand(171522), rand(171522))
@time mutualinfo(KSG2(k=3), rand(171522), rand(171522)) # 2.113910 seconds (857.68 k allocations: 91.082 MiB)
So, Julia implementation is much slower than Python implementation.
Metadata
Metadata
Assignees
Labels
No labels