https://github.com/hexiangnan/neural_factorization_machine/blob/master/LoadData.py#L47
In the read_features() function, you just initialize a dict and record each feature name together with the first user that has this feature. That makes no sense: every later occurrence of the feature is skipped, so data is lost.
This is a preview of the ml-tag.test.libfm file:
-1.0 51798:1 2473:1 37583:1
-1.0 66335:1 61344:1 29842:1
-1.0 89085:1 60033:1 47050:1
1.0 61293:1 8073:1 3903:1
-1.0 81335:1 56575:1 50067:1
-1.0 65166:1 48181:1 12510:1
-1.0 75300:1 26027:1 38510:1
1.0 10219:1 2122:1 383:1
1.0 80855:1 80856:1 24728:1
1.0 67033:1 721:1 19495:1
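For readers unfamiliar with the layout, here is a minimal sketch of how one row of the preview can be split into a label and its feature:value pairs. It assumes every row looks like the ones above (a label followed by space-separated `index:value` tokens); `parse_libfm_line` is a hypothetical helper name and is not part of LoadData.py.

```python
def parse_libfm_line(line):
    """Split one libfm-style row into (label, [(feature_index, value), ...]).

    Hypothetical helper for illustration; assumes 'label idx:val idx:val ...'.
    """
    parts = line.split()
    label = float(parts[0])  # first field is the target, e.g. -1.0 or 1.0
    pairs = []
    for token in parts[1:]:  # remaining fields look like '51798:1'
        index, value = token.split(":")
        pairs.append((int(index), float(value)))
    return label, pairs


# First row of the preview above
label, pairs = parse_libfm_line("-1.0 51798:1 2473:1 37583:1")
print(label, pairs)
```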
I rewrote the code and tested it on the file above, and clearly a lot of data is lost. This is badly wrong.
def read_features(file):  # read a feature file
    features = {}  # maps each feature token (e.g. '51798:1') to an index
    i = len(features)
    with open(file) as f:
        for line in f:
            items = line.strip().split(' ')
            for item in items[1:]:  # e.g. ['51798:1', '2473:1', '37583:1']
                if item not in features:
                    features[item] = i
                    i = i + 1
                else:
                    # the feature was already recorded by an earlier line
                    print('nfm load code error', i, item)
    return features
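To make the check easier to reproduce without printing one line per repeat, the same loop can also count how often each feature token occurs. This is only a sketch under the same format assumption: `index_and_count` is a hypothetical name, it works on an in-memory list of rows instead of a file, and it uses three rows copied from the preview above.

```python
from collections import Counter


def index_and_count(lines):
    """Return (feature -> index, feature -> number of occurrences).

    Hypothetical helper for illustration, not the repository's code.
    """
    features = {}
    counts = Counter()
    for line in lines:
        for token in line.split()[1:]:  # skip the label in the first field
            # assign the next free index the first time a token is seen
            features.setdefault(token, len(features))
            counts[token] += 1
    return features, counts


# Three rows copied from the preview above
preview = [
    "-1.0 51798:1 2473:1 37583:1",
    "-1.0 66335:1 61344:1 29842:1",
    "1.0 61293:1 8073:1 3903:1",
]
features, counts = index_and_count(preview)
# tokens that appear on more than one row (or more than once in a row)
repeated = {token: n for token, n in counts.items() if n > 1}
print(len(features), "distinct features,", sum(counts.values()), "tokens,",
      len(repeated), "repeated")
```

Feeding it the lines of the full ml-tag.test.libfm file instead of `preview` would show how many tokens hit the `else` branch in the function above.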