Hello, thank you for sharing this package. I'm hoping to use it to help group users into diverse groups based on socioeconomic factors like race, gender, age, etc. Our dataset contains 20 factors that need to be taken into consideration. Have you used this to solve such a problem?
I've started some preliminary testing and seem to be getting results, but I can't tell what is happening behind the scenes. I would also like to be able to weight each factor. For example, race may be the most important factor in some cases, while gender may be the most important in others.
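To illustrate what I mean by weighting, here is a minimal sketch in Python. It is not this package's API, just the general idea as I understand it. It assumes the clustering uses plain Euclidean distance, in which case multiplying each coordinate by the square root of its weight before clustering should have the same effect as a weighted distance. The `weights` values and the helper names are made up for the example; the two rows are users 1 and 2 from my sample data.

```python
import math

def scale_by_weights(point, weights):
    """Scale each coordinate by sqrt(weight).

    Squared Euclidean distance between scaled points then equals
    sum(weight_i * (a_i - b_i) ** 2) on the original points.
    """
    return [x * math.sqrt(w) for x, w in zip(point, weights)]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical weights: race counts four times as much as gender or age.
weights = [4.0, 1.0, 1.0]

user_1 = [-10, 6, 1]  # [race, gender, age]
user_2 = [3, 2, 1]

plain = euclidean(user_1, user_2)
weighted = euclidean(
    scale_by_weights(user_1, weights),
    scale_by_weights(user_2, weights),
)

print(f"plain distance:    {plain:.3f}")
print(f"weighted distance: {weighted:.3f}")
```

If the package lets me pass in pre-scaled points, I could apply something like this to all 20 factors before clustering and change the weights per use case.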
Here is what the data looks like:
user_id => [
race,
gender,
age
]
We store a numerical representation of each possible value:
array:10 [
1 => array:3 [
0 => -10
1 => 6
2 => 1
]
2 => array:3 [
0 => 3
1 => 2
2 => 1
]
3 => array:3 [
0 => 2
1 => 1
2 => 5
]
4 => array:3 [
0 => 9
1 => 3
2 => 4
]
5 => array:3 [
0 => -12
1 => 6
2 => 0
]
6 => array:3 [
0 => -6
1 => 7
2 => 3
]
7 => array:3 [
0 => 7
1 => 7
2 => 5
]
8 => array:3 [
0 => 4
1 => 4
2 => 0
]
9 => array:3 [
0 => 5
1 => 7
2 => 1
]
10 => array:3 [
0 => -11
1 => 3
2 => 2
]
]
I'm also curious: after the clustering is performed, is there any way to retrieve the original key for each data point? I need this to know which users are in each cluster.
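To make the question concrete, this is roughly the result I am after, sketched in Python rather than with this package. It assumes I can get the final centroids out of the clustering and then assign each user to the nearest one myself while keeping the user id next to the point. The centroid values and the function name `group_ids_by_nearest_centroid` are made up for the example; the points are users 1, 2, 5, and 7 from the data above.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def group_ids_by_nearest_centroid(points_by_id, centroids):
    """Return {cluster_index: [user_id, ...]} using the nearest centroid.

    The user id travels with its point, so it is never lost.
    """
    clusters = {index: [] for index in range(len(centroids))}
    for user_id, point in points_by_id.items():
        nearest = min(
            range(len(centroids)),
            key=lambda index: squared_distance(point, centroids[index]),
        )
        clusters[nearest].append(user_id)
    return clusters

# user_id => [race, gender, age], taken from the sample data
points_by_id = {
    1: [-10, 6, 1],
    2: [3, 2, 1],
    5: [-12, 6, 0],
    7: [7, 7, 5],
}

# Hypothetical centroids, standing in for whatever the clustering returns.
centroids = [[-11, 6, 0], [5, 4, 3]]

clusters = group_ids_by_nearest_centroid(points_by_id, centroids)
print(clusters)
```

If the package already keeps the keys, or lets me attach arbitrary data to each point, that would be simpler than re-assigning points like this.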
If this is not the appropriate channel for this type of question, or beyond the scope of the repo, please let me know. I certainly appreciate any feedback you may have. Thank you :)