
Multidimensional arrays and diversity clustering #35

@LarryBarker

Hello, and thank you for sharing this package. I'm hoping to use it to sort users into diverse groups based on socioeconomic factors such as race, gender, and age. Our dataset contains 20 factors that need to be taken into account. Have you used the package for this kind of problem?

I've started some preliminary testing and seem to be getting results, but I can't tell what is happening behind the scenes. I would also like to be able to weight each factor: for example, race may be the most important factor in some cases, while gender may be in others.
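To make the weighting idea concrete, here is a minimal sketch in Python. It is not this package's API. It assumes the clustering compares points by Euclidean distance, which I have not verified for this package; under that assumption, multiplying a factor by the square root of its weight makes that factor count `weight` times as much in the squared distance. The names `apply_weights` and `squared_distance` and the weight values are made up for illustration.

```python
import math


def apply_weights(points, weights):
    """Scale each factor by sqrt(weight) before clustering.

    Assumption: the clustering step uses (squared) Euclidean distance,
    so a factor scaled by sqrt(w) contributes w times as much.
    """
    if any(len(p) != len(weights) for p in points):
        raise ValueError("every point needs one value per weight")
    scale = [math.sqrt(w) for w in weights]
    return [[x * s for x, s in zip(p, scale)] for p in points]


def squared_distance(a, b):
    """Plain squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Users 1 and 2 from the sample data: [race, gender, age].
points = [[-10, 6, 1], [3, 2, 1]]

# Hypothetical weights: race counts four times as much as the others.
weights = [4.0, 1.0, 1.0]

weighted = apply_weights(points, weights)
unweighted_d2 = squared_distance(points[0], points[1])
weighted_d2 = squared_distance(weighted[0], weighted[1])
```

With these two users the unweighted squared distance is 169 + 16 + 0, and the weighted one is 4 * 169 + 16 + 0, so only the race term grows.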

Here is what the data looks like:

 user_id => [
    race,
    gender,
    age
 ]

The numerical representation for each possible value is what we store:

array:10 [
  1 => array:3 [
    0 => -10
    1 => 6
    2 => 1
  ]
  2 => array:3 [
    0 => 3
    1 => 2
    2 => 1
  ]
  3 => array:3 [
    0 => 2
    1 => 1
    2 => 5
  ]
  4 => array:3 [
    0 => 9
    1 => 3
    2 => 4
  ]
  5 => array:3 [
    0 => -12
    1 => 6
    2 => 0
  ]
  6 => array:3 [
    0 => -6
    1 => 7
    2 => 3
  ]
  7 => array:3 [
    0 => 7
    1 => 7
    2 => 5
  ]
  8 => array:3 [
    0 => 4
    1 => 4
    2 => 0
  ]
  9 => array:3 [
    0 => 5
    1 => 7
    2 => 1
  ]
  10 => array:3 [
    0 => -11
    1 => 3
    2 => 2
  ]
]

I'm also curious: after the clustering is performed, is there any way to retrieve the original key for each data point? I need this to know which users are in each cluster.
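To show what I mean by recovering the keys, here is a minimal Python sketch of the bookkeeping I have in mind. It is not this package's API. It assumes the clustering step can report one cluster label per input point, in input order; the `labels` list below is a hypothetical stand-in for that output, and the function names are made up.

```python
from collections import defaultdict


def split_keyed_data(data):
    """Split {user_id: point} into parallel lists, preserving order."""
    ids = list(data.keys())
    points = [data[uid] for uid in ids]
    return ids, points


def group_ids_by_cluster(ids, labels):
    """Map each cluster label to the user ids assigned to it."""
    if len(ids) != len(labels):
        raise ValueError("need exactly one label per id")
    clusters = defaultdict(list)
    for uid, label in zip(ids, labels):
        clusters[label].append(uid)
    return dict(clusters)


# A few users from the sample data: user_id => [race, gender, age].
data = {
    1: [-10, 6, 1],
    2: [3, 2, 1],
    3: [2, 1, 5],
    5: [-12, 6, 0],
}

ids, points = split_keyed_data(data)

# Hypothetical result of clustering `points` (one label per point,
# same order as the input); not produced by any real library here.
labels = [0, 1, 1, 0]

clusters = group_ids_by_cluster(ids, labels)
```

Keeping the ids in a list parallel to the points avoids looking users up by their coordinates, which would break as soon as two users share the same values.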

If this is not the appropriate channel for this type of question, or beyond the scope of the repo, please let me know. I certainly appreciate any feedback you may have. Thank you :)
