
Request to improve the documentation  #97

@kaurao


@kwinkunks Thanks a lot for creating a nice package.
Here are some more detailed comments that I think could be helpful.

  1. wasserstein could return a pandas DataFrame with appropriate index and column names.
    It would be helpful to see an example where the data are not identically distributed by construction, e.g. after applying different StandardScalers.
    The sentence "This shows us that the distributions of the PE log in well indices 6 and 7 are somewhat different and may be anomalous." should be revised to match the text (str) indices shown in the figure above.

  2. is_correlated: how does this work? Which correlation measure is used? How is the correlation value converted to a binary outcome? How are the chunks correlated?
    The sentence "That is, shuffling the data removes the correlation, but does not mean the records are independent." is unclear and confusing. How is a user supposed to make sense of the output of this function?

  3. feature_importances: it is unclear how the order of the output relates to the order of the input. Also, do lower or higher values mean higher importance?
    The API documentation says "In each case, the n normalized importances with the most variance are averaged.". This is unclear. A user would benefit from more precise information on what was done, for example how the scores were normalized. If something like Z-scoring was used, is that appropriate, given that it can make importance scores negative?
    Also, this way of combining importance scores is non-standard; at least, I have not seen it commonly used. References for the approach would be helpful.

  4. More generally, the documentation should provide more details on exactly what the functions are doing.
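
To make point 1 concrete, here is a minimal sketch of the kind of labelled output and non-identically-distributed example I have in mind. It uses SciPy's wasserstein_distance as a stand-in and is not the package's implementation; the helper name pairwise_wasserstein, the well labels, and the synthetic PE values are all hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import wasserstein_distance


def pairwise_wasserstein(values, groups):
    """Hypothetical helper: pairwise 1D Wasserstein distances between groups.

    Returns a square DataFrame whose index and columns are the group labels,
    so each distance can be read off by name rather than by position.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    labels = pd.unique(groups)  # keeps order of first appearance
    dist = pd.DataFrame(0.0, index=labels, columns=labels)
    for a in labels:
        for b in labels:
            dist.loc[a, b] = wasserstein_distance(
                values[groups == a], values[groups == b]
            )
    return dist


# Synthetic data: three wells drawn from the same distribution...
rng = np.random.default_rng(42)
wells = np.repeat(["WELL-A", "WELL-B", "WELL-C"], 500)
pe = rng.normal(loc=3.5, scale=0.5, size=wells.size)

# ...then one well is shifted, so it is different by construction.
pe[wells == "WELL-C"] += 2.0

distances = pairwise_wasserstein(pe, wells)
print(distances.round(2))
```

With string labels on both axes, the text describing anomalous wells can refer to the same names that appear in the output and in the figure.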

ping openjournals/joss-reviews#6065


Labels: bug
