Capstone Project 1
- A really good tutorial on how to build a logistic regression model: https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8
- https://towardsdatascience.com/random-forest-in-python-24d0893d51c0
- https://www.datacamp.com/community/tutorials/random-forests-classifier-python
- https://stackabuse.com/random-forest-algorithm-with-python-and-scikit-learn/
- https://machinelearningmastery.com/implement-random-forest-scratch-python/
- https://en.wikipedia.org/wiki/Random_forest
- https://nycdatascience.com/blog/meetup/featured-talk-1-kaggle-data-scientist-owen-zhang/
- https://datascienceplus.com/extreme-gradient-boosting-with-python/
- https://medium.com/mlreview/gradient-boosting-from-scratch-1e317ae4587d
- http://benalexkeen.com/gradient-boosting-in-python-using-scikit-learn/
- https://machinelearningmastery.com/gentle-introduction-gradient-boosting-algorithm-machine-learning/
- https://robots.thoughtbot.com/analyzing-minards-visualization-of-napoleons-1812-march
- https://blog.ouseful.info/2017/11/28/quick-round-up-visualising-flows-using-network-and-sankey-diagrams-in-python-and-r/
- https://plotlyblog.tumblr.com/post/120532468127/how-to-analyze-data-seven-modern-remakes-of-the
- https://stackoverflow.com/questions/28651079/pandas-unstack-problems-valueerror-index-contains-duplicate-entries-cannot-re
- http://www.datasciencemadesimple.com/reshape-long-wide-pandas-python-pivot-function/
- https://www.datacamp.com/community/tutorials/pandas-multi-index
- https://stackoverflow.com/questions/13295735/how-can-i-replace-all-the-nan-values-with-zeros-in-a-column-of-a-pandas-datafra
- https://stackoverflow.com/questions/20110170/turn-pandas-multi-index-into-column
- https://stackoverflow.com/questions/36537945/reshape-wide-to-long-in-pandas
- https://stackoverflow.com/questions/45352909/pandas-indexingerror-unalignable-boolean-series-provided-as-indexer
- https://stackoverflow.com/questions/42477572/sort-values-method-in-pandas
- https://stackoverflow.com/questions/16958499/sort-pandas-dataframe-and-print-highest-n-values
- http://pandas.pydata.org/pandas-docs/version/0.17/generated/pandas.DataFrame.sort.html
- https://stackoverflow.com/questions/19523277/renaming-column-names-in-pandas-groupby-function
- https://stackoverflow.com/questions/47138271/how-to-create-a-stacked-bar-chart-for-my-dataframe-using-seaborn
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html
- How to Add a New Column Using a Dictionary in a Pandas DataFrame (Pandas Tutorial)
- https://stackoverflow.com/questions/13445241/replacing-blank-values-white-space-with-nan-in-pandas
- https://stackoverflow.com/questions/37840812/pandas-subtracting-two-date-columns-and-the-result-being-an-integer
- https://www.dataquest.io/blog/regular-expressions-data-scientists/
- https://python-graph-gallery.com/all-charts/
- https://stats.stackexchange.com/questions/95797/how-to-split-the-dataset-for-cross-validation-learning-curve-and-final-evaluat
- https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas/46165056#46165056
- https://stackoverflow.com/questions/14745022/how-to-split-a-column-into-two-columns
- https://stackoverflow.com/questions/13996302/python-rolling-functions-for-groupby-object
- https://stackoverflow.com/questions/13872533/plot-different-dataframes-in-the-same-figure
- https://stackoverflow.com/questions/51711306/filter-group-by-and-count-in-pandas
- https://pandas.pydata.org/pandas-docs/stable/reshaping.html
- https://stackoverflow.com/questions/26646191/pandas-groupby-month-and-year
- https://stackoverflow.com/questions/23891575/how-to-merge-two-dataframes-side-by-side
- https://seaborn.pydata.org/examples/wide_data_lineplot.html
- For figuring out how to optimize a logistic regression model: https://towardsdatascience.com/logistic-regression-model-tuning-with-scikit-learn-part-1-425142e01af5
- For explaining LogisticRegression vs LogisticRegressionCV: https://stackoverflow.com/questions/46507606/what-does-the-cv-stand-for-in-sklearn-linear-model-logisticregressioncv
- Hyperparameter tuning on random forests: https://towardsdatascience.com/hyperparameter-tuning-the-random-forest-in-python-using-scikit-learn-28d2aa77dd74
- XGBoost hyperparameter tuning: https://www.kaggle.com/tilii7/hyperparameter-grid-search-with-xgboost
- https://stackoverflow.com/questions/23199796/detect-and-exclude-outliers-in-pandas-data-frame
- Compare means between two different groups:
- Compare categorical distributions:
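
The last two items name comparisons without pointing to a method. As a rough sketch only, one common choice for comparing means is Welch's t statistic, and one common choice for comparing categorical distributions is Pearson's chi-square statistic. Both choices are assumptions, not necessarily what this project used. The function names `welch_t` and `chi_square` and the sample data are hypothetical and made up for illustration. Only the Python standard library is used, so this computes the test statistics but not p-values.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    # statistics.variance is the sample variance (n - 1 in the denominator).
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def chi_square(labels_a, labels_b):
    """Pearson chi-square statistic for a 2 x k table of category counts."""
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    n_a, n_b = len(labels_a), len(labels_b)
    total = n_a + n_b
    stat = 0.0
    for category in set(counts_a) | set(counts_b):
        column_total = counts_a[category] + counts_b[category]
        for observed, row_total in ((counts_a[category], n_a), (counts_b[category], n_b)):
            # Expected count if both groups shared the same distribution.
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up illustrative data, not taken from the project.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 2.0, 3.0]
t_stat = welch_t(group_a, group_b)

same_mix = chi_square(["x", "x", "y", "y"], ["x", "x", "y", "y"])
different_mix = chi_square(["x"] * 10, ["y"] * 10)
```

A larger absolute `t_stat` or a larger chi-square value suggests a bigger difference between the groups. Turning either statistic into a p-value needs the matching reference distribution, which a statistics library such as SciPy provides.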

