
Commit 1717289

Merge pull request #24 from alteryx/woodwork_fix
Woodwork update
2 parents 24020e9 + 7d89f4c commit 1717289


1 file changed: +21 -13 lines changed

predict-credit-churn/CreditChurn.ipynb

Lines changed: 21 additions & 13 deletions
@@ -94,18 +94,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"First let's use some built in functions from EvalML to convert the data to a woodwork data structure and then cast its dtypes to something we'd rather work with. Then we're going to take a look at some of the unqiue, non-numeric values in the features. Sure enough, `Education_Level`, `Marital_Status`, and `Income_Category` have `Unknown` as a value. This is something we'll have to remember before we get to the model training, since `Unknown` isn't an acceptable value for any of the features."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from evalml.utils.gen_utils import _convert_to_woodwork_structure, _convert_woodwork_types_wrapper\n",
-"data = _convert_to_woodwork_structure(data)\n",
-"data = _convert_woodwork_types_wrapper(data.to_dataframe())"
+"We're going to take a look at some of the unqiue, non-numeric values in the features. Sure enough, `Education_Level`, `Marital_Status`, and `Income_Category` have `Unknown` as a value. This is something we'll have to remember before we get to the model training, since `Unknown` isn't an acceptable value for any of the features."
 ]
 },
 {
@@ -183,7 +172,7 @@
 "outputs": [],
 "source": [
 "X = data.copy()\n",
-"data = data.drop(['Credit_Limit'], axis=1)\n",
+"X = X.drop(['Credit_Limit'], axis=1)\n",
 "y = X.pop('Attrition_Flag')\n",
 "\n",
 "X['Income_Category'] = X['Income_Category'].replace({'Less than $40K':0,\n",
@@ -230,6 +219,25 @@
 "X = preprocessing(X, y)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Using `infer_feature_types`, we can convert our dataset into a [Woodwork](https://github.com/alteryx/woodwork) data structure, and even [specify what types](https://evalml.alteryx.com/en/stable/user_guide/automl.html) certain features should be. For example, we want to cast `Income_Category` as a categorical type, rather than natural language which is what it was inferred as."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from evalml.utils.gen_utils import infer_feature_types\n",
+"X = infer_feature_types(X, feature_types={'Income_Category': 'categorical',\n",
+" 'Education_Level': 'categorical'})\n",
+"X"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
0 commit comments

Comments
 (0)