Commit b2cb0e7

Undocumented HtmlFeatureExtractor post-processing step is removed to simplify code and make it faster.
If needed, it can be implemented as a global feature.
1 parent ed40e3e commit b2cb0e7

1 file changed: +1 -2 lines changed

webstruct/feature_extraction.py

Lines changed: 1 addition & 2 deletions
@@ -415,8 +415,7 @@ def transform_single(self, html_tokens):
         for feat in self.global_features:
             feat(token_data)
 
-        return [{k: fd[k] for k in fd if not k.startswith('_')}
-                for tok, fd in token_data]
+        return [featdict for tok, featdict in token_data]
 
     def _pruned(self, X, low=None):
         if low is None or low <= 1:
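
The commit message notes that the removed post-processing can be reimplemented as a global feature. A minimal sketch of what that might look like follows. It assumes, based only on the diff above, that a global feature is a callable receiving the list of (token, feature dict) pairs and that it must modify the dicts in place, since transform_single now returns them unchanged. The name drop_private_features is hypothetical and not part of webstruct.

```python
def drop_private_features(token_data):
    """Hypothetical global feature: delete keys that start with '_'.

    Assumes token_data is a list of (token, featdict) pairs, as in the
    diff above. The dicts are mutated in place because transform_single
    returns the same dict objects.
    """
    for _tok, featdict in token_data:
        # Collect the keys first so the dict is not resized while iterating.
        private_keys = [k for k in featdict if k.startswith('_')]
        for key in private_keys:
            del featdict[key]


# Stand-in data; real tokens would be HtmlToken objects.
token_data = [
    ("hello", {"lower": "hello", "_position": 0}),
    ("world", {"lower": "world", "_tmp": True}),
]
drop_private_features(token_data)
features = [featdict for _tok, featdict in token_data]
```

If other global features read the underscore-prefixed keys, a callable like this would presumably need to come last in the global feature list so that it runs after them.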
