Open
64 commits
c6c64bd
Bump path-to-regexp and express in /website (#298)
dependabot[bot] Sep 23, 2024
e060a44
Bump nltk from 3.8.1 to 3.9 (#297)
dependabot[bot] Sep 23, 2024
6a05e80
Bump body-parser and express in /website (#296)
dependabot[bot] Sep 23, 2024
1c72695
Check embedding update (#295)
xehu Sep 23, 2024
607548a
Merge branch 'main' into dev
xehu Sep 23, 2024
650197e
Update README.md to remove col = "message"
xehu Sep 23, 2024
36cd76e
Closes #302.
xehu Sep 27, 2024
21987f3
Amy/website (#301)
amytangzheng Oct 7, 2024
119efe4
Update github-actions-website.yaml (#309)
xehu Oct 7, 2024
6d25efd
Update github-actions-feature_dict.yaml (#308)
xehu Oct 7, 2024
7e87679
Package updates in Amy/website (#310)
xehu Oct 7, 2024
5678567
Update package-lock.json to local version
xehu Oct 7, 2024
d75837f
Update package-lock.json
xehu Oct 7, 2024
143cb77
Update package.json
xehu Oct 7, 2024
28f85f7
Update package-lock.json
xehu Oct 7, 2024
8b8bd24
Fix "@babel/plugin-proposal-private-property-in-object" error (#311)
xehu Oct 7, 2024
89cd16b
upgrade node packages
xehu Oct 7, 2024
d04037d
update team page + try to remove some of the deprecated packages
xehu Oct 7, 2024
bdf7035
Revert "update team page + try to remove some of the deprecated packa…
xehu Oct 7, 2024
ec2ed64
revert attempts to upgrade packages
xehu Oct 7, 2024
d83f854
Denormalize liwc (#312)
xehu Oct 7, 2024
7905240
address https://github.com/Watts-Lab/team_comm_tools/issues/300 (#313)
xehu Oct 7, 2024
bf762d0
Address issues with making feature names more clear; have cleaner def…
xehu Oct 8, 2024
1dad080
small fix to ensure filtered_dict does not generate in every run
xehu Oct 8, 2024
ed17d7a
merge in main + bump dev's version up for next time
xehu Oct 8, 2024
6b94149
PATCH FIX: Defaults in 0.1.4 were incorrectly specified
xehu Oct 8, 2024
fd50f83
Merge pull request #320 from Watts-Lab/temp-dev
xehu Oct 16, 2024
d15c105
Massive PR (#321)
agshruti12 Oct 22, 2024
db1b9ef
Merge branch 'dev' of https://github.com/Watts-Lab/team_comm_tools in…
amytangzheng Nov 6, 2024
8ebc726
Fix SBERT issue for empty vectors (#327)
xehu Nov 21, 2024
1988102
Fix issues in which featurizer fails when the text is a number (#331)
xehu Dec 1, 2024
c7b27df
Positivity zscore tests (#324)
agshruti12 Dec 2, 2024
cbf04fa
Valence punctuation + unit tests (#325)
agshruti12 Dec 2, 2024
32cb55c
Shruti/time diff tests (#328)
agshruti12 Dec 2, 2024
aa7b5f4
Amy/package v2 (#326)
amytangzheng Dec 3, 2024
f04a1eb
Bring your own LIWC & matplotlib dependency fix (#322)
sundy1994 Dec 3, 2024
75b46c1
update website (#338)
xehu Dec 6, 2024
35208a0
Update Team.js
xehu Dec 6, 2024
827ca02
Merge branch 'main' into dev
xehu Dec 6, 2024
1c5ed73
Merge branch 'dev' of https://github.com/Watts-Lab/team_comm_tools in…
amytangzheng Dec 11, 2024
24e81d5
Bump path-to-regexp and express in /website (#298)
dependabot[bot] Sep 23, 2024
7fd47cb
Bump nltk from 3.8.1 to 3.9 (#297)
dependabot[bot] Sep 23, 2024
8888709
Bump body-parser and express in /website (#296)
dependabot[bot] Sep 23, 2024
710066a
Check embedding update (#295)
xehu Sep 23, 2024
90774b8
Amy/website (#301)
amytangzheng Oct 7, 2024
68813f8
Package updates in Amy/website (#310)
xehu Oct 7, 2024
c090cfe
update team page + try to remove some of the deprecated packages
xehu Oct 7, 2024
4fe3506
Revert "update team page + try to remove some of the deprecated packa…
xehu Oct 7, 2024
81784fe
revert attempts to upgrade packages
xehu Oct 7, 2024
00f3f67
Denormalize liwc (#312)
xehu Oct 7, 2024
bea3008
address https://github.com/Watts-Lab/team_comm_tools/issues/300 (#313)
xehu Oct 7, 2024
1996a7f
Address issues with making feature names more clear; have cleaner def…
xehu Oct 8, 2024
3ef556f
PATCH FIX: Defaults in 0.1.4 were incorrectly specified
xehu Oct 8, 2024
2e8df5e
allow users customize vectors
amytangzheng Nov 6, 2024
d81761b
testing turns
amytangzheng Nov 10, 2024
6fe3c61
fix SBERT handling of NA and batching
xehu Nov 10, 2024
73f72bc
updates to testing vectors
amytangzheng Dec 11, 2024
45c10dd
updates to vector
amytangzheng Dec 18, 2024
0c89fa9
Merge branch 'dev' into amy/vector
amytangzheng Dec 18, 2024
57bfe88
vector summarization updates
amytangzheng Dec 18, 2024
5b2b9f0
updates to message_original error
amytangzheng Dec 19, 2024
d8291db
updates to message_original error
amytangzheng Dec 19, 2024
d8377dc
updates to fix message_original error
amytangzheng Dec 19, 2024
ed718b2
updates to fix message_original error
amytangzheng Dec 19, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -41,6 +41,7 @@ src/team_comm_tools/ipython_notebooks/.ipynb_checkpoints/
tests/ipython_notebooks/.ipynb_checkpoints/
tests/data/vector_data/
tests/test.log
tests/helper.ipynb
tests/output/*
tests/vector_data/*
src/utils/__pycache__/
Binary file modified docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/build/doctrees/examples.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/feature_builder.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/lexical_features_v2.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/readability.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/temporal_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/features/word_mimicry.doctree
Binary file not shown.
Binary file not shown.
Binary file modified docs/build/doctrees/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/calculate_chat_level_features.doctree
Binary file not shown.
Binary file not shown.
Binary file modified docs/build/doctrees/utils/calculate_user_level_features.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/check_embeddings.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/preprocess.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/utils/summarize_features.doctree
Binary file not shown.
151 changes: 134 additions & 17 deletions docs/build/html/_sources/examples.rst.txt

Large diffs are not rendered by default.

15 changes: 12 additions & 3 deletions docs/build/html/_sources/features/index.rst.txt
@@ -32,7 +32,11 @@ Utterance-Level features are calculated *first* in the Toolkit, as many conversa

Conversation-Level Features
****************************
Once utterance-level features are computed, we compute conversation-level features; some of these features represent an aggregation of utterance-level information (for example, the "average level of positivity" in a conversation is simply the mean positivity score for each utterance). Other conversation-level features are constructs that are defined only at the conversation-level, such as the level of "burstiness" in a team's communication patterns.

Base Conversation-Level Features
+++++++++++++++++++++++++++++++++++

The following features are constructs that are defined only at the conversation-level, such as the level of "burstiness" in a team's communication patterns. We call these the "base" conversation-level features, and they can be accessed using a property of the ``FeatureBuilder`` object: ``FeatureBuilder.conv_features_base``.

.. toctree::
:maxdepth: 1
@@ -46,12 +50,17 @@ Once utterance-level features are computed, we compute conversation-level featur
within_person_discursive_range
turn_taking_features

Conversation-Level Aggregates
+++++++++++++++++++++++++++++++++++
Once utterance-level features are computed, we compute conversation-level features; some of these features represent an aggregation of utterance-level information (for example, the "average level of positivity" in a conversation is simply the mean positivity score for each utterance).

By default, all numeric attributes generated at the utterance (chat) level are aggregated using the functions ``mean``, ``max``, ``min``, and ``stdev``. However, this behavior can be customized, with details in the Worked Example (see :ref:`custom_aggregation`).
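
As a rough illustration of the default aggregation described above, the following standalone sketch summarizes one utterance-level column per conversation with ``mean``, ``max``, ``min``, and ``stdev``. It does not use the toolkit itself: the helper ``aggregate_by_conversation``, the column names, and the sample values are all hypothetical, and the toolkit's own implementation may differ (for example, in how it handles conversations with a single utterance).

```python
from collections import defaultdict
from statistics import mean, stdev


def aggregate_by_conversation(rows, key, column):
    """Summarize one numeric utterance-level column per conversation.

    Hypothetical helper for illustration only; not part of the toolkit's API.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row[column])

    summary = {}
    for conv_id, values in grouped.items():
        summary[conv_id] = {
            "mean": mean(values),
            "max": max(values),
            "min": min(values),
            # Sample standard deviation needs two or more values;
            # returning 0.0 otherwise is a choice made for this sketch.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Made-up utterance-level scores; the column names are assumptions.
chats = [
    {"conversation_num": "A", "positivity": 0.2},
    {"conversation_num": "A", "positivity": 0.6},
    {"conversation_num": "A", "positivity": 0.4},
    {"conversation_num": "B", "positivity": 0.9},
]

conv_summary = aggregate_by_conversation(chats, "conversation_num", "positivity")
```

Each conversation ends up with one row of summary statistics per utterance-level attribute, which is the shape of the conversation-level aggregates described in this section.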

Speaker- (User) Level Features
*********************************
User-level features generally represent an aggregation of features at the utterance level (for example, the average number of words spoken *by a particular user*). There is therefore limited speaker-level feature documentation, other than a function used to compute the "network" of other speakers that an individual interacts with in a conversation.

You may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information.

You may reference the :ref:`Speaker (User)-Level Features Page <user_level_features>` for more information, as well as the details in the Worked Example (see :ref:`custom_aggregation`).

.. toctree::
:maxdepth: 1
@@ -13,10 +13,16 @@ Citation

Implementation Basics
**********************
To compute the feature, we count the number of shared content words (defined as anything that is not on the function word list) between the current and previous utterance in a conversation, then normalize it by the frequency of the word across all inputs in the dataset. This follows the original authors' method:
To compute the feature, we count the number of shared content words (defined as anything that is not on the function word list) between the current and previous utterance in a conversation, normalized by the frequency at which the word appears. This follows the original authors' method:

Content words are defined as any word that is not a function word. For each content word w in a given speaker’s turn, if w also occurs in the immediately preceding turn of the other, we count w as an accommodated content word. The raw count of accommodated content words is be the total number of these accommodated content words over every turn in the conversation side. Because content words vary widely in frequency, we normalized our counts by the frequency of each word.

For completeness, we interpret "the frequency of each word" in two distinct ways:

1. **The frequency of each word across the entire dataset (`content_word_accommodation`)**: here, we normalize non-function words with respect to the language used across all conversations in the dataset. This version of accommodation is useful if the entire dataset consists of similar conversations, or conversations about the same topic. Normalizing with respect to a larger dataset gives better estimates for identifying (and appropriately weighting) which words carry meaningful content in a particular domain.

2. **The frequency of each word within a given conversation (`content_word_accommodation_per_conv`)**: here, we normalize non-function words with respect only to the language in a given conversation. This version of accommodation is useful if the dataset consists of very distinct conversations, for which it may not make sense to assume that the distribution of which words are "important" will hold across different domains.
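
The difference between the two interpretations above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the toolkit's implementation: the function-word set is a tiny made-up subset of the real reference list, tokenization is a plain whitespace split, and "normalize by frequency" is read here as adding ``1 / count`` for each shared content word. All function and variable names are hypothetical.

```python
from collections import Counter

# Tiny illustrative subset; the real feature uses the full function word list.
FUNCTION_WORDS = {"the", "a", "is", "it", "to", "and", "of", "i", "we"}


def content_words(text):
    """Return lower-cased tokens that are not function words (naive split)."""
    return [w for w in text.lower().split() if w not in FUNCTION_WORDS]


def accommodation(current, previous, freq):
    """Sum 1/frequency over content words shared with the previous utterance.

    Assumption: normalization divides by a raw word count taken from `freq`.
    """
    previous_words = set(content_words(previous))
    return sum(1.0 / freq[w] for w in content_words(current) if w in previous_words)


# Made-up conversations, each a list of utterances in order.
conversations = {
    "c1": ["we need the budget report", "budget report is late"],
    "c2": ["the budget looks fine", "fine"],
}

# Interpretation 1: word frequencies counted across the entire dataset.
dataset_freq = Counter(
    w for msgs in conversations.values() for m in msgs for w in content_words(m)
)

# Interpretation 2: word frequencies counted within each conversation.
conv_freq = {
    cid: Counter(w for m in msgs for w in content_words(m))
    for cid, msgs in conversations.items()
}

prev_msg, cur_msg = conversations["c1"]
score_dataset = accommodation(cur_msg, prev_msg, dataset_freq)
score_per_conv = accommodation(cur_msg, prev_msg, conv_freq["c1"])
```

In this toy data, "budget" also appears in the second conversation, so it is down-weighted more under the dataset-wide count than under the per-conversation count; the two scores for the same pair of utterances therefore differ.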

The feature requires a reference list of function words, which are defined by the original authors as follows.

**Auxiliary and copular verbs**
4 changes: 2 additions & 2 deletions docs/build/html/_sources/index.rst.txt
@@ -150,10 +150,10 @@ Use the Table of Contents below to learn more about our tool. We recommend that

intro
basics
feature_builder
examples
features/index
features_conceptual/index
examples
feature_builder
utils/index

Indices and Tables