Conversational Dynamics Similarity (ConDynS) and ConvoKit GenAI Tool #288

seanzhangkx8 · 2025-07-31T22:20:11Z

Conversational Dynamics Similarity

This PR adds ConvoSimilarity, a new module implementing the framework from the paper "A Similarity Measure for Comparing Conversational Dynamics". It provides tools for comparing conversations based on their interaction patterns rather than individual utterances.

Key Features

SCDWriter for generating Summaries of Conversation Dynamics (SCD) and extracting Sequences of Patterns (SoP) via LLMs
ConDynS and NaiveConDynS measures for similarity computation
Baseline methods and utilities for preprocessing, evaluation, and visualization

Example Notebooks

GenAI Module

This PR also introduces GenAI, a unified interface for integrating LLMs into ConvoKit workflows for conversational analysis.

Key Features

Abstract LLMClient base class with implementations for OpenAI, Gemini, and a template for local models
Centralized configuration via GenAIConfigManager for API keys and settings
Factory method for flexible and consistent client instantiation
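
To make the design above concrete, here is a minimal sketch of an abstract client, a central key store, and a factory. It is not the module's actual code: `ConfigManager`, `EchoClient`, `get_client`, and the `generate` signature are all hypothetical names chosen for illustration, and the toy client makes no network calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ConfigManager:
    """Hypothetical stand-in for a central store of API keys."""

    api_keys: Dict[str, str] = field(default_factory=dict)

    def set_api_key(self, provider: str, key: str) -> None:
        self.api_keys[provider] = key

    def get_api_key(self, provider: str) -> str:
        if provider not in self.api_keys:
            raise KeyError(f"no API key configured for provider {provider!r}")
        return self.api_keys[provider]


class LLMClient(ABC):
    """Abstract base: every provider exposes the same generate() call."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        ...


class EchoClient(LLMClient):
    """Toy provider (an assumption for this sketch); it echoes the prompt."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Registry mapping provider names to client constructors.
_PROVIDERS: Dict[str, Callable[[str], LLMClient]] = {"echo": EchoClient}


def get_client(provider: str, config: ConfigManager) -> LLMClient:
    """Factory: look up the provider and build its client from shared config."""
    try:
        constructor = _PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider {provider!r}") from None
    return constructor(config.get_api_key(provider))


config = ConfigManager()
config.set_api_key("echo", "not-a-real-key")
client = get_client("echo", config)
reply = client.generate("Summarize this conversation.")
```

With this shape, calling code depends only on `LLMClient.generate`, so swapping providers means changing the name passed to the factory rather than the code that uses the client.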

Example Notebook

GenAI demo with GPT

cristiandnm · 2025-09-22T12:53:48Z

@seanzhangkx8 can you add the option to use SCDs that are already stored in the metadata? Where the input is currently the function that gets the SCD, it should also accept a string; if a string is passed, it is treated as the name of the metadata field containing the SCD to use.
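
One way to read this request is the sketch below. It is an illustration only: `get_scd` is a hypothetical helper, and a conversation is modeled as a plain dict with a "meta" entry rather than as ConvoKit's own Conversation class.

```python
from typing import Callable, Dict, Union

# Stand-in for a conversation object: a dict holding a "meta" dict.
Conversation = Dict[str, dict]
ScdSource = Union[str, Callable[[Conversation], str]]


def get_scd(convo: Conversation, source: ScdSource) -> str:
    """Return the SCD for a conversation.

    If `source` is a string, treat it as the name of the metadata field
    that already holds the SCD; otherwise call it to produce the SCD.
    """
    if isinstance(source, str):
        meta = convo.get("meta", {})
        if source not in meta:
            raise KeyError(f"metadata field {source!r} not found on conversation")
        return meta[source]
    if callable(source):
        return source(convo)
    raise TypeError("source must be a metadata field name or a callable")


convo = {"meta": {"scd": "Speaker1 opens politely; Speaker2 grows dismissive."}}

from_metadata = get_scd(convo, "scd")  # reuse the stored SCD
from_function = get_scd(convo, lambda c: "generated summary")  # compute on demand
```

Checking for a string first keeps existing callers that pass a function working unchanged, while precomputed SCDs avoid a second LLM call.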

vianxnguyen

Hi @seanzhangkx8, really great contributions to ConvoKit, thanks for putting this out! Just checking if there are any changes still pending on the PR? (It's currently marked as a draft.) I added some comments based on the current version below:

There are merge conflicts in convokit/__init__.py and docs/source/analysis.rst that need to be resolved before merging. The branch also looks a bit out of date, so it may be worth rebasing onto the latest master.
For documentation structure, all the SCD and ConDynS transformers are currently documented under ConDynS.rst. Since SCD could be useful independently of ConDynS, it might make sense to split this into two separate pages: one for SCD and another for ConDynS that references the SCD page.
For naive baselines, I think they are really helpful for initial experimentation, but has there been any discussion on whether we want to include them in ConvoKit from a practical perspective?
For GenAI output handling, it could be useful to allow users to specify a custom function for converting raw LLM text output into a structured format (e.g., JSON, list), depending on their downstream use case.
For local GenAI integration, the local client seems to be more of a placeholder/mock template right now; including a simple working example of integrating a local model could be really helpful.
For GenAI documentation, it could be helpful to link to specific setup instructions from the providers (OpenAI/Gemini).
Currently it seems like LLMPromptTransformer operates on a single "unit" (i.e., one level of the corpus: conversation, speaker, or utterance). I wonder if it would be helpful to have an option for different or multiple units in the same prompt (e.g., if you want to prompt on an utterance with respect to its conversation) or multiple subunits of the same "unit" (e.g., you want to prompt on different parts of the same conversation).
For GenAI error retry logic, it seems like an exception is raised when retries are exhausted. If users are running this in a loop over some unit, it might be good to have a marker or some way to indicate where they left off, so that when they rerun transform they can resume from the point where the retries ran out.

Let me know if anything is still pending, happy to take another look if that would be helpful. Thanks!
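
For the retry point above, one possible shape for a "where they left off" marker is sketched here. It is a variation on the suggestion rather than the PR's behavior: instead of stopping at the first exhausted retry, it records failed unit ids and treats the results dict itself as the marker, so a rerun skips finished units. All names (`annotate`, `call_with_retries`, `RetriesExhausted`, `flaky_llm`) are hypothetical, and `RuntimeError` stands in for a transient API error.

```python
from typing import Callable, Dict, List


class RetriesExhausted(Exception):
    """Raised when a unit still fails after the allowed number of attempts."""


def call_with_retries(fn: Callable[[str], str], text: str, max_retries: int) -> str:
    last_error = None
    for _ in range(max_retries):
        try:
            return fn(text)
        except RuntimeError as err:  # stand-in for a transient API error
            last_error = err
    raise RetriesExhausted(str(last_error))


def annotate(units: Dict[str, str], results: Dict[str, str],
             fn: Callable[[str], str], max_retries: int = 2) -> List[str]:
    """Fill `results` for units not yet done; return ids that still failed."""
    failed = []
    for unit_id, text in units.items():
        if unit_id in results:  # already processed in an earlier run
            continue
        try:
            results[unit_id] = call_with_retries(fn, text, max_retries)
        except RetriesExhausted:
            failed.append(unit_id)
    return failed


# Simulated provider that fails on one input while an outage is active.
outage = {"active": True}


def flaky_llm(text: str) -> str:
    if outage["active"] and text == "second":
        raise RuntimeError("rate limited")
    return text.upper()


units = {"u1": "first", "u2": "second", "u3": "third"}
results: Dict[str, str] = {}

first_run_failed = annotate(units, results, flaky_llm)   # u2 fails
outage["active"] = False
second_run_failed = annotate(units, results, flaky_llm)  # only u2 is retried
```

In a corpus setting, the analogue of `results` would be the metadata field the transformer writes to: units that already have the field are skipped on rerun.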

seanzhangkx8 · 2025-10-22T22:03:13Z

Hi Vivian! Thanks for the detailed review.

I resolved the conflicts.
I agree; SCD now has its own documentation page.
I will keep the naive baselines in for now for reproducibility, because I have a notebook comparing ConDynS against the baselines. We can remove them if you think they are unnecessary.
I totally agree with that added feature! But I think this PR is big enough, so I will leave the improvements to later development. Here we focus on introducing the modules, and later we can build cool stuff on top of them.
Same as above.
Nice! Added links.
Same as mentioned above: I think it would be a nice feature, but it may be a little tricky to develop for multiple levels. We can discuss this later.

Thanks for your review and let me know if you have any other concerns we should address!
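
For reference when the deferred output-handling feature is picked up, a custom parsing hook might look roughly like the sketch below. This is an assumption about a possible design, not code from the PR: `run_prompt`, `parse_bulleted_list`, and `fake_llm` are hypothetical, and the LLM is replaced by a canned function so the example runs offline.

```python
import json
from typing import Any, Callable, Optional


def run_prompt(llm: Callable[[str], str], prompt: str,
               parser: Optional[Callable[[str], Any]] = None) -> Any:
    """Call the model; apply the user-supplied parser to the raw text if given."""
    raw = llm(prompt)
    return raw if parser is None else parser(raw)


def parse_bulleted_list(raw: str) -> list:
    """Turn a dash- or star-bulleted block into a list of items."""
    return [line.lstrip("-* ").strip() for line in raw.splitlines() if line.strip()]


def fake_llm(prompt: str) -> str:
    """Canned responses standing in for a real model call."""
    if "JSON" in prompt:
        return '{"tone": "tense", "turns": 4}'
    return "- greeting\n- disagreement\n- apology"


as_text = run_prompt(fake_llm, "List the patterns.")
as_list = run_prompt(fake_llm, "List the patterns.", parser=parse_bulleted_list)
as_json = run_prompt(fake_llm, "Reply in JSON.", parser=json.loads)
```

Defaulting `parser` to `None` keeps the current raw-text behavior for callers that do not need structured output.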

vianxnguyen

Hi Sean, thanks for addressing the comments, looks great! Feel free to note if there are any suggested improvements that you would like to defer. Also just approved the PR!

seanzhangkx8 · 2025-10-25T15:35:57Z

thanks Vivian! incrementing version number and will merge now.

seanzhangkx8 added 30 commits July 22, 2025 12:20

Add support for calling genAI clients via API (openai, gemini)

74b9f01

modify genai for consistent naming, add condyns computation and baseline methods, as well as validation example notebooks

220872e

reorganize and add description to validation notebooks

32d2873

add applications, allow user to set custom prompt for scd/sop and condyns computation

9d27453

finish applications

a02692f

remove unused modules

cd44b55

add init file and redesign parameters for genai config

30ec9d7

add documentations

48e9dd4

add documentations to modules

2ed1da7

add conditional import if package for genai not available

36f7315

Merge branch 'CornellNLP:master' into convo-sim-condyns

ce999d1

black formatter

5b017ed

fix setup.py

0d45625

black format setup.py

69443ee

re black format setup.py

c4d72be

add unified SCD Transformer

776cad1

add unified transformer for LLM output writing to metadata

9d9b7a5

black formater

abfe26b

implement lazy loading

faa3202

black formatter

8d14595

add wiki german and friends demos

23a683c

Merge branch 'master' into convo-sim-condyns

d58acc0

Merge branch 'master' into fix-llm-option

ef02ae9

fix merge issue

01609fd

update requirements-dev.txt

a6a9d9a

try disable some tests, may fail

f4299f0

black formatter

7397a22

add examples

c67273c

merge in lazy loading changes

93fddec

add to init to load the modules

9a86d82

seanzhangkx8 and others added 7 commits September 18, 2025 21:49

black formatter

7e46317

fix bugs

cbf0c27

run with gpt

124b4f7

run with gemini

f252d8c

finish example

97d8986

align example config using

f06c60b

run genai example notebook

1eb61cf

seanzhangkx8 and others added 9 commits September 22, 2025 10:44

Update local changes

8fef26c

update 1001

3fd4b53

black formatter

d21be57

init

1fcd894

ran examples

abe9794

update

37a12f4

add doc

37f35c6

update doc

24fba6c

black formatter

8f07a45

cristiandnm requested a review from vianxnguyen October 2, 2025 15:58

change genai transfomer name and update error emssage

443e9d9

vianxnguyen reviewed Oct 22, 2025

seanzhangkx8 added 2 commits October 22, 2025 23:56

update

0c00ffd

Merge branch 'master' into convo-sim-condyns

5adcf8b

seanzhangkx8 marked this pull request as ready for review October 22, 2025 21:57

vianxnguyen approved these changes Oct 22, 2025

increment version number

ff80ffb

seanzhangkx8 merged commit 59a68a4 into CornellNLP:master Oct 25, 2025
4 checks passed

3 participants

Conversational Dynamics Similarity (ConDynS) and ConvoKit GenAI Tool #288

Conversational Dynamics Similarity (ConDynS) and ConvoKit GenAI Tool #288

Uh oh!

Conversation

seanzhangkx8 commented Jul 31, 2025