@bridgetmcg bridgetmcg commented Oct 3, 2025

This PR adds support for parsing and chunking source code files in Docling, so that programming-language files can be processed as structured documents and split into function-level chunks. The tests depend on docling-core PR 398 and will not pass until it is merged.

Features

  • Multi-language support: Python (.py), JavaScript (.js), and Java (.java) files
  • Unified code representation using Docling's document model
  • Integration with existing DocumentConverter
  • Pipeline configuration using SimplePipeline for code processing
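
For illustration only, the idea of function-level chunking can be sketched with Python's standard `ast` module. This is not the implementation in this PR and does not use Docling's API; `CodeChunk` and `chunk_python_functions` are hypothetical names, and the sketch handles Python source only.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeChunk:
    """Hypothetical container for one function-level chunk (not a Docling type)."""
    name: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    text: str


def chunk_python_functions(source: str) -> list[CodeChunk]:
    """Return one chunk per function or method definition found in `source`."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Slice the original lines covered by the definition.
            # Nested functions also appear as their own (overlapping) chunks.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(CodeChunk(node.name, node.lineno, node.end_lineno, text))
    # ast.walk does not guarantee source order, so sort by position.
    chunks.sort(key=lambda c: c.start_line)
    return chunks


if __name__ == "__main__":
    sample = "import os\n\ndef a():\n    return 1\n\nclass B:\n    def m(self):\n        return 2\n"
    for chunk in chunk_python_functions(sample):
        print(chunk.name, chunk.start_line, chunk.end_line)
```

A chunker covering JavaScript and Java as well would need a parser for each language; the sketch above only shows the general shape of splitting a file at function boundaries while keeping line ranges.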

Testing

  • Basic Java function chunking functionality

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

github-actions bot commented Oct 3, 2025

DCO Check Passed

Thanks @bridgetmcg, all your commits are properly signed off. 🎉

dosubot bot commented Oct 3, 2025

Related Documentation

Checked 2 published document(s). No updates required.

I, Bridget McGinn <bridget.mcginn@ibm.com>, hereby add my Signed-off-by to this commit: 7235688

Signed-off-by: Bridget McGinn <bridget.mcginn@ibm.com>

mergify bot commented Oct 3, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewers for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
