[fix]: Fix NSDL CAS parser to correctly handle Mutual Fund Folios #115

brf153 · 2025-09-21T17:53:36Z

Problem

The parser was failing to handle Mutual Fund Folios sections due to:

Regex mismatch: DEMAT_MF_HEADER_RE did not match the actual header format in CAS PDFs.
Logic flow issues: When current_demat was None, relevant lines were skipped, causing incomplete MF data extraction.

Solution

Updated the regex to correctly detect Mutual Fund Folios headers.
Adjusted the parsing logic to ensure lines are not skipped when current_demat is None.
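
As a rough illustration of the logic-flow issue described above, here is a toy sketch. It is not casparser's code: the patterns, the parse function, and the skip_when_no_demat switch are all hypothetical, and only the idea of a current_demat guard dropping early lines comes from this description.

```python
import re

# Hypothetical header patterns for this sketch only (not casparser's real ones).
DEMAT_HEADER = re.compile(r"^DP ID:\s*(\w+)")
MF_HEADER = re.compile(r"^Mutual Fund Folios\b")


def parse(lines, skip_when_no_demat):
    """Group lines under the most recent section header.

    With skip_when_no_demat=True, every line seen before the first demat
    header is dropped, which mimics the skipping described in the Problem
    section. With False, a Mutual Fund Folios header can open a section too.
    """
    accounts = []
    current_demat = None
    for line in lines:
        is_demat_header = DEMAT_HEADER.match(line)
        if skip_when_no_demat and current_demat is None and not is_demat_header:
            continue  # old-style guard: MF lines are lost here
        if is_demat_header:
            current_demat = {"type": "demat", "id": is_demat_header.group(1), "rows": []}
            accounts.append(current_demat)
        elif MF_HEADER.match(line):
            current_demat = {"type": "mf", "id": None, "rows": []}
            accounts.append(current_demat)
        elif current_demat is not None:
            current_demat["rows"].append(line)
    return accounts


# Made-up input where the MF section comes before any demat account.
SAMPLE = [
    "Mutual Fund Folios 1 Folios",
    "Scheme A 10.0",
    "DP ID: IN300001",
    "ISIN X 5",
]

print([a["type"] for a in parse(SAMPLE, skip_when_no_demat=True)])
print([a["type"] for a in parse(SAMPLE, skip_when_no_demat=False)])
```

With the guard enabled, only the demat account survives; with it disabled, the MF section and its row are kept as well.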

Impact

This fix enables complete parsing of NSDL CAS statements containing Mutual Fund Folios alongside traditional demat accounts.

Signed-off-by: brf153 <153hsb@gmail.com>

brf153 · 2025-09-21T17:56:12Z

Before fix:

After fix:

codereverser · 2025-09-21T22:20:46Z

I'm not able to reproduce this issue. The current code works fine with all my test NSDL statement files, e.g.:

Can you send me the list of packages in your env? I suspect something might've changed in the underlying parser modules.

brf153 · 2025-09-22T04:34:00Z

@codereverser I’m using pip install ., which should install the packages defined in pyproject.toml. I’m using the same .toml file that’s present in the main branch of the codebase. Could this be a parser issue? I got an error saying that I need to install the pymupdf package.

brf153 · 2025-09-22T04:36:01Z

Screen.Recording.2025-09-22.100534.mp4

These are the packages

codereverser · 2025-09-22T04:45:08Z

Hi, could you try it with an older version of pymupdf, like before 1.25 (1.24.14 for eg)? I think something's messed up in the newer versions, and I'm trying to figure it out. Also, maybe install casparser with the fast extra? Just do `pip install -U casparser[fast]`.

brf153 · 2025-09-22T04:47:30Z

I tried logging values as well. These are my observations.

This is the code to get the mf data

When I use the original regex
DEMAT_MF_HEADER_RE = r"Mutual Fund Folios\s+(\d+)\s+folios\s+(\d+)\s+([\d,.]+)"

When I use the updated regex
DEMAT_MF_HEADER_RE = (
    r"(Mutual Fund Folios)\s+(\d+)\s+Folios"
    r"[\s\S]*?Total\s+\d+\s+[\d,.]+\s+[\d,.]+\s+([\d,.]+)"
)
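
To make the difference between the two patterns concrete, the sketch below runs both against two made-up text layouts. The sample strings are assumptions, not output from a real statement, and the use of re.search with re.I is also an assumption about how the pattern is applied.

```python
import re

ORIGINAL_RE = r"Mutual Fund Folios\s+(\d+)\s+folios\s+(\d+)\s+([\d,.]+)"
UPDATED_RE = (
    r"(Mutual Fund Folios)\s+(\d+)\s+Folios"
    r"[\s\S]*?Total\s+\d+\s+[\d,.]+\s+[\d,.]+\s+([\d,.]+)"
)

# Hypothetical extractions with invented values.
# Layout A: the summary numbers follow the header on the same line.
ONE_LINE = "Mutual Fund Folios 2 Folios 3 1,50,000.00"
# Layout B: the header stands alone and the totals appear on a later "Total" row.
MULTI_LINE = (
    "Mutual Fund Folios 2 Folios\n"
    "Example Scheme 100.000 10.00 1,000.00\n"
    "Total 3 1,20,000.00 30,000.00 1,50,000.00"
)


def groups(pattern, text):
    """Return the captured groups, or None when the pattern does not match."""
    match = re.search(pattern, text, re.I)
    return match.groups() if match else None


print(groups(ORIGINAL_RE, ONE_LINE))    # ('2', '3', '1,50,000.00')
print(groups(ORIGINAL_RE, MULTI_LINE))  # None
print(groups(UPDATED_RE, ONE_LINE))     # None
print(groups(UPDATED_RE, MULTI_LINE))   # ('Mutual Fund Folios', '2', '1,50,000.00')
```

Note that the two patterns capture different things: the original yields (folio count, scheme count, value), while the updated one yields (label, folio count, value). Any code consuming the match groups would need to account for that, and each pattern matches only one of the two layouts.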

brf153 · 2025-09-22T04:48:16Z

sure. I will try it and let you know

brf153 · 2025-09-22T04:53:18Z

It installs the older version of pymupdf when I use the casparser[fast] command

It’s working fine with the casparser[fast] package. Thank you!

codereverser · 2025-09-22T06:24:54Z

Thanks for checking!

Yeah, there's something broken with pymupdf 1.25+. I'm working on a new version with fixes and shall release it soon.
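
For anyone who wants to check whether their environment falls in the range discussed in this thread, a minimal sketch follows. The 1.25 threshold is taken from the comments above and is a suspicion, not a confirmed boundary; the helper names are hypothetical.

```python
from importlib import metadata


def is_suspect_pymupdf(version):
    """True when the version string is 1.25 or newer (threshold from this thread)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 25)


def installed_pymupdf_version():
    """Return the installed PyMuPDF version string, or None if it is not installed."""
    try:
        return metadata.version("pymupdf")
    except metadata.PackageNotFoundError:
        return None


version = installed_pymupdf_version()
if version is None:
    print("pymupdf is not installed")
elif is_suspect_pymupdf(version):
    print(f"pymupdf {version} is in the range reported as problematic here")
else:
    print(f"pymupdf {version} predates 1.25")
```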

brf153 · 2025-09-22T09:58:12Z

@codereverser sure. You can check the updated regex in this PR — it might help with the new changes in the latest version of the PyMuPDF package. Thanks for your help!

fix mutual fund data issue

3cbd74a

Signed-off-by: brf153 <153hsb@gmail.com>
