@honzajavorek
Collaborator

@honzajavorek honzajavorek commented Oct 17, 2025

Just a PoC related to #1243. Issues:

  • Many, many files have Python lint issues. For now I have the GitHub Action work only on changed files, but npm run lint:code:py can give you an idea. Actually, it cannot, because it stops on the first error: adamtheturtle/doccmd#569 (Support spilling out all the errors)
  • In tutorials, code blocks are incomplete and follow up on each other. The doccmd tool supports grouping them, but that requires explicit marks in the Markdown file. Doable, but a lot of manual work to get the linter passing. OTOH, sometimes giving the reader more context, like repeating imports, etc., is actually desirable for better understanding. See apify_client.md as a showcase.
  • The syntax with title="filename.py" isn't supported as of now: simplistix/sybil#155 (Recognize parametrized code blocks)
  • I'm unsure how it's gonna work with .mdx. For now I think it ignores it unless we explicitly tell it to look after those files. I don't think there's much Python code in .mdx files, though… (my unverified assumption)
  • For now I didn't include a fix option. The Python linter, ruff, does have a --fix option, but I'm unsure we want anyone to run it without thinking. Most of the unused-import errors shouldn't be auto-fixed by removing the import. Usually the page is a tutorial with several code blocks that build on each other, and the correct solution is to group them.
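
To illustrate the last point, here is a minimal sketch of why linting tutorial blocks one by one reports unused imports that disappear once the blocks are grouped. It uses only the standard library `ast` module and is a simplified stand-in for the real check, not how ruff or doccmd work internally. The `unused_imports` helper and the two sample blocks are hypothetical.

```python
import ast


def unused_imports(source: str) -> set[str]:
    """Return imported names that are never referenced in `source`.

    Simplified on purpose: it ignores scopes, `__all__`, and star imports.
    """
    tree = ast.parse(source)
    imported: set[str] = set()
    used: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds the name `a`; `import x as y` binds `y`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return imported - used


# Two hypothetical tutorial blocks that build on each other.
block_1 = "from pprint import pp\n"
block_2 = 'pp({"hello": "world"})\n'

print(unused_imports(block_1))            # linted alone: {'pp'}
print(unused_imports(block_1 + block_2))  # linted as a group: set()
```

Auto-fixing the first block on its own would delete the import and break the second block, which is why grouping looks like the safer fix here.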

@honzajavorek honzajavorek added the t-academy Issues related to Web Scraping and Apify academies. label Oct 17, 2025
@apify-service-account

Preview for this PR was built for commit 3a9fab41 and is ready at https://pr-2027.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit ce4011d1 and is ready at https://pr-2027.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 1f87a79e and is ready at https://pr-2027.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 61b0d156 and is ready at https://pr-2027.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 9cb18f16 and is ready at https://pr-2027.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 7c01b700 and is ready at https://pr-2027.preview.docs.apify.com!

@adamtheturtle

Very cool to see this @honzajavorek ! I'm happy to help you get onboarded to doccmd. For my own projects, it has been a lot of work but it pays dividends regularly.

> Many, many files have Python lint issues. For now I have the GitHub Action work just on top of changed files, but npm run lint:code:py can give you an idea. Actually, it cannot, because it stops on the first error: adamtheturtle/doccmd#569

You can update doccmd and use --continue-on-error.

In my own projects, I instead used ignores for each linter, and then just slowly removed ignores.

> In tutorials, code blocks are incomplete and follow up on each other. The doccmd tool supports grouping them, but that requires explicit marks in the Markdown file. Doable, but a lot of manual work to get the linter passing. OTOH, sometimes giving the reader more context, like repeating imports, etc., is actually desirable for better understanding. See apify_client.md as a showcase.

I'm open to ideas on how to make this better in doccmd.

> I'm unsure how it's gonna work with .mdx. For now I think it ignores it unless we explicitly tell it to look after those files.

That's right. You can use --markdown-extension to tell doccmd which extensions to use to look for Markdown files.
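
Before opting .mdx files in, it may be worth checking the assumption that they contain little Python. A minimal sketch, assuming fences are written with backticks and tagged `python` or `py`; the helper names are hypothetical and this is independent of doccmd itself:

```python
import re
from collections import Counter
from pathlib import Path

# An opening fence (optionally indented) tagged python or py, possibly
# followed by attributes such as title="main.py".
PYTHON_FENCE = re.compile(r"^[ \t]*`{3,}(?:python|py)\b", re.MULTILINE)


def count_python_fences(text: str) -> int:
    """Count Python code fences in one Markdown or MDX document."""
    return len(PYTHON_FENCE.findall(text))


def fences_by_extension(root: Path, extensions=(".md", ".mdx")) -> Counter:
    """Sum Python fences per file extension under `root`."""
    counts: Counter = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in extensions:
            text = path.read_text(encoding="utf-8", errors="ignore")
            counts[path.suffix] += count_python_fences(text)
    return counts


if __name__ == "__main__":
    # Scans the current directory; point it at the docs sources instead.
    print(fences_by_extension(Path(".")))
```

If the `.mdx` count comes out near zero, leaving those files out would cost little.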

@honzajavorek
Collaborator Author

Hi @adamtheturtle, thank you for stepping in! This was a side quest and I'm now gathering feedback on it from the rest of the team. The --continue-on-error option is a welcome change, I'll try that. That one isn't a blocker, whereas Sybil not picking up all code blocks (per MDX syntax) is.

> I'm open to ideas on how to make this better in doccmd.

Regarding this one, I cannot think of a better way, tbh. I thought about it for a moment, and while I tell the Sybil maintainer I'm okay if Sybil drops the rest of the first line of the code block after the language identifier, it would actually be cool if we could use these properties, e.g. like this:

Let's start with
```python group=example1.py
from pprint import pp
```
and then continue with
```python group=example1.py
pp({"hello": "world"})
```

In my mind, that would be very idiomatic to how Docusaurus code blocks work, and it would give us explicit control over which code examples get grouped. Still, this is just a slightly different way of doing what doccmd already does, and it wouldn't take away the manual work of marking which code blocks should be put together before being linted. It's a fuzzy thing, and on some pages it requires manually restructuring the examples so they can be grouped and linted (or tested, etc.).
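
To make the idea concrete, here is a rough sketch of how such grouping could work. The `group=` attribute is only the proposal from this comment, not something doccmd or Sybil support today, and `grouped_sources` is a hypothetical helper built on plain regular expressions:

```python
import re
from collections import defaultdict

TICKS = "`" * 3  # built at runtime so this example nests safely in Markdown

# A Python fence: the info string is "python" plus optional attributes.
FENCE = re.compile(
    rf"^{TICKS}python(?:[ \t]+(?P<attrs>[^\n]*))?\n(?P<body>.*?)^{TICKS}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
# The proposed (hypothetical) attribute, e.g. group=example1.py
GROUP = re.compile(r"(?:^|\s)group=(\S+)")


def grouped_sources(markdown: str) -> dict[str, str]:
    """Concatenate Python blocks that share a group, in document order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for index, match in enumerate(FENCE.finditer(markdown)):
        group = GROUP.search(match["attrs"] or "")
        # Blocks without a group stay on their own under a synthetic key.
        key = group[1] if group else f"<block {index}>"
        groups[key].append(match["body"])
    return {key: "".join(bodies) for key, bodies in groups.items()}


sample = "\n".join([
    "Let's start with",
    f"{TICKS}python group=example1.py",
    "from pprint import pp",
    TICKS,
    "and then continue with",
    f"{TICKS}python group=example1.py",
    'pp({"hello": "world"})',
    TICKS,
])

for name, source in grouped_sources(sample).items():
    print(f"--- {name} ---")
    print(source, end="")
```

Each value in the returned mapping is a complete snippet that could be handed to a linter as one unit, so the import in the first block no longer looks unused.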

@adamtheturtle

Very interesting! Could you please turn that into an issue on doccmd?

4 participants