A utility to convert the navigation sidebar of GitHub documentation pages from HTML to a structured Markdown format.
https://gitlab.com/brlin/github-docs-nav-sidebar-to-markdown
#utility #github-docs #markdown
This script converts the GitHub Actions documentation sidebar HTML structure to a GitHub Flavored Markdown (GFM) unordered list format.
The script extracts:
- Text content from span elements with class "prc-ActionList-ItemLabel-TmBhn"
- URL links from href attributes of anchor elements
- Hierarchical structure based on CSS depth indicators (--subitem-depth)
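The three extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses the standard library's `html.parser` instead of Beautiful Soup so that it runs without extra dependencies, it assumes the `--subitem-depth` value sits in the `style` attribute of each anchor element, and the `SidebarParser` name and the sample URLs are hypothetical.

```python
import re
from html.parser import HTMLParser

# Class name taken from this README; adjust it if the page markup changes.
LABEL_CLASS = "prc-ActionList-ItemLabel-TmBhn"
# Assumption: the depth appears as "--subitem-depth: N" in a style attribute.
DEPTH_PATTERN = re.compile(r"--subitem-depth:\s*(\d+)")


class SidebarParser(HTMLParser):
    """Collect (depth, label, href) tuples from sidebar HTML (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._depth = 0
        self._href = None
        self._label_parts = None  # a list while inside a label span, else None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a":
            # Remember the link target and depth of the enclosing anchor.
            self._href = attributes.get("href")
            match = DEPTH_PATTERN.search(attributes.get("style") or "")
            self._depth = int(match.group(1)) if match else 0
        classes = (attributes.get("class") or "").split()
        if tag == "span" and LABEL_CLASS in classes:
            self._label_parts = []

    def handle_data(self, data):
        if self._label_parts is not None:
            self._label_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._label_parts is not None:
            label = "".join(self._label_parts).strip()
            self.items.append((self._depth, label, self._href))
            self._label_parts = None
        elif tag == "a":
            self._href = None
            self._depth = 0


# Made-up sample markup that mimics the assumed sidebar structure.
SAMPLE = """
<ul>
  <li><a href="/en/actions/get-started" style="--subitem-depth: 0">
    <span class="prc-ActionList-ItemLabel-TmBhn">Get started</span></a></li>
  <li><a href="/en/actions/get-started/quickstart" style="--subitem-depth: 1">
    <span class="prc-ActionList-ItemLabel-TmBhn">Quickstart</span></a></li>
</ul>
"""

parser = SidebarParser()
parser.feed(SAMPLE)
for item in parser.items:
    print(item)
```

The sketch does not handle spans nested inside a label span. If the real markup places the depth indicator on a different element, such as the enclosing list item, the depth lookup would need to move accordingly.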
Refer to the following instructions to run the conversion script:
- Install the following runtime dependencies:
- Copy the HTML snippet of the sidebar into a local file, e.g., sidebar.html. Refer to the sample HTML file for an example.
- Edit the html_to_markdown_converter.py script to:
  - Set the input HTML file path and output Markdown file path.
  - Edit the class_ variables according to the actual HTML structure if necessary.
- Launch a text terminal.
- Change the working directory to where you saved the html_to_markdown_converter.py script.
- Run the following command to run the utility:
  python3 html_to_markdown_converter.py
The converted Markdown content will be saved to the specified output file.
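As a rough illustration of the final step, the sketch below renders already-extracted sidebar entries as a nested GFM unordered list and saves them to an output file. It is written under stated assumptions rather than taken from the script: the `OUTPUT_MARKDOWN_PATH` setting, the `render_gfm_list` and `save_markdown` helpers, the `(depth, label, href)` tuple layout, and the sample entries are all hypothetical.

```python
from pathlib import Path

# Hypothetical setting name; the real script may name its path settings differently.
OUTPUT_MARKDOWN_PATH = Path("sidebar.md")


def render_gfm_list(items, indent="  "):
    """Render (depth, label, href) tuples as a nested GFM unordered list."""
    lines = []
    for depth, label, href in items:
        # Entries without a link target are emitted as plain text.
        text = f"[{label}]({href})" if href else label
        lines.append(f"{indent * depth}- {text}")
    return "\n".join(lines) + "\n"


def save_markdown(items, output_path):
    """Write the rendered list to output_path and return the path."""
    output_path = Path(output_path)
    output_path.write_text(render_gfm_list(items), encoding="utf-8")
    return output_path


if __name__ == "__main__":
    # Made-up entries standing in for the parsed sidebar content.
    sample_items = [
        (0, "Get started", "/en/actions/get-started"),
        (1, "Quickstart", "/en/actions/get-started/quickstart"),
    ]
    print(save_markdown(sample_items, OUTPUT_MARKDOWN_PATH))
```

Two spaces per depth level are used because that is the smallest indentation that nests an item under a parent whose marker is `- `. Labels containing square brackets would need escaping, which this sketch omits.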
The following materials are referenced during the development of this product:
- Beautiful Soup Documentation
  Explains how to use the Beautiful Soup library for parsing HTML data.
Unless otherwise noted (comment headers/REUSE.toml), this product is licensed under version 3.0 of the Affero General Public License, or any more recent version of your preference.
This work complies with the REUSE Specification. Refer to the REUSE - Make licensing easy for everyone website for information regarding the licensing of this product.