To create a Python script that processes an HTML file, identifies all Mermaid diagram code blocks, and attempts to convert each into a PNG image using a specified external command-line tool. If the initial conversion fails because of errors in the Mermaid code, the script uses a generative AI model (currently Google Gemini) to attempt to fix the code, and retries rendering the AI-corrected code up to a user-specified number of times. Successfully generated PNGs replace the original Mermaid code blocks in the HTML with <img> tags.
Version: 1.0 (Reflecting current script capabilities as of last update)
Date: July 15, 2024
To create a Python script that automates the processing of HTML files containing Mermaid diagram code blocks. The script will:
tag pointing to the generated PNG.
graph TD
A[Start] --> B{Parse Command-Line Arguments};
B --> C{Read Input HTML File};
C --> D{"Parse HTML (BeautifulSoup)"};
D --> E{"Find All 'div.mermaid' Blocks"};
E -- No Diagrams Found --> F[Save Original HTML to Output];
E -- Diagrams Found --> G{Loop Through Each Diagram};
F --> Z[End];
G --> H["Extract Mermaid Code & Determine Name"];
H --> I["Attempt 1: Render PNG with Executable"];
I -- Success --> J["Replace 'div.mermaid' with 'img' Tag"];
I -- "Failure (Exec Error or PNG Not Found)" --> K{"AI Retries Enabled? (ai_retries > 0 AND API Key Set?)"};
K -- No --> L["Log Failure, Keep Original 'div.mermaid'"];
L --> M{More Diagrams?};
J --> M;
K -- Yes --> N["Start AI Retry Loop (Max: ai_retries)"];
N --> O{"Current AI Attempt < Max Retries?"};
O -- Yes --> P["Prepare Prompt for AI (Code, Error, History)"];
P --> Q[Call Gemini AI API];
Q --> R{"AI Suggests New, Different Code?"};
R -- Yes --> S["Update .mmd with AI Code, Increment AI Attempt"];
%% Re-attempt rendering with new code
S --> I;
R -- "No (No Suggestion / Same Code / AI Error)" --> T["Log AI Failure, Revert .mmd to Original"];
%% Mark as failed for this diagram
T --> L;
O -- "No (Max Retries Reached)" --> T;
M -- Yes --> G;
M -- No --> U[Save Modified HTML to Output File];
U --> V[Log Processing Summary];
V --> Z;
The script must accept the following command-line arguments:
-i / --input-html (Required): Path to the input HTML file.
-o / --output-html (Required): Path to save the modified HTML file.
-d / --image-dir (Required): Directory to store generated PNG images and intermediate .mmd files. The script will create this directory if it doesn't exist.
--executable (Required): Full path to the external command-line executable (e.g., mmdc, genmermaid.sh) that converts a .mmd file to a .png file.
--executable-args (Optional, List): Additional arguments to pass to the external executable. Supports placeholders:
{input_file}: Absolute path to the temporary .mmd file.
{output_dir}: Absolute path to the image output directory.
{base_name}: The generated base name for the diagram (e.g., "my_diagram_1"). This allows forming output filenames like {output_dir}/{base_name}.png.
If omitted, the executable is assumed to take the .mmd file path as its sole primary argument and to output the PNG in the same directory as the .mmd file with the same base name.
--log-file (Required): Path for the detailed log file.
--log-level (Optional, Choices: INFO, DEBUG, Default: INFO): Sets the logging verbosity for console output. The log file always captures DEBUG level and above.
The entire process, including AI interactions, must be thoroughly logged, and the script must be configurable via command-line options.
elements.
.mmd file.
with an <img> tag in the HTML.
.mmd file and re-attempt PNG generation.
block is retained in the HTML, and the .mmd file is reverted to the original code if AI modifications were made.
The script accepts the following command-line arguments:
-i, --input-html FILE_PATH (Required): Path to the input HTML file.
-o, --output-html FILE_PATH (Required): Path to save the modified HTML file.
-d, --image-dir DIRECTORY_PATH (Required): Directory to store generated PNGs and intermediate .mmd files.
--executable EXECUTABLE_PATH (Required): Path to the external executable (e.g., mmdc) that converts .mmd to .png.
--executable-args [ARG ...] (Optional): A list of arguments to pass to the external executable. Placeholders {input_file}, {output_dir}, and {base_name} will be substituted.
Example: --executable-args -i {input_file} -o {output_dir}/{base_name}.png -w 1024
If omitted, the executable is assumed to take the .mmd file path as its sole primary argument and to output a .png file (with the same base name as the input) in the directory specified by -d (which is also the CWD for the executable).
--log-file FILE_PATH (Required): Path for the detailed log file.
--log-level {INFO,DEBUG} (Optional, Default: INFO): Console logging detail level. The log file always captures DEBUG level.
--ai-retries INTEGER (Optional, Default: 0, Range: 0-10): Maximum number of attempts for the AI to fix a broken Mermaid diagram. 0 disables AI fixing. Requires the GOOGLE_API_KEY environment variable to be set.
--image-src-prefix PREFIX_STRING (Optional, Default: ""): Prefix for the src attribute of generated <img> tags. Can be a relative path (e.g., images/) or an absolute URL (e.g., https://cdn.example.com/).
Environment Variable: GOOGLE_API_KEY must be set if --ai-retries is greater than 0 for AI functionality.
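The argument list above maps fairly directly onto Python's argparse module. The following is a minimal sketch, not the script's actual parser: the function name build_parser is hypothetical, and the use of argparse.REMAINDER for --executable-args is an assumption made here so that values beginning with a dash (such as -i {input_file}) are not mistaken for the script's own options. With that choice, --executable-args has to be the last option on the command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the options listed above (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Render Mermaid blocks in an HTML file to PNG images."
    )
    parser.add_argument("-i", "--input-html", required=True, metavar="FILE_PATH")
    parser.add_argument("-o", "--output-html", required=True, metavar="FILE_PATH")
    parser.add_argument("-d", "--image-dir", required=True, metavar="DIRECTORY_PATH")
    parser.add_argument("--executable", required=True, metavar="EXECUTABLE_PATH")
    parser.add_argument("--log-file", required=True, metavar="FILE_PATH")
    parser.add_argument("--log-level", choices=["INFO", "DEBUG"], default="INFO")
    # type=int plus choices enforces the documented 0-10 range.
    parser.add_argument("--ai-retries", type=int, choices=range(0, 11),
                        default=0, metavar="INTEGER")
    parser.add_argument("--image-src-prefix", default="", metavar="PREFIX_STRING")
    # Assumption: REMAINDER swallows everything after the flag, including
    # tokens that start with "-", so this option must come last.
    parser.add_argument("--executable-args", nargs=argparse.REMAINDER,
                        default=None, metavar="ARG")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Checking for GOOGLE_API_KEY is deliberately left out of the parser, since it depends on the environment rather than on the command line.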
AI Model Note: The script currently defaults to the gemini-1.5-pro-latest model. The target model is gemini-2.5-pro-preview-05-06, which can be configured in the script if available and preferred.
DEBUG level).
INFO or DEBUG as per --log-level).
GOOGLE_API_KEY if AI retries are enabled.
-d) if it doesn't exist.
-i).
elements with the class mermaid.
:
tag.
sanitized_caption_1 or diagram_1) using the caption or a default naming scheme.
For each extracted Mermaid diagram and its generated base name:
.mmd filename (e.g., {base_name}.mmd) and .png filename (e.g., {base_name}.png) in the image output directory.
current_mermaid_code to the original extracted code.
.mmd: Write the current_mermaid_code (which might be original or AI-modified) to its .mmd file.
.png file for this diagram to ensure a fresh generation check.
--executable and processed --executable-args (or default behavior if args are not provided).
--ai-retries (Optional, Integer, Range: 0-10, Default: 0): Maximum number of times to ask the AI to fix a broken Mermaid diagram. 0 disables AI intervention.
--image-src-prefix (Optional, String, Default: ""): A prefix for the src attribute of generated <img> tags. Can be a relative path (e.g., "images/") or an absolute URL (e.g., "https://cdn.example.com/diagrams/").
AI Model: The script currently uses gemini-1.5-pro-latest (defined by AI_MODEL_NAME internally). While the initial request specified gemini-2.5-pro-preview-05-06, a more generally available model is used for broader compatibility. This can be adjusted in the script if the specific preview model is accessible and preferred.
API Key: For AI functionality (if --ai-retries > 0), the GOOGLE_API_KEY environment variable must be set. If it is not set, AI retries are disabled even if requested.
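The API key rule above amounts to a small gate applied once at startup. A minimal sketch, assuming a hypothetical helper named effective_ai_retries and an injectable environment mapping (added here only to make the behavior easy to check):

```python
import logging
import os
from typing import Mapping, Optional

log = logging.getLogger(__name__)


def effective_ai_retries(requested: int,
                         env: Optional[Mapping[str, str]] = None) -> int:
    """Return the retry count to use, disabling AI when no API key is set."""
    env = os.environ if env is None else env
    if requested > 0 and not env.get("GOOGLE_API_KEY"):
        # Matches the rule above: retries are disabled even if requested.
        log.warning("GOOGLE_API_KEY is not set; AI retries are disabled.")
        return 0
    return requested
```

For example, effective_ai_retries(3, {}) yields 0, while the same call with a key present in the mapping yields 3.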
--log-file.
--log-level.
--input-html.
BeautifulSoup library.
elements with the class mermaid. These elements are assumed to directly contain the Mermaid diagram code as their text content.
:
tag with the class diagram-caption that immediately follows the div.mermaid element. Use its text content as the diagram name idea.
.mmd or .png extension) is unique within the current script run, appending counters if necessary (e.g., "my_diagram_1", "my_diagram_2").
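The naming rule above (sanitize the caption, fall back to a default, keep names unique within one run) could look roughly like this. The helper names sanitize_name and unique_base_name are hypothetical, and the exact sanitizing rule (lowercase, non-alphanumeric runs collapsed to underscores, a counter always appended starting at 1) is an assumption chosen to match the examples "my_diagram_1" and "diagram_1".

```python
import re
from typing import Set


def sanitize_name(text: str, default: str = "diagram") -> str:
    """Turn a caption into a filesystem-friendly slug (assumed rule)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").lower()
    return slug or default


def unique_base_name(name_idea: str, used: Set[str]) -> str:
    """Return a base name not yet in `used`, appending a counter."""
    base = sanitize_name(name_idea)
    counter = 1
    candidate = f"{base}_{counter}"
    while candidate in used:
        counter += 1
        candidate = f"{base}_{counter}"
    used.add(candidate)  # remember it for the rest of this run
    return candidate
```

The caller keeps a single `used` set for the whole run, so two diagrams with the same caption receive consecutive counters.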
For each extracted Mermaid diagram and its generated base name:
.mmd file: {base_name}.mmd in the directory specified by --image-dir.
--executable.
--executable-args, replacing placeholders ({input_file}, {output_dir}, {base_name}) with their values (or the default behavior if no arguments are provided).
DEBUG level.
0 (success).
.png file exists in the image output directory.
stdout and stderr from the executable into a single last_error_message_from_executable string. Log this detailed error.
max_ai_retries) are exhausted for this diagram OR if AI retries are disabled (max_ai_retries == 0).
.mmd file is reverted to original_mermaid_code if AI had modified it, and break from this render attempt loop.
current_mermaid_code.
last_error_message_from_executable.
DEBUG level in the log file.
DEBUG level.
INFO) and log file (DEBUG).
current_mermaid_code with the AI's suggestion.
ai_attempts_history.
.mmd). This counts as one AI retry.
.mmd file is reverted to original_mermaid_code.
If --executable-args is not provided, assume the command is .
subprocess.run(). The script must wait for the command to complete.
stdout, stderr, and the exit code from the executable.
{image_dir}/{base_name}.png) must exist after the command finishes.
stdout, and full stderr.
If --ai-retries is greater than 0 and the GOOGLE_API_KEY is set, initiate the AI-Assisted Correction Loop. Otherwise, mark the diagram as failed and proceed to the next diagram, leaving the original div.mermaid block in the HTML.
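The command construction and success check described above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: build_command and render_png are hypothetical names, the default command is assumed to be the executable followed by the absolute .mmd path, and the image directory is used as the working directory as the argument description suggests. Success requires both exit code 0 and an existing PNG file.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Tuple


def build_command(executable: str, executable_args: Optional[List[str]],
                  mmd_path: str, image_dir: str, base_name: str) -> List[str]:
    """Substitute the documented placeholders into the argument list."""
    input_file = str(Path(mmd_path).resolve())
    if not executable_args:
        # Assumed default: the .mmd path is the sole argument.
        return [executable, input_file]
    values = {
        "{input_file}": input_file,
        "{output_dir}": str(Path(image_dir).resolve()),
        "{base_name}": base_name,
    }
    command = [executable]
    for arg in executable_args:
        for placeholder, value in values.items():
            arg = arg.replace(placeholder, value)
        command.append(arg)
    return command


def render_png(command: List[str], image_dir: str,
               base_name: str) -> Tuple[bool, str]:
    """Run the tool and report (success, combined error message)."""
    png_path = Path(image_dir) / f"{base_name}.png"
    png_path.unlink(missing_ok=True)  # drop any stale PNG before rendering
    try:
        result = subprocess.run(command, cwd=str(image_dir),
                                capture_output=True, text=True)
    except OSError as exc:  # e.g. the executable does not exist
        return False, f"Could not start executable: {exc}"
    if result.returncode == 0 and png_path.is_file():
        return True, ""
    error = (f"Exit code: {result.returncode}\n"
             f"STDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}")
    return False, error
```

Plain string replacement is used instead of str.format so that other braces in user-supplied arguments are left untouched.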
This loop is entered if the initial PNG generation fails and AI retries are enabled. It repeats up to --ai-retries times for the current diagram.
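The control flow of this loop can be sketched independently of any particular AI client. In the sketch below, render and suggest_fix are hypothetical callables standing in for the external tool and the Gemini call, and the function name and return shape are assumptions. The loop stops when rendering succeeds, when retries are exhausted, or when the AI returns nothing new; on failure the original code is returned so the caller can revert the .mmd file.

```python
from typing import Callable, List, Optional, Tuple

RenderFn = Callable[[str], Tuple[bool, str]]
SuggestFn = Callable[[str, str, List[Tuple[str, str]]], Optional[str]]


def render_with_ai_retries(original_code: str, render: RenderFn,
                           suggest_fix: SuggestFn,
                           max_ai_retries: int) -> Tuple[bool, str, int]:
    """Return (success, final_code, ai_attempts_used)."""
    current_code = original_code
    history: List[Tuple[str, str]] = []  # failed AI suggestions and their errors
    attempts = 0
    ok, error = render(current_code)  # attempt 1 uses the original code
    while not ok and attempts < max_ai_retries:
        suggestion = suggest_fix(current_code, error, history)
        if not suggestion or suggestion.strip() == current_code.strip():
            break  # no suggestion, or the same code again
        attempts += 1  # each rendered suggestion counts as one AI retry
        current_code = suggestion
        ok, error = render(current_code)
        if not ok:
            history.append((current_code, error))
    if not ok:
        current_code = original_code  # revert to the extracted code
    return ok, current_code, attempts
```

Passing the history into suggest_fix lets the prompt mention earlier failed attempts, which is what the prompt description calls for.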
sequenceDiagram
participant Script
participant ExternalTool as External Diagram Tool
participant GeminiAI as Gemini AI
Note over Script: PNG generation failed for current Mermaid code.
Script->>Script: Prepare error message (stdout + stderr from ExternalTool).
Script->>Script: Construct prompt for AI (instructions, failing code, error message, AI attempt history for this diagram).
Script->>GeminiAI: Call API with prompt.
GeminiAI-->>Script: Return suggested Mermaid code / No suggestion / Error.
alt AI provides new, different, valid code
Script->>Script: Log AI suggestion (console & file).
Script->>Script: Overwrite .mmd file with AI's code.
Script->>Script:
tag.
src attribute:
If --image-src-prefix is provided, use it (joined correctly with the PNG filename, handling URL or path cases).
alt attribute using the diagram's sanitized caption/name.
with this new <img> tag.
stdout and NOT modify this part of the HTML; leave the original block.
-o).
--log-file) at DEBUG level.
--log-level argument.
stderr) from the external tool.
stdout + stderr) from the *most recent* failed execution of the external tool.
stdout + stderr) from the executable for the current failing code.
```mermaid ... ```
or
``` Loop];
S -- Yes --> U[AI Fix Sub-Process];
U --> V{New AI Suggestion?};
V -- Yes --> W[Update current_mermaid_code = AI suggestion, Log Suggest ... ```
).
.mmd file with this new AI-suggested code.
If the .mmd file was modified by any AI suggestion during the attempts for this diagram, revert its content back to the original Mermaid code extracted from the HTML.
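Before an AI suggestion is written to the .mmd file, any Markdown fence wrapped around it has to be removed, as noted above. A minimal sketch of that clean-up, assuming a hypothetical helper named extract_mermaid_code and assuming the first fenced block in the response is the one that matters:

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches an opening fence with an optional "mermaid" tag, then captures
# everything up to the next closing fence.
_FENCE_RE = re.compile(
    FENCE + r"(?:mermaid)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_mermaid_code(response_text: str) -> str:
    """Return the Mermaid code from an AI response, without fences."""
    match = _FENCE_RE.search(response_text)
    code = match.group(1) if match else response_text
    return code.strip()
```

If the response contains no fence at all, the text is returned as-is after trimming whitespace, so an unfenced answer is still usable.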
tag.
src attribute:
If --image-src-prefix is provided, use it combined with the PNG filename (e.g., {image_src_prefix}/{base_name}.png). Handle URL joining correctly if the prefix is a URL.
If --image-src-prefix is empty, calculate the relative path from the location of the output HTML file to the generated PNG file.
alt attribute using the diagram's sanitized caption/name.
for this diagram and replace it entirely with the new <img> tag.
block intact.
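The two src rules above (prefix given, or relative path from the output HTML file) can be sketched with the standard library alone. The helper names build_img_src and build_img_tag are hypothetical, and joining the prefix with a single forward slash is an assumption that covers both the path and URL examples. The sketch only builds strings and leaves the actual element replacement to the HTML parser.

```python
import os
from html import escape


def build_img_src(png_path: str, output_html_path: str,
                  image_src_prefix: str = "") -> str:
    """Compute the src value for a generated PNG."""
    png_name = os.path.basename(png_path)
    if image_src_prefix:
        # Works for "images/" and for "https://cdn.example.com/" alike.
        return image_src_prefix.rstrip("/") + "/" + png_name
    html_dir = os.path.dirname(os.path.abspath(output_html_path))
    relative = os.path.relpath(png_path, start=html_dir)
    return relative.replace(os.sep, "/")  # URLs always use forward slashes


def build_img_tag(src: str, alt_text: str) -> str:
    """Render the replacement tag with escaped attribute values."""
    return (f'<img src="{escape(src, quote=True)}" '
            f'alt="{escape(alt_text, quote=True)}">')
```

Escaping both attributes keeps a caption that contains quotes or angle brackets from breaking the generated markup.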
After attempting to process all found Mermaid diagrams:
--output-html
.
At the end of the script execution, log a summary to the console and log file, including:
s found.