
FlakeWarden: 90.7% Accuracy and a 0% Safety False-Positive Rate on Flaky-Test Triage, on UiPath Maestro
Flaky tests are the most corrosive failure mode in CI: a red build might be a real regression or just noise, so engineers learn to ignore red builds until a real bug ships. FlakeWarden answers the only question that matters for each failure: is it a real defect, a flaky test, or an environment problem? A deterministic flake-scorer handles the clear cases, a grounded UiPath Agent Builder classifier handles the ambiguous ones, and Maestro orchestrates the flow with a human approving every change. The result is 90.7% accuracy on a 150-case corpus, with a 0% safety false-positive rate enforced by mechanism.
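The routing described above can be sketched roughly as follows. This is a minimal illustration, not FlakeWarden's actual code: the `Failure` fields, the rerun thresholds, and the names `score_clear_case`, `triage`, and `apply_change` are all hypothetical, and the classifier is passed in as a plain callable rather than a real UiPath Agent Builder call. The point is the shape: deterministic rules first, the classifier only for what the rules cannot settle, and a hard approval gate before any change.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    REAL_DEFECT = "real_defect"
    FLAKY = "flaky"
    ENVIRONMENT = "environment"


@dataclass
class Failure:
    # All fields are hypothetical inputs chosen for illustration.
    test_id: str
    reruns: int           # reruns of the failing test on the same commit
    rerun_failures: int   # how many of those reruns failed
    infra_error: bool     # True if the log matched a known infrastructure pattern


@dataclass
class Triage:
    failure: Failure
    verdict: Verdict
    source: str             # "scorer" or "classifier"
    approved: bool = False  # stays False until a human signs off


def score_clear_case(f: Failure) -> Optional[Verdict]:
    """Deterministic rules for clear cases; None means the case is ambiguous.

    The thresholds here are assumptions, not the project's real scoring rules.
    """
    if f.infra_error:
        return Verdict.ENVIRONMENT
    if f.reruns >= 3 and f.rerun_failures == f.reruns:
        return Verdict.REAL_DEFECT  # fails every time on the same commit
    if f.reruns >= 3 and f.rerun_failures == 0:
        return Verdict.FLAKY        # the original failure never reproduces
    return None


def triage(f: Failure, classify: Callable[[Failure], Verdict]) -> Triage:
    """Use the deterministic scorer first; call the classifier only if it abstains."""
    verdict = score_clear_case(f)
    if verdict is not None:
        return Triage(f, verdict, "scorer")
    return Triage(f, classify(f), "classifier")


def apply_change(t: Triage) -> str:
    """Refuse to act on any triage result that a human has not approved."""
    if not t.approved:
        raise PermissionError(f"{t.failure.test_id}: human approval required")
    return f"{t.failure.test_id}: applied action for {t.verdict.value}"
```

Putting the approval check inside `apply_change` rather than in the caller is one way a safety property can be enforced by mechanism instead of by model accuracy: no verdict, however confident, reaches the action path without the `approved` flag being set by a person.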

























