PyRIT has been battle-tested by Microsoft's AI red team. It started as a collection of individual scripts used during the team's initial foray into red teaming generative AI systems in 2022. As the team engaged with various generative AI systems and explored different risks, they incorporated new features they found beneficial.

The tool should not be seen as a substitute for manual red teaming of generative AI systems. Instead, it augments an AI red teamer's existing domain expertise by automating the more mundane tasks. PyRIT helps identify potential risk areas, enabling security professionals to focus precisely on these critical spots.

"The biggest advantage we have found so far using PyRIT is our efficiency gain. For instance, in one of our red teaming exercises on a Copilot system, we were able to pick a harm category, generate several thousand malicious prompts, and use PyRIT's scoring engine to evaluate the output from the Copilot system all in the matter of hours instead of weeks," wrote Ram Shankar Siva Kumar, Microsoft AI Red Team Lead.

PyRIT enables researchers to refine and strengthen their defenses against various harms. For instance, Microsoft uses the tool to iterate on different product versions (and their associated metaprompts), aiming to protect against prompt injection attacks.

PyRIT goes beyond being just a tool for generating prompts. It adapts its strategy based on feedback from the generative AI system, creating subsequent inputs for the AI system. This automated process continues until the security professional achieves their targeted objective.
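The adaptive loop described above, generate a prompt, score the target's response, and refine the next attempt based on that feedback, can be sketched roughly as follows. This is an illustrative sketch of the general technique, not PyRIT's actual API; all names (`RedTeamLoop`, `target`, `scorer`, `mutate`) are hypothetical placeholders.

```python
# Illustrative sketch of an adaptive red-teaming loop (NOT PyRIT's real API).
# A prompt is sent to the target system, the response is scored, and the
# prompt is mutated based on the score until the objective is met or the
# turn budget runs out.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class RedTeamLoop:
    target: Callable[[str], str]         # generative AI system under test (hypothetical)
    scorer: Callable[[str], float]       # scoring engine: returns 0.0..1.0 (hypothetical)
    mutate: Callable[[str, float], str]  # refines the prompt from feedback (hypothetical)
    threshold: float = 0.9               # score at which the objective counts as achieved
    history: list = field(default_factory=list)

    def run(self, seed_prompt: str, max_turns: int = 10) -> bool:
        prompt = seed_prompt
        for _ in range(max_turns):
            response = self.target(prompt)
            score = self.scorer(response)
            self.history.append((prompt, response, score))
            if score >= self.threshold:
                return True              # objective achieved
            prompt = self.mutate(prompt, score)  # adapt strategy and try again
        return False                     # turn budget exhausted
```

In a real engagement the scorer would itself often be a model-based classifier for the chosen harm category, and the mutation step would draw on attack strategies rather than simple string edits; the point here is only the feedback-driven control flow.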
About OODA Analyst
OODA comprises a unique team of international experts capable of providing advanced intelligence and analysis, strategy and planning support, risk and threat management, training, decision support, crisis response, and security services to global corporations and governments.