Two “straight, no chaser,” best-in-class explorations of how best to implement LLMs within your organization.
The Carnegie Mellon Software Engineering Institute (SEI) and the Special Competitive Studies Project (SCSP) – in collaboration with the Johns Hopkins University Applied Physics Laboratory (JHUAPL) – explore LLM use cases in software engineering and acquisition and “a framework for identifying highly consequential AI use cases”, respectively.
In this white paper, the authors examine how decision makers, such as technical leads and program managers, can assess the fitness of large language models (LLMs) to address software engineering and acquisition needs. They introduce exemplar scenarios in software engineering and software acquisition, and they identify common archetypes. The authors also describe common concerns involving the use of LLMs and enumerate tactics for mitigating those concerns. Using these common concerns and tactics, the authors demonstrate how decision makers can assess the fitness of LLMs for their own use cases through two examples.
“The idea of harnessing LLMs to enhance the efficiency of software engineering and acquisition activities holds special allure for organizations with large software operations, such as the Department of Defense (DoD), as doing so offers the promise of substantial resource optimization. Potential use cases for LLMs are plentiful, but knowing how to assess the benefits and risks associated with their use is nontrivial. Notably, to gain access to the latest advances, organizations may need to share proprietary data (e.g., source code) with service providers. Understanding such implications is central to intentional and responsible use of LLMs, especially for organizations managing sensitive information.”
This SEI report examines “how decision makers, such as technical leads and program managers, can assess the fitness of LLMs to address software engineering and acquisition needs. We first introduce exemplar scenarios in software engineering and software acquisition and identify common archetypes. We describe common concerns involving the use of LLMs and enumerate tactics for mitigating those concerns. Using these common concerns and tactics, we demonstrate how decision makers can assess the fitness of LLMs for their own use cases through two examples.
Capabilities of LLMs, risks concerning their use, and our collective understanding of emerging services and models are evolving rapidly [Brundage et al. 2022]. While this document is not meant to be comprehensive in covering all software engineering and acquisition use cases, their concerns, and mitigation tactics, it demonstrates an approach that decision makers can use to think through their own LLM use cases as this space evolves.”
For the full report, go to this link.
1 – Ozkaya, I. Application of Large Language Models to Software Engineering Tasks: Opportunities, Risks, and Implications. IEEE Software, Vol. 40, No. 3, May–June 2023, pp. 4–8. https://ieeexplore.ieee.org/document/10109345
2 – Brundage, M.; Mayer, K.; Eloundou, T.; Agarwal, S.; Adler, S.; Krueger, G.; Leike, J.; & Mishkin, P. Lessons Learned on Language Model Safety and Misuse. OpenAI, March 2, 2022. https://openai.com/research/language-model-safety-and-misuse
This document sets forth a framework for identifying highly consequential AI use cases (ID HCAI Framework) that might have significant benefits for, or harms to, society. The framework is a tool that can help regulators ensure that the development and use of AI systems align with democratic values. By using this template, regulators can focus their efforts on AI systems that are highly consequential to the public, standardize their approach across sectors, and adapt their approach to the specific needs of different sectors. Additionally, by documenting their processes and decision making, regulators can help to ensure accountability and transparency. This framework template should be adopted and tailored by regulators for sector-specific needs.
Some AI use cases will require regulatory focus, while others will not. This framework aims to help regulators identify AI use cases in the “gray area.” An initial high-level assessment determines whether the AI use case under consideration warrants the resources for a more thorough evaluation of whether the AI system is highly consequential. This initial judgment asks whether the AI development or use case has foreseeable harms that could significantly impact health, safety, or fundamental rights, or substantial benefits that should strongly incentivize its development and use. If not, then no further assessment is required and the AI use case is determined not to be of high consequence.
Otherwise, the complete framework should be applied to help determine whether the AI use case is highly consequential to individuals or society. A suggested best practice is to document the process and rationale used at every decision point in the ID HCAI Framework. It is also recommended that regulators establish a registry of evaluated AI use cases and their classifications with exceptions (e.g., for national security or justified industry secrecy). The registry should contain a mechanism by which the public can provide input on these evaluated AI system use cases. This will inform the public of assessments and classifications, allowing them to inform regulators of any contextual changes triggered by the continued use of the AI system that may affect periodic reassessments. It will also have the added benefit of informing industry about the regulators’ evaluation process.
The framework interprets AI as computational systems that perform some of the predictions, recommendations, classifications, and other decision making that traditionally are in the province of humans. This definition includes systems which are not possible without AI, and those that make use of AI-based components, AI-enabled functions, or AI-derived algorithms. The framework is intended for assessments of AI systems as a whole, rather than their components, and of the concrete impacts on society that result from how they change the context or condition of society. It further proposes that assessments be performed by regulators with input from multi-disciplinary experts, including the public, which is best positioned to evaluate impacts on society. In addition, societal impacts are those resulting both from the use of the AI system and from its development (e.g., impacts on data workers and environmental impacts).
Three AI lifecycle points at which the framework can be applied:
- Regulators foresee a new application for AI,
- A new application for AI is under development or proposed to a regulatory body, and
- An existing AI system has created a highly consequential impact that triggers an ex post facto regulatory review.
The high-level steps to the framework are:
- Preliminary analysis: Determine whether the AI application has foreseeable harms or benefits that could impact, for example, health, safety, or fundamental rights, and consequently may need to be regulated. This is intended to be an initial filter to determine whether a fuller assessment is needed.
- Parallel analysis of harms and benefits: If there is foreseeable harm or benefit, conduct a more comprehensive harms/benefits analysis, which involves performing parallel harm and benefit assessments.
  - Enumerate and evaluate the magnitude of foreseeable and actual harms from the AI system development and use.
  - Enumerate and evaluate the magnitude of foreseeable and actual benefits from the AI system development and use.
- Final decision on high consequence: Using the magnitude assessment results, determine if the AI use case is of high consequence.
  - If yes, a sector-specific regulator must determine how best to take next steps to regulate the AI development and/or use (e.g., whether to create incentives, mitigate harms, or establish bans).
- Periodic reassessment: Periodically monitor sectoral AI use to determine if the list of AI systems identified as highly consequential remains appropriate for that sector given contextual changes and whether revisions to classifications are necessary.
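The high-level steps above can be sketched as a simple decision flow. This is an illustrative sketch only: the class name, the `threshold` value, and the idea of scoring magnitudes as numbers are all assumptions for the example, not part of the SCSP framework itself.

```python
# Minimal sketch of the ID HCAI Framework's high-level decision flow.
# All names, scores, and the threshold are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class UseCaseAssessment:
    name: str
    foreseeable_harm: bool                                   # preliminary-analysis inputs
    foreseeable_benefit: bool
    harm_magnitudes: list = field(default_factory=list)      # scores from the harm analysis
    benefit_magnitudes: list = field(default_factory=list)   # scores from the benefit analysis

def is_highly_consequential(a: UseCaseAssessment, threshold: float = 3.0) -> bool:
    """Walk the framework's steps: preliminary filter, parallel analysis, final decision."""
    # Step 1: preliminary analysis - no foreseeable harm or benefit means
    # no fuller assessment is needed; the use case is not of high consequence.
    if not (a.foreseeable_harm or a.foreseeable_benefit):
        return False
    # Step 2: parallel harms/benefits analysis - here, take the largest
    # magnitude found on either side.
    peak = max(a.harm_magnitudes + a.benefit_magnitudes, default=0.0)
    # Step 3: final decision - high consequence if any magnitude clears
    # the (assumed) threshold; regulation of some form then follows.
    return peak >= threshold
```

A regulator applying the framework would, of course, document the rationale at each decision point rather than rely on a single numeric cutoff; the sketch only shows the order of the steps.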
Categories of Harms and Benefits
The potential specific harms and benefits are grouped into ten corresponding categories (see Table 1 and Appendix 1). The framework provides specific harms and benefits for each category, as examples, with the recognition that specific harms/benefits will be unique to sectors. Harms and benefits are further characterized by magnitude (e.g., the scope of a harm or benefit). The framework provides factors to calculate the magnitude of an identified harm or benefit. Specifically, harms are characterized by four severity factors and four likelihood factors. Benefits are characterized by four impact factors and two likelihood factors. Lastly, the document offers ways to make high consequence determinations based on the quantified results of the system analysis.
Table 1. ID HCAI Framework Corresponding Categories of Harms and Benefits: ten harm categories (left) and ten benefit categories (right), with corresponding categories of harm and benefit identified on the same row.
Appendix 1 provides tables of specific harms and benefits for each category, with corresponding descriptions. These tables are intended to guide the framework user (e.g., a sector-specific regulator) through consideration of examples of the types of specific harms or benefits associated with each category. Note that potential violation of fundamental rights is incorporated into the specific harms lists. The tables are meant to be illustrative and not exhaustive lists.
As noted in the high-level steps above, this process begins with an AI development or use case for review. While no application will be completely free of any potential harm and all presumably have some potential benefit, the framework assumes that this process has been employed because the possibility of some significant AI-related harm or benefit has been identified as a reasonable outcome. To determine whether the AI development or use should be regulated, a framework user should explore the extent of those potential harms and benefits.
For the entire SCSP report, go to this link.
Additional OODA Loop Resources
Computer Chip Supply Chain Vulnerabilities: Chip shortages have already disrupted various industries. The geopolitical aspect of the chip supply chain necessitates comprehensive strategic planning and risk mitigation. See: Chip Stratigame
Technology Convergence and Market Disruption: Rapid advancements in technology are changing market dynamics and user expectations. See: Disruptive and Exponential Technologies.
The New Tech Trinity: Artificial Intelligence, BioTech, Quantum Tech: This new Tech Trinity will redefine our economy, both threaten and fortify our national security, and revolutionize our intelligence community. None of us are ready for this. This convergence requires a deepened commitment to foresight, preparation, and planning on a level that is not occurring anywhere. See: The New Tech Trinity.
AI Discipline Interdependence: There are concerns about uncontrolled AI growth, with many experts calling for robust AI governance. Both positive and negative impacts of AI need assessment. See: Using AI for Competitive Advantage in Business.
Benefits of Automation and New Technology: Automation, AI, robotics, and Robotic Process Automation are improving business efficiency. New sensors, especially quantum ones, are revolutionizing sectors like healthcare and national security. Advanced WiFi, cellular, and space-based communication technologies are enhancing distributed work capabilities. See: Advanced Automation and New Technologies
Emerging NLP Approaches: While Big Data remains vital, there’s a growing need for efficient small data analysis, especially with potential chip shortages. Cost reductions in training AI models offer promising prospects for business disruptions. Breakthroughs in unsupervised learning could be especially transformative. See: What Leaders Should Know About NLP