Artificial Intelligence Software Agent

From MIT Technology Roadmapping



Revision as of 00:25, 4 November 2024

Roadmap Creators


Artificial Intelligence (AI) Software Agent Roadmap

  • 2AISA - Artificial Intelligence (AI) Software Agent

The technology we selected is an AI Software Agent capable of receiving natural language prompts, generating complex software development strategies, executing all programming tasks to build end-to-end software applications, and deploying those applications for business or leisure purposes. This is a Level 2 Technology Roadmap. Level 1 encapsulates the ecosystem of all AI System Technologies, while Levels 3 and 4 would include foundational technologies central to the form and function of AI, including but not limited to Machine Learning, Neural Networks, Large Language Models, and Graphics Processing Units (GPUs).


Roadmap Overview

Since ChatGPT was released in 2022, it has amazed people worldwide and marked the beginning of a new era in AI innovation. This wave of progress has expanded from hardware advances such as GPUs to breakthroughs in large language models (LLMs), natural language processing (NLP), and machine learning (ML). This year, 2024, has been named the year of AI agents due to the rapid advancements in AI technologies, with emerging trends such as multi-agent systems and agentic AI. These agents are reshaping industries by automating processes, enhancing productivity, and enabling richer multimodal interactions. Software AI agents are becoming increasingly sophisticated, offering new possibilities for automation, decision support, and human-AI collaboration across various domains.

AI agents possess several core capabilities:

  • Perception: gather data and documents from databases and APIs
  • Reasoning: analyze data, identify patterns, and make informed decisions using advanced algorithms and machine learning
  • Action: autonomously perform tasks, from answering queries to executing complex processes
  • Learning: continuously learn from experience and improve performance over time
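The four capabilities above form a perceive-reason-act-learn loop. The sketch below illustrates that loop; all class and method names are hypothetical illustrations, not an actual agent framework, and the "reasoning" step is a trivial stand-in:

```python
# Minimal sketch of the perceive-reason-act-learn loop of an AI agent.
# All names here are hypothetical illustrations, not a real framework.

class SoftwareAgent:
    def __init__(self):
        self.experience = []  # accumulated outcomes used for learning

    def perceive(self, sources):
        # Perception: gather data and documents from databases and APIs
        # (here: flatten a list of document collections)
        return [doc for src in sources for doc in src]

    def reason(self, observations):
        # Reasoning: analyze observations and choose an action
        # (here: pick the longest observation as a toy heuristic)
        return max(observations, key=len) if observations else None

    def act(self, decision):
        # Action: autonomously perform the chosen task
        return f"executed: {decision}"

    def learn(self, outcome):
        # Learning: record outcomes to improve future decisions
        self.experience.append(outcome)

    def step(self, sources):
        obs = self.perceive(sources)
        decision = self.reason(obs)
        outcome = self.act(decision)
        self.learn(outcome)
        return outcome

agent = SoftwareAgent()
result = agent.step([["fix bug", "write unit tests"], ["deploy app"]])
print(result)  # executed: write unit tests
```

A production agent would replace the toy `reason` heuristic with an LLM call and `act` with real tool execution, but the control flow stays the same.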


Framework of AI Agents - Source: [1]
AI Software Agent - Source: [2]


According to the AI Benchmarking Report by CodeSignal, while AI agents are increasingly powerful, the creativity and intuition that human engineers demonstrate when dealing with complex or cutting-edge problems remain a weakness of AI agents. This technology roadmap explores the potential of AI software agents in automation, decision making, and human-AI collaboration.

References:

  1. https://www.leewayhertz.com/ai-agents/
  2. https://yellow.ai/blog/ai-agents/
  3. https://codesignal.com/blog/engineering/ai-coding-benchmark-with-human-comparison/


Design Structure Matrix (DSM) Allocation

2AISA DSM & Relation to other Technologies

Roadmap Model using OPM

The Object-Process Model (OPM) of the 2AISA AI Software Agent is provided in the figure below. This diagram captures the main object of the roadmap, its various processes and instrument objects, and the two relevant Figures of Merit (FOMs) that characterize it: Productivity and Accuracy.


2AISA OPM

Figures of Merit (FOM)

FOM PSET2 part2.png


Alignment with “Company” Strategic Drivers: FOM Targets

Our “hypothetical” company provides software AI agents to help facilitate the software development process for individual users and business clients. The following strategic drivers are essential for ensuring our product meets market needs and stays competitive.

The first and second drivers align closely with our technology roadmap, as the AI industry is still in the early, rapid-growth phase of the S-curve. In this stage, innovation cycles are short, and all companies are focused on advancing product performance and expanding the market. The third driver, however, will become increasingly important as the market approaches saturation and companies shift their focus from pure technology advancement to competing for market share through added-value features and services around the core technology.

Strategic Drivers, Alignment, and Targets:

  1. Driver: To create value for our users by increasing their productivity and reducing development time at a reasonable price.
     Alignment and targets: The 2AISA technology roadmap prioritizes the productivity and cost-effectiveness of the software AI agent as its primary FOMs. The goal is to enhance productivity by 20% while achieving a 20% cost reduction per project.
  2. Driver: To ensure the quality and accuracy of the generated code so that our users can trust and rely on it.
     Alignment and targets: The 2AISA technology roadmap will continually advance toward 95% completion accuracy and a reduced number of violations.
  3. Driver: To deliver a seamless user experience with robust compatibility for business clients, ensuring smooth integration with other systems.
     Alignment and targets: The 2AISA technology roadmap does not currently prioritize user application enhancement; we want to build a solid foundation before expanding into user-centric features.
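The numeric targets in the first two drivers (a 20% productivity gain, a 20% cost reduction per project, and 95% completion accuracy) can be expressed as a simple check. The baseline and measured values below are invented placeholders, since the roadmap does not specify them:

```python
# Check hypothetical measurements against the roadmap's FOM targets:
# +20% productivity, -20% cost per project, 95% completion accuracy.
# Baseline and measured values are invented placeholders.

def meets_targets(baseline, measured):
    productivity_gain = measured["productivity"] / baseline["productivity"] - 1
    cost_reduction = 1 - measured["cost_per_project"] / baseline["cost_per_project"]
    return {
        "productivity": productivity_gain >= 0.20,
        "cost": cost_reduction >= 0.20,
        "accuracy": measured["completion_accuracy"] >= 0.95,
    }

baseline = {"productivity": 100.0, "cost_per_project": 5000.0}
measured = {"productivity": 125.0, "cost_per_project": 3900.0,
            "completion_accuracy": 0.96}
print(meets_targets(baseline, measured))
# {'productivity': True, 'cost': True, 'accuracy': True}
```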

Technical Model: Morphological Matrix

Morphological matrix ai sde agents.png

This morphological matrix maps out five critical decision variables (Quality, Cost, Speed, Latency, and Model Size) across 14 different LLM models, revealing key trade-offs in the current AI landscape. The matrix illustrates a clear performance-resource trade-off pattern: models with higher parameter counts (like o1 Preview at 750B and o1 Mini at 500B) generally achieve superior quality scores (0.95 and 0.93 respectively) but at significantly higher costs ($75 and $15 per 1M tokens) and increased latency (31.49s and 15.32s). Conversely, smaller models like Gemini 1.5 Flash and GPT-4o Mini demonstrate impressive efficiency with lower latency (0.38s and 0.45s) and costs ($0.37 and $0.76 per 1M tokens), while maintaining respectable quality scores (0.84 and 0.86).
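The trade-off described above can be made concrete with a Pareto-efficiency filter over the figures quoted in the matrix (quality score, cost per 1M tokens, latency). Only the four models named in the text are included; the filter itself is a generic sketch, not part of the original analysis:

```python
# Pareto-efficiency filter over the model figures quoted above.
# A model is dominated if another is at least as good on every axis
# (higher quality, lower cost, lower latency) and strictly better on
# at least one. Figures are taken from the matrix discussion.

models = {
    # name: (quality score, cost $/1M tokens, latency s)
    "o1 Preview":       (0.95, 75.00, 31.49),
    "o1 Mini":          (0.93, 15.00, 15.32),
    "Gemini 1.5 Flash": (0.84,  0.37,  0.38),
    "GPT-4o Mini":      (0.86,  0.76,  0.45),
}

def dominates(a, b):
    qa, ca, la = a
    qb, cb, lb = b
    return (qa >= qb and ca <= cb and la <= lb) and (qa > qb or ca < cb or la < lb)

pareto = [name for name, v in models.items()
          if not any(dominates(w, v) for w in models.values() if w is not v)]
print(pareto)
# ['o1 Preview', 'o1 Mini', 'Gemini 1.5 Flash', 'GPT-4o Mini']
```

Notably, every one of these four models is Pareto-efficient: none is beaten on quality, cost, and latency simultaneously, which is exactly the performance-resource trade-off the matrix illustrates.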