Delivery Risk (Delivery and Technical Risk) Agent

Overview

(Last revision: 11/26/25)

Allstacks Deep Research Agents offer AI-powered analysis tools that examine your engineering data to identify risks, provide context-aware insights, and deliver actionable recommendations.

Alpha Preview Notice: The Delivery Risk Agent is under active development. Features, output formats, and capabilities are subject to change based on customer feedback and ongoing improvements.

The Delivery Risk Report performs multi-pass analysis to identify delivery risks and explain the "why" behind potential delivery issues. Unlike traditional dashboards that simply display metrics, this report understands patterns, detects anomalies, and provides specific, actionable guidance to help teams stay on track.

 


What Makes It Different

Context-Aware Intelligence

The Delivery Risk Report doesn't analyze items in isolation. It considers the overall project context, team dynamics, historical patterns, and interdependencies to provide meaningful risk assessments that reflect the real situation your team faces.

Progressive Understanding

The report learns from previously generated reports, tracking what has changed since the last analysis and highlighting new risks or improvements. This helps teams focus on what matters most right now rather than repeatedly reviewing the same information.

Metric-Informed Analysis

By integrating key metrics like velocity trends, cycle time patterns, and forecasting data, the report provides risk assessments grounded in your team's actual performance data rather than generic assumptions.

Balanced Risk Assessment

The report now includes both positive findings and risk indicators, providing a complete picture that helps contextualize real risks and identifies strengths to build upon.


Understanding the Delivery Risk Report

Purpose

The Delivery Risk Report assesses whether an effort is likely to be delivered on time or face significant delays.

What It Analyzes

  • The parent work item (epic, initiative, milestone, etc.)

  • All child tickets under that parent

  • Pull requests associated with those tickets

  • Commits linked to the work

  • Team contribution patterns and role assignments

  • Historical delivery patterns

Important: The report does NOT analyze raw source code directly, but it does examine commit messages, PR comments, code change patterns, and review behaviors.

Is it Limited by Team or Workspace Configurations?

No. The Delivery Risk Report investigates all relevant data for the selected parent item from all the tools you have connected to Allstacks, regardless of team or workspace settings. Team and project information is, however, used in the final report when describing which teams or roles are responsible for which items.

When to Use

  • Optimal: Run on in-flight epics that are actively being worked on

  • Generate reports regularly (weekly or bi-weekly) during active development

  • Use when you need to understand if deadlines are at risk

  • Review before stakeholder updates or planning adjustments


Understanding Risk Scores

Delivery Risk (0-10 scale)

Assesses whether work will be completed on time:

  • Focuses on timeline and deadline feasibility

  • Looks for patterns that suggest delays; longer expected delays produce higher scores

  • Considers velocity, scope, and progress trends

  • Accounts for team capacity and competing priorities

Technical Risk (0-10 scale)

Examines coding behaviors and practices:

  • Identifies patterns where technical issues could arise

  • Reviews code change patterns and development habits

  • Flags potential quality, maintainability, or architectural concerns

  • Considers test coverage, review practices, and infrastructure changes

Risk Status Indicators

  • High (Red): Immediate attention required - significant delivery or technical issues identified

  • Medium (Yellow): Watch closely - concerning patterns that could escalate

  • Low (Green): On track - normal project challenges within acceptable range

  • Risk Trend Arrows: ▲▲ (increasing), ▲ (slightly increasing), ► (stable), ▼ (improving)

Important Note on Risk Scoring: The system is actively being improved to provide more accurate and meaningful risk differentiation at the individual ticket level.
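The mapping from numeric scores to status bands can be sketched as a simple threshold function. Note that the report does not publish its exact cutoffs, so the thresholds below (7.0 and 4.0) are illustrative assumptions only:

```python
def risk_status(score: float) -> str:
    """Map a 0-10 risk score to a status band.

    The 7.0 and 4.0 cutoffs are hypothetical; the actual report's
    thresholds are not documented and may differ.
    """
    if score >= 7.0:
        return "High (Red)"
    if score >= 4.0:
        return "Medium (Yellow)"
    return "Low (Green)"

print(risk_status(8.1))  # High (Red)
print(risk_status(5.0))  # Medium (Yellow)
print(risk_status(2.3))  # Low (Green)
```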


Report Output Structure

The Delivery Risk Report delivers a PDF as the primary report for stakeholders, with supporting materials that provide transparency and enable deeper analysis.

PDF Report (Primary Deliverable)

The main report format for executive review and stakeholder presentations, featuring:

  • Visual risk indicators and color-coding

  • Executive summary with key findings

  • Analysis of highest-risk items

  • Prioritized action recommendations

Supporting Materials

Per-Item Risk Details CSV: Complete analysis data with one row per analyzed ticket, including risk scores, summaries, suggested actions with actors, and ticket links. Use this to:

  • Analyze items beyond the Top 5 in the PDF

  • Track trends over time

  • Create custom views and filters

  • See what data properties the AI used in its assessment
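Because the CSV has one row per analyzed ticket, it is easy to filter beyond the PDF's Top 5 with standard tooling. A minimal sketch using Python's standard library is shown below; the column names (`ticket_id`, `delivery_risk`, `summary`) and sample rows are hypothetical — check your exported file's header for the actual names:

```python
import csv
import io

# Hypothetical CSV content standing in for an exported Per-Item Risk Details file.
# Column names are assumptions; inspect your real export's header row.
sample = """ticket_id,delivery_risk,technical_risk,summary
PROJ-101,8.4,6.1,Large unreviewed infrastructure change
PROJ-102,7.9,7.2,Status discrepancy vs. commit activity
PROJ-103,3.1,2.0,Routine subtask nearing completion
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Sort by delivery risk, highest first, to look past the report's Top 5
rows.sort(key=lambda r: float(r["delivery_risk"]), reverse=True)
high = [r["ticket_id"] for r in rows if float(r["delivery_risk"]) >= 7.0]
print(high)  # ['PROJ-101', 'PROJ-102']
```

With a real export you would replace the inline sample with `open("per_item_risk_details.csv")` (hypothetical filename) and apply the same filter.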

Project Actors Summary: Identifies team members based on system activity to provide context for action assignments and work allocation decisions.

Plain Text Format (In Development): Unformatted version for dropping into AI tools, rapid prototyping, or custom integrations.

The CSV files serve as both detailed backup data and the agent's "show your work" capability—providing full transparency into the analysis.


Report Sections Explained

Summary Section

Epic Overview: The report begins with the epic name, current status, and overall risk indicator (Low/Medium/High).

Overall Risk Scores: Delivery Risk and Technical Risk scores (rated 0-10) with clear definitions of what each score measures.

Risk Status Trend: A visual indicator showing whether the risk is increasing or decreasing compared to previous reports.

Top 5-10 High-Impact Actions

This section prioritizes the most critical interventions needed:

Each action includes:

  • Associated ticket ID and title

  • Clear description of what needs to be done and why

  • Assigned role with identified team member (e.g., "Engineering Manager [likely Camilo Romero]")

  • Risk reduction impact score (e.g., ↓3.2 indicates a 3.2-point risk reduction on the 10-point scale)

  • Specific deliverables or artifacts expected from each action

  • Action type classification (resource/scope/process/technical/testing/etc.)

Role Assignments: Actions specify who should take the lead:

  • Lead Developer: Technical specifications, code reviews, architecture decisions

  • Engineering Manager: Process improvements, team coordination, resource allocation

  • QA Team: Test planning, validation strategies, coverage improvements

  • PM/Product Manager: Requirements clarification, stakeholder communication, scope management

  • DevOps Engineer: Infrastructure planning, deployment strategies, operational considerations

How Actors Are Identified: The system discovers role assignments from actual contribution patterns:

  • Commit and PR activity

  • Ticket updates and comments

  • Review patterns

  • Team roster information (when available and consistent with behavior)

  • Reconciliation between stated roles and observed behavior

When confidence is low, the system provides "best guess" assignments with confidence indicators.

Top At-Risk Child Items

Detailed analysis of the highest-risk tickets within the epic:

Each item displays:

  • Separate Delivery Risk and Technical Risk scores

  • Comprehensive risk narrative explaining specific factors

  • Multiple prioritized action items with role assignments

  • Detailed instructions and expected deliverables

  • Risk reduction impact scores

  • Direct link to the ticket

Analysis Considerations:

  • Ticket type (e.g., subtasks are evaluated differently than full stories)

  • Ticket status and lifecycle stage

  • Appropriate expectations for different ticket types

  • Historical context and completion status

Pattern Recognition

The report identifies systematic issues across multiple tickets, such as:

  • Insufficient code review practices for infrastructure changes

  • Status discrepancies indicating communication gaps

  • Large code additions without proper review context

  • Coordination challenges across multiple repositories

  • Performance implications of database changes

  • Workflow shortcuts approaching deadlines

These patterns help teams address root causes rather than just symptoms, leading to more sustainable improvements.

Balanced Findings

New Feature: The report now includes positive findings alongside risks:

  • Strengths and effective practices to build upon

  • Well-executed work that can serve as examples

  • Areas where the team is performing well

  • Context that helps distinguish real risks from false alarms


How to Read Action Items

Each recommended action in the report includes:

Priority Ordering: Actions are sorted by impact, with the highest-value interventions first. At least one "do this today" critical action is included.

Risk Reduction Scores: The number shows how much implementing this action will reduce overall risk. For example, ↓3.2 means the action reduces risk by 3.2 points on the 10-point scale. Actions include predicted impact on both Delivery Risk and Technical Risk.

Role Assignments: Each action specifies the role best suited to lead (e.g., Lead Developer, Engineering Manager).

Actor Identification: Where possible, specific team members are suggested based on their activity patterns, with confidence levels indicated.

Specific Deliverables: Each action describes concrete outputs expected (documentation, test plans, specifications, meetings, etc.).

Success Criteria: Guidance on how to validate that the action achieved its intended outcome.


Choosing the Right Work Items

What Types of Epics/Projects/Initiatives Work Best?

The best candidates are:

  • Epics - Groups of related stories working toward a common goal

  • Initiatives - Large strategic efforts spanning multiple epics

  • Features - Substantial functionality broken into smaller work items

  • Milestones - Time-boxed collections of related work

Important: Work items must be containers with child tickets. Simple labels (like "BUGS") or flat lists of unrelated items are not good candidates for Deep Research analysis.

Does the Work Item Need to Be Complete?

No, the work item doesn't need to be complete, but it should have child tickets associated with it. The analysis becomes more valuable as more work accumulates:

  • New work items (recently created with minimal history) are unlikely to yield meaningful insights

  • In-flight work (actively being worked on) provides the most actionable risk assessment

  • Nearly complete work offers valuable retrospective insights but less actionable guidance

The more data available (completed tickets, commit history, PR activity), the more patterns and insights the analysis can uncover.

Scope of Analysis

Current Behavior:

  • The system analyzes all tickets linked to the parent item

  • Risk analysis prioritizes the most impactful items for deep analysis

  • The top 50 highest-risk items receive a detailed assessment in the final report

Future Improvements:

  • Broader initial scope with less detailed analysis of older/inactive tickets

  • Improved handling of temporary links and removed items

  • Impact scoring (not just risk scoring) to include positive signals

  • Summarization of similar or low-impact tickets as single units


Data Quality and Known Limitations

Alpha Preview Status

The Delivery Risk Report is currently in Alpha Preview, which means:

  • The AI analysis and recommendations are continuously improving

  • Report format and features may evolve based on customer needs

  • You may encounter occasional inconsistencies or areas for improvement

  • Your feedback directly shapes the product development roadmap

Current Known Limitations

Risk Score Calibration: Individual ticket risk scores are being refined. Currently, you may observe:

  • Many tickets with similar or identical risk scores

  • High baseline risk scores across most items

  • More reliable relative comparisons than absolute scores

Overall epic-level risk assessment and relative prioritization within a body of work remain valuable.

Code Linkage: The system is actively improving how it links code changes to specific tickets:

  • Erroneous code attribution may occur through parent-child relationships

  • Some tickets may incorrectly show large code changes

  • Master branch syncs may not be fully excluded yet

Ticket Type Handling: Different ticket types (stories, tasks, subtasks, bugs) are being evaluated with more nuanced expectations:

  • Subtasks don't require the same code review or QA processes as full stories

  • Short-lived subtasks should not be flagged as concerning

  • Status-specific risk analysis is being enhanced

Dependency Claims: The AI may occasionally speculate about dependencies or integrations. Speculative risks are being better differentiated from evidenced risks in the report.

Scope Boundaries:

  • Temporarily linked or quickly descoped items may still appear in analysis

  • The boundary of "in scope" vs "out of scope" is being refined


AI-Generated Content Disclaimer

CRITICAL REMINDER:

All reports are generated using Large Language Models (LLMs) and artificial intelligence. AI can and does make mistakes, including:

  • Hallucinations or inaccurate assessments

  • Misattribution of code or contributions

  • Incorrect dependency claims

  • Over- or under-estimation of risks

Always:

  • Review recommendations carefully

  • Verify critical information before taking action

  • Use the report as a starting point for investigation, not definitive guidance

  • Apply your team's context and expertise when evaluating recommendations


Best Practices

Regular Cadence

For in-flight epics, weekly or bi-weekly reports are most valuable:

  • Align with your sprint schedules

  • Track trends and catch issues early

  • Show trajectory and improvement over time

  • Enable comparison of risk evolution

Use Real Data

The report generates the most valuable insights when analyzing real project data with actual development activity:

  • Test data or inactive epics won't provide meaningful analysis

  • The AI learns from patterns in commits, PRs, ticket updates, and team behaviors, all of which require genuine development work

Provide Feedback

As you use the report, note what insights were helpful and which areas could be improved. Work with your Customer Success representative to share observations about:

  • Insights that led to valuable actions

  • Risks that weren't relevant to your context

  • Missing information you wished the report included

  • Formatting or presentation improvements

  • Accuracy of actor identification and role assignments

Integrate with Workflows

Don't let the report sit in isolation. Incorporate findings into:

  • Sprint planning and retrospectives

  • One-on-one meetings with team members

  • Stakeholder status updates and executive briefings

  • Technical design reviews and architecture decisions

  • Risk mitigation planning sessions

  • Backlog refinement and grooming sessions


Current Capabilities and Roadmap

Coming Very Soon

  • Flexible plain text output for rapid iteration and customization

  • Improved contextual awareness making individual item risk assessments more intelligent about overall project context

  • Enhanced actor discovery with better reconciliation of roles and contributions

  • Bug fixes for code linkage, master sync exclusion, and ticket type handling

Coming Soon

  • Historical learning: Deeper integration of previous report findings to track trends and changes more effectively

  • Metric integration: More sophisticated use of velocity, forecasting, and other key metrics in risk assessment

  • Multi-pass risk analysis: Identifying potential risks, gathering evidence, then confirming evidenced risks vs. speculative concerns

  • Source citations: Every risk finding will cite specific evidence (commit, PR, ticket update, etc.)

  • Workflow optimization: Streamlined processes for running reports and acting on findings

  • Additional risk dimensions: Enhanced code analysis capabilities

  • Improved scope handling: Better management of removed items, temporary links, and scope boundaries

Future Vision

  • Interactive AI assistant that can explain report findings and answer questions

  • Action tracking with validation of risk reduction

  • Configurable risk definitions to match organizational standards

  • Raw source code analysis (currently not included)


FAQ

Getting Started

Q: What type of tickets should I use for the Delivery Risk Report? A: Use higher-level work items that serve as containers for child tickets. Good candidates include epics, initiatives, features, and milestones. Avoid using simple labels (like "BUGS") or flat lists of unrelated items, as these won't provide meaningful analysis.

Q: Does the work item need to have child tickets? A: Yes, the Delivery Risk Report analyzes parent work items and their children. The parent must have associated child tickets, though it doesn't need to be complete. The more child tickets and associated work (commits, PRs) present, the more valuable the insights.

Q: Is there a minimum size where the report becomes valuable? A: The more data available, the more patterns and insights the analysis can find. Very new work items with minimal history are unlikely to yield actionable findings. In-flight epics with active development and some completed work provide the most valuable insights.

Q: Does the Delivery Risk Report respect workspace configurations? A: No. The Delivery Risk Report focuses on specific parent items (epics, initiatives) and everything connected to them, regardless of workspace settings.

Report-Specific Questions

Q: How long does it take to generate a report? A: Report generation time depends on the scope and amount of data being analyzed, but typically completes within a few minutes.

Q: What's the difference between Delivery Risk and Technical Risk? A:

  • Delivery Risk assesses whether an effort is likely to be delivered on time or face significant delays, focusing on timeline, velocity, and progress patterns

  • Technical Risk examines coding behaviors and habits to identify patterns where technical issues (quality, maintainability, architecture) could arise

Q: Why do many tickets have similar risk scores? A: Individual ticket risk scoring is being refined. Currently, relative comparisons between tickets and overall epic-level risk assessment are most reliable. This is a known limitation being actively addressed.

Q: Can code analysis be turned off? A: Not yet. Optional code-level analysis is coming soon, which will allow faster report generation and the ability to compare results with and without technical analysis.

Data and Analysis

Q: What data sources does the Delivery Risk Report analyze? A: The report analyzes:

  • The parent work item

  • All child tickets (with smart handling of ticket types and status)

  • Pull requests associated with those tickets

  • Commits linked to the work

  • Team contribution patterns and role assignments

  • Historical delivery patterns

It does NOT currently analyze raw source code.

Q: How are team members identified for action items? A: The system discovers contributors through:

  • Commit and PR activity

  • Ticket comments and updates

  • Review patterns

  • Team roster information (when available)

  • Reconciliation of stated roles with observed behavior

When confidence is low, assignments include confidence indicators or "best guess" qualifiers.

Q: How does the system handle different ticket types? A: The system is actively being enhanced to evaluate different ticket types with appropriate expectations:

  • Subtasks don't require extensive code review or QA

  • Quick-turnaround subtasks aren't flagged as concerning

  • Stories and epics have higher expectations for process rigor

  • Status-specific analysis adjusts expectations based on ticket lifecycle stage

Best Practices

Q: How often should I run delivery risk reports? A: For in-flight epics, weekly or bi-weekly reports are most valuable, aligned with your sprint schedules. This allows you to track trends and catch issues early. Regular cadence on the same epic shows trajectory and improvement.

Q: Can I run reports on multiple epics simultaneously? A: Currently, each report focuses on a single work item (epic or initiative) and analyzes all connected child tickets, PRs, and commits. This allows for deep, contextual analysis of that specific body of work.

Q: What should I do if I find errors or inconsistencies? A: Please report them to your Customer Success representative or to jeff.keyes@allstacks.com. Your feedback directly improves the system. Specific examples of issues help the team refine the AI's analysis patterns.

Q: How accurate is the AI analysis? A: The Delivery Risk Report is AI-generated and should be used as a starting point for investigation rather than definitive guidance. Always verify critical information and apply your team's context and expertise. The system is continuously improving based on real-world usage and feedback.


Getting Help

Customer Success Team

Your CS team can help you:

  • Set up your first reports

  • Interpret findings specific to your organization

  • Customize analysis parameters for your use case

  • Schedule regular reviews and check-ins

  • Guide best practices

  • Report issues and track resolutions

Feedback and Feature Requests

The Delivery Risk Report is actively being refined based on customer feedback. If you have suggestions, encounter issues, or want to discuss specific use cases, reach out to your customer success representative or submit feedback through the platform.

Direct Contact: For questions, feedback, or assistance with the Delivery Risk Report, contact:

We're here to help!