Leveraging AI for Legacy Code Migration and Specification Validation: A Practical Guide
Overview
Recent retreats among software professionals have surfaced transformative insights about the future of development, particularly around agentic programming, in which AI agents carry out development tasks autonomously. This guide distills key lessons from one such gathering, held under the Chatham House Rule, into actionable steps for modernizing legacy systems, validating specifications, and taking a pragmatic approach to code porting. You'll learn how Large Language Models (LLMs) can accelerate migration, how to use AI to verify complex documents, and why organizational change-control boards hold vital historical context.

Prerequisites
- Basic familiarity with software development lifecycles and legacy system challenges.
- Understanding of what LLMs are and their capabilities (e.g., code generation, question answering).
- Access to an LLM platform (e.g., OpenAI, open-source models) for practical experimentation.
- Knowledge of test-driven development or regression testing practices.
Step-by-Step Instructions
1. Using LLMs for Code Porting
One team at the retreat created a behavioral clone of the GnuCOBOL compiler in Rust: 70,000 lines in just three days. This demonstrates how LLMs can efficiently translate codebases between languages while preserving behavior. To replicate this success:
- Identify a target source codebase with well-defined behavior (e.g., a compiler or library).
- Gather or create a robust regression test suite. If the existing project lacks tests, and you have access to an older implementation, generate tests by comparing outputs for various inputs.
- Use an LLM to port code modularly. Break the code into functions or classes, then ask the LLM to translate each unit. For example, prompt: "Translate this COBOL function to Rust, preserving behavior exactly."
- Run regression tests after each unit. This catches errors early and builds confidence in the new code.
- Iterate on failures. Feed failing test cases back to the LLM with context to fix translations.
Hypothetical code example:

```rust
// Original COBOL statement: ADD 1 TO WS-COUNT.
// LLM-generated Rust equivalent:
ws_count += 1;
```
The key is that regression tests are extremely valuable—they serve as both verification and a safety net.
2. Speeding Up Specification Verification with an Interrogatory LLM
Large specification documents are hard for humans to review comprehensively. A clever method shared at the retreat flips the process: have the LLM interview a human expert to validate correctness. This 'Interrogatory LLM' approach works as follows:
- Feed the specification into an LLM with instructions to act as a curious auditor.
- Prompt the LLM to generate questions that probe ambiguous, incomplete, or contradictory sections. For example: "In section 4.1, the timeout value is 30 seconds, but section 5.2 implies 60 seconds. Which is correct?"
- Have a domain expert answer the generated questions. This is far more efficient than asking the expert to read the entire spec.
- Update the spec based on answers and re-run the LLM to confirm consistency.
This method reduces manual review time and catches oversights that human reviewers often miss.
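A minimal sketch of how the interrogatory setup might be wired up. The role text, chunking, and question limit here are assumptions rather than a prescribed format; the function only assembles the prompt, which you would then send to whatever LLM API you use:

```python
# Build the 'curious auditor' prompt for the Interrogatory LLM workflow.
AUDITOR_ROLE = (
    "You are a curious auditor reviewing a specification. "
    "Generate pointed questions about anything ambiguous, incomplete, "
    "or contradictory, citing section numbers where possible."
)

def build_interrogation_prompt(spec_text, max_questions=10):
    """Assemble the prompt asking the model to interview a human expert."""
    return (
        f"{AUDITOR_ROLE}\n\n"
        f"Specification:\n{spec_text}\n\n"
        f"Produce at most {max_questions} questions, one per line, "
        "ordered by how likely each issue is to cause a defect."
    )
```

Record the expert's answers alongside each question, fold them into the spec, and re-run the same prompt on the revised text until the model stops surfacing substantive contradictions.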
3. Understanding Organizational History via Change-Control Boards
One experienced consultant starts every engagement by reading the guidelines of the client's change-control board. These guidelines are the 'scar tissue' of past failures—they reveal what went wrong and why processes exist. To apply this:
- Request the change-control documentation early in any modernization project.
- Analyze rules and restrictions. For instance, if every change requires three approvals, there may have been a past incident with unvetted code.
- Interview team members to connect guidelines to historical events. Ask: "What incident led to this rule?"
- Use this insight to design a migration plan that respects valid concerns while streamlining where possible.
The underlying lesson: understanding why things are the way they are is essential to making sound technical decisions.
4. Rethinking Lift-and-Shift as the First Step in Legacy Migration
Traditionally, 'lift and shift'—porting a legacy system to a new platform with unchanged features—was criticized for missing the opportunity to rationalize bloated code. However, the retreat highlighted a new perspective enabled by LLMs:
- Step 1: Lift and shift using LLMs for fast, cheap porting. The cost is now much lower than before, and a new platform (e.g., cloud, modern runtime) makes future changes easier.
- Step 2: Don't stop there. Once on the new platform, analyze usage metrics to identify unused features (Standish Group in 2014 found about 50% of features are unused). Remove or replace them based on current user needs and business outcomes.
- Step 3: Prioritize. Instead of preserving all old business processes, focus on what users actually need.
Important: The LLM-assisted lift-and-shift should include comprehensive testing—ideally from an existing test suite—to ensure behavioral parity. After that, you can safely evolve the system.
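Step 2 above can be sketched as a simple telemetry analysis. This assumes a hypothetical event format in which each usage event is just a feature name; your platform's actual metrics export will differ:

```python
# Mine usage telemetry for candidate dead features after lift-and-shift.
from collections import Counter

def unused_features(all_features, usage_events, min_hits=1):
    """Return features invoked fewer than `min_hits` times, sorted by name."""
    hits = Counter(usage_events)
    return sorted(f for f in all_features if hits[f] < min_hits)
```

Features flagged here are candidates for removal, not automatic deletions; validate each one against current user needs and business outcomes before retiring it.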
Common Mistakes
- Skipping regression tests during code porting. Without them, you cannot trust LLM output, leading to subtle bugs.
- Treating lift-and-shift as the end goal. Just moving to a new platform without later optimization wastes the opportunity to reduce technical debt.
- Ignoring business context. Even with AI, you must involve domain experts to validate specs and prioritize features.
- Overlooking change-control rules. These are not bureaucratic hurdles; they encode lessons from past failures.
- Assuming LLMs understand your specific domain without verification. Always test generated code against real requirements.
Summary
This guide has shown how insights from a professional retreat can reshape your approach to legacy modernization. By leveraging LLMs for fast code porting, using interrogatory techniques for spec review, learning from change-control histories, and adopting a two-step lift-and-shift strategy, you can modernize more efficiently while avoiding common pitfalls. The future of agentic programming offers remarkable new capabilities—but they must be paired with rigorous testing and human oversight to succeed.