Alignment: Semantics in the AI/ML era
A common question we encounter is: what is different about AI/ML development today, especially after the advent of large-language models (LLMs)?
Beyond the obvious changes - increased adoption, hype cycles, and an influx of new engineers - one of the most profound shifts is the new level of semantics that models introduce into the coding process; a shift which has been termed alignment within the industry.
Semantics refers to the meaning of code, distinct from its form (syntax). In traditional development, semantics could often be verified by checking whether the code’s behaviour matched its intended function. Syntax errors were straightforward; semantic validation was often just a matter of confirming that the system “did the thing we asked”.
Historically, the relationship between syntax and semantics was tight. In lower-level languages like C, developers wrote libraries that ran directly on the hardware, with machine code operating in a tightly coupled way to the written source. As higher-level languages like Python and JavaScript emerged, we introduced virtual machines and runtime environments, stretching the abstraction further, but still largely within systems that we could fully inspect and control.
Now, with large models often accessed via APIs, we extend the abstraction even further. The model itself becomes a semi-autonomous participant: it interprets our inputs, executes complex behaviours, and produces outputs based on internal structures that are not fully visible to the developer.
In this environment, it is common to submit code that is both syntactically and semantically correct (by traditional standards), receive a syntactically and semantically valid response, and still find that the output is wrong.
Here, the syntax is correct; The immediate semantics (at the API boundary) are correct; Yet the goal semantics - the deeper alignment with what we intend the system to accomplish - are misaligned.
Model alignment is the AI concept that captures this. Often, it is discussed in the context of safety concerns (ensuring the model does not produce harmful or offensive content). But alignment issues also occur in much more subtle and practical ways. Some examples:
- A model tasked with processing radiology reports might focus globally rather than on a specified region.
- A customer support chatbot might answer generically instead of following brand-specific communication guidelines.
- A data extraction model might capture structured fields incorrectly, despite the input being clearly labeled.
- A summarization model for legal documents might omit critical clauses because it prioritizes brevity over risk coverage.
- A model assisting in financial forecasting might overfit to short-term trends, misaligning with strategic planning horizons.
- An AI used for candidate screening might bias toward superficial resume keywords rather than deeper role fit.
- A recommendation system might optimize for click-through rate, while the actual goal is long-term user trust.
In building AI-enabled systems today, engineers must go beyond syntax and surface-level semantic checks. They must actively verify:
- Model origins: Are the developers transparent about how the model was built?
- Model training data: Was our domain (or a closely related one) represented in the training?
- Model capabilities: Can we properly condition inputs to guide outputs?
- Model responses: Does the model’s actual behaviour match product or system goals - and can we adapt when it doesn’t?
Understanding and managing this new layer of semantics is essential for building reliable, goal-aligned AI systems. As abstraction deepens, so does the responsibility to test, verify, and align models not just to tasks, but to the true goals behind the tasks.
We’re always interested in hearing how others are approaching these challenges. If you’ve encountered alignment issues in your AI projects or have questions about managing them feel free to share or reach out.