u3a

Macclesfield

Sub-group Meeting 2nd February 2026

Sam Williamson led the meeting.

Sam thought it went quite well: we talked for over two hours, which I take as a good sign. Five people were present. Here are the notes used:

Basic description of How AI Works

1. The Core Concept: Pattern Recognition

Unlike traditional software that follows rigid "if-then" rules, AI learns from experience. 

  • Traditional Programming: You write a million rules for a million scenarios.
  • Artificial Intelligence: You show the computer a million examples, and it identifies the underlying patterns to make its own decisions. 
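The contrast can be sketched in a few lines of Python. This is a toy illustration (the example messages and the word-scoring scheme are invented for the sketch, not taken from any real system):

```python
# Toy contrast between a hand-written rule and a pattern learned from examples.

# Traditional programming: a human writes the rule explicitly.
def rule_based_is_spam(message):
    return "free money" in message.lower()

# "Learning": score each word by how often it appears in spam vs. normal
# examples, then judge new messages by the words they contain.
def learn_spam_words(examples):
    scores = {}
    for message, is_spam in examples:
        for word in message.lower().split():
            scores[word] = scores.get(word, 0) + (1 if is_spam else -1)
    return scores

def learned_is_spam(message, scores):
    return sum(scores.get(w, 0) for w in message.lower().split()) > 0

examples = [
    ("win free money now", True),
    ("claim your free prize", True),
    ("meeting notes attached", False),
    ("agenda for the meeting", False),
]
scores = learn_spam_words(examples)
print(learned_is_spam("free prize inside", scores))  # flags a phrase no rule mentioned
```

No one wrote a rule about "prize", yet the learned scores catch it, because the pattern was present in the examples.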

2. The Two Main Phases: Training vs. Inference

Think of this as the difference between a student studying for an exam and then actually taking it. 

  • Phase 1: Training (The "Study" Phase): This is highly resource-intensive and can take days or weeks. The model "learns" by analyzing massive datasets (text, images, or sensor data) and adjusting its internal parameters (weights and biases) to minimize errors.
  • Phase 2: Inference (The "Work" Phase): This is the execution stage where the value is realized. The trained model is applied to brand-new, unseen data to generate instant predictions, such as a self-driving car identifying a stop sign on a road it has never been on before. 
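The student/exam analogy can be made concrete with a one-parameter model (a deliberately tiny sketch; real training adjusts billions of parameters, but by the same basic error-reduction loop):

```python
# Minimal sketch of the two phases, using a one-parameter model y = w * x.

# Phase 1: Training - adjust the weight w to reduce error on known examples.
data = [(1, 2), (2, 4), (3, 6)]      # inputs with known answers (here y = 2x)
w = 0.0
for _ in range(200):                 # many passes over the dataset
    for x, y in data:
        error = w * x - y            # how wrong is the current guess?
        w -= 0.05 * error * x        # nudge w to shrink the error

# Phase 2: Inference - apply the trained model to brand-new, unseen input.
print(round(w * 10))                 # prediction for x = 10, never seen in training
```

After training, w settles near 2, so the model answers 20 for an input it has never encountered.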

3. The "Engine": Neural Networks

Modern AI uses Neural Networks, which are computational models inspired by the human brain. 

  • Input Layer: Receives the raw data (e.g., pixels of an image).
  • Hidden Layers: These are the "black box" where complex math happens. Each layer extracts more abstract features—for an image, the first layer might find edges, the next shapes, and the last objects like a face.
  • Output Layer: Delivers the final result, such as a probability score (e.g., "95% chance this is a dog"). 
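The three layers can be sketched in plain Python (the weights here are made-up numbers chosen for illustration; a real network learns them during training):

```python
import math

# A layer turns its inputs into new numbers: one weighted sum per neuron.
def layer(inputs, weights):
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def sigmoid(score):
    # squashes any score into a probability between 0 and 1
    return 1 / (1 + math.exp(-score))

pixels = [0.9, 0.1, 0.8]                         # input layer: raw data
hidden = layer(pixels, [[1, -1, 0], [0, 1, 1]])  # hidden layer: extracted features
score = layer(hidden, [[2, 1]])[0]               # output layer: one final score
probability = sigmoid(score)
print(f"{probability:.0%} chance this is a dog")
```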

4. How Generative AI Specifically Works

Generative AI (like ChatGPT) takes this a step further by predicting the next "token" (part of a word) based on a massive corpus of data. 

  • Sequence Prediction: It doesn't "know" the truth; it sequentially generates the most likely continuation of a prompt based on probability patterns it learned during training.
  • Human Alignment: Most models undergo a final step called Reinforcement Learning from Human Feedback (RLHF) to make the outputs more helpful and safer for business use. 
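Next-token prediction can be demonstrated with simple word counts (a toy bigram model; real systems use sub-word tokens and billions of parameters, but the "most likely continuation" principle is the same):

```python
# Toy next-token prediction: count which word most often follows each
# word in a tiny "training corpus", then continue a prompt from the counts.
corpus = "the cat sat on the mat the cat ate the fish".split()

follows = {}
for current, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(current, {})
    follows[current][nxt] = follows[current].get(nxt, 0) + 1

def most_likely_next(word):
    # pick the most frequent continuation seen during "training"
    return max(follows[word], key=follows[word].get)

print(most_likely_next("the"))  # the model outputs probability, not "the truth"
```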

5. Critical Corporate Considerations

  • Data Quality: AI is only as good as its training data. Biased or poor-quality data leads to inaccurate outputs and "hallucinations".
  • Infrastructure: Training requires expensive high-performance hardware like GPUs.
  • Explainability: As models become "deeper," it becomes harder to retrace how they reached a specific conclusion, creating a need for Explainable AI (XAI) to build stakeholder trust. 



How is AI validated?

AI validation is the process of confirming that a model performs accurately, reliably, and fairly under real-world conditions. Unlike traditional software, AI requires a continuous lifecycle of checks because its behaviour can change over time as data evolves. 

1. Technical Performance Validation

These methods test if the AI's internal "engine" is working correctly: 

  • Holdout Validation: Splitting your data so the model is trained on one part and tested on a completely separate "unseen" set.
  • K-Fold Cross-Validation: Dividing data into multiple parts (folds) and rotating which one is used for testing to ensure the model generalizes well and hasn't just "memorized" the training data.
  • Stress & Scenario Testing: Intentionally feeding the model "noise," rare edge cases, or extreme values to see if it breaks. 
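The first two splits can be sketched as follows (a toy illustration on ten numbered examples; a real pipeline would train and score a model inside the loop):

```python
# Toy sketch of holdout and 5-fold splits on a dataset of 10 examples.
data = list(range(10))

# Holdout: train on the first 8 examples, test on the unseen last 2.
holdout_train, holdout_test = data[:8], data[8:]

# 5-fold cross-validation: split into 5 folds; each takes a turn as the test set.
k = 5
folds = [data[i::k] for i in range(k)]
splits = []
for fold in folds:
    train_part = [x for x in data if x not in fold]
    splits.append((train_part, fold))  # a real pipeline would train and score here

print(len(splits))  # 5 rotations, so every example is tested exactly once
```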

2. Operational & Security Validation

This ensures the model is safe for a production environment: 

  • Bias Audits: Testing outcomes across different demographic groups (e.g., race, gender, age) to ensure equitable treatment and compliance with anti-discrimination laws.
  • Adversarial Red-Teaming: Proactively attacking the model with malicious inputs to identify vulnerabilities, such as prompt injections or data leakage.
  • Explainability Audits: Using tools like SHAP or LIME to "open the black box" and verify that the model is making decisions for the right reasons, not just coincidental patterns. 

3. Business & Regulatory Validation

For corporate stakeholders, validation focuses on impact and compliance: 

  • Human-in-the-Loop (HITL): Having subject matter experts (SMEs) review and score AI outputs for accuracy and "groundedness".
  • Regulatory Frameworks: Adhering to standards like the EU AI Act and the NIST AI Risk Management Framework, or industry-specific rules like FDA/GxP for healthcare.
  • A/B & Shadow Testing: Running the new AI in parallel with existing systems to compare real-world outcomes before fully switching over. 

4. Continuous Monitoring (Post-Deployment)

Validation doesn't end at launch. Teams must use observability tools to track: 

  • Model Drift: Detecting when the AI's performance starts to drop because the real-world data has changed since it was trained.
  • Hallucination Rates: Measuring how often generative AI creates false or nonsensical information. 
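A minimal sketch of a drift check (the values and the threshold are invented for illustration; real observability tools use statistical tests, but the idea of comparing live data to the training baseline is the same):

```python
# Toy drift check: compare the average of live data with the training-time average.
train_values = [10, 11, 9, 10, 10]   # what the model saw during training
live_values = [15, 16, 14, 15, 15]   # what it sees in production now

baseline = sum(train_values) / len(train_values)
current = sum(live_values) / len(live_values)

drifted = abs(current - baseline) > 2   # hypothetical alert threshold
print(drifted)  # the real-world data has shifted since training
```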
List of Useful AI Tools

Modern AI tools are categorized by their specific utility in professional workflows, ranging from deep technical research to high-level corporate reporting.

1. Development & Engineering

  • GitHub Copilot: A leading tool for real-time code assistance and autocomplete.
  • Cursor: An AI-integrated development environment (IDE) that many developers prefer for deep coding tasks.
  • Tabnine: A pioneer in AI-powered development, recently awarded for its enterprise-grade software tools.
  • Claude: Frequently cited by developers as the top AI assistant for coding due to its strong logical reasoning.

2. Data Analysis & Visualization

  • ThoughtSpot: Named a 2025 leader by Gartner for its search-driven analytics and AI-augmented dashboards.
  • Power BI (Microsoft): Includes Copilot for conversational data exploration and automated report generation within the Microsoft ecosystem.
  • Julius AI: Best for teams needing code-free insights and automated data visualizations from spreadsheets or databases.
  • Tableau (Salesforce): Uses Einstein AI to offer predictive analytics and natural language explanations for data patterns.

3. Research & Verification

  • Perplexity AI: A cutting-edge search engine that provides direct answers with inline citations for transparent research.
  • Consensus: An AI tool that synthesizes findings specifically from peer-reviewed academic papers.
  • Sourcely: Designed for academic source verification, allowing searches using entire paragraphs of text rather than just keywords.
  • Scite: Used to verify claims by showing whether a specific scientific citation supports or disputes a research paper's findings.

4. Marketing & Content Operations

  • Jasper AI: A versatile tool for scaling content creation while maintaining a consistent brand voice.
  • Canva AI: Continues to democratize design with its Magic Studio for AI-generated layouts and assets.
  • Synthesia: A specialized tool for creating AI-generated videos using realistic avatars for sales and e-learning.
  • Surfer SEO: Automates content audits and real-time optimization to help content rank better in search engines.

Some News Items

In early 2026, the technical landscape of AI has shifted from "chatting" with large models to building autonomous agentic systems and energy-efficient hardware architectures.

1. The Rise of Agentic AI & Multi-Agent Orchestration

AI has evolved from a passive tool into a "digital colleague" capable of autonomous, multi-step execution.

  • Agentic Workflows: Systems now independently set goals, plan tasks, and self-correct without constant human prompting. Gartner predicts that by the end of 2026, 40% of enterprise applications will embed these agents.
  • Multi-Agent Systems (MAS): Instead of one monolithic model, teams are using orchestrated "squads" of specialized agents (e.g., a "Researcher" agent, a "Coder" agent, and an "Analyst" agent) that collaborate to solve complex problems.
  • Standardized Protocols: New industry standards like the Model Context Protocol (MCP) and Agent-to-Agent (A2A) allow agents from different vendors (like Microsoft and Anthropic) to communicate and share tools seamlessly.

2. Reasoning-First Models & Self-Verification

Research has pivoted from sheer scale to improving logical reasoning and reliability.

  • Self-Verification: Models are now being equipped with internal feedback loops to "auto-judge" their own work, identifying and fixing hallucinations before delivering a result.
  • Advanced Reasoning: 2025 saw breakthroughs where AI achieved gold-medal performance in the International Mathematical Olympiad, demonstrating an ability to generate novel proofs rather than just pattern-matching.
  • Long-Term Memory: New architectures for episodic and semantic memory allow agents to learn from past interactions and maintain context over months rather than just a single conversation.

3. The Hardware & Energy "Squeeze"

As data centers face an energy crisis, innovation is focused on computational density and sustainability.

  • NVIDIA "Rubin" Architecture: Unveiled in early 2026, the Rubin platform (H300 GPUs) offers a 10x reduction in inference costs and is specifically built to handle trillion-parameter models with extreme power efficiency.
  • Sovereign AI & Local Inference: Countries like India and the UK are building national "superfactories" (e.g., the Isambard-AI supercomputer) to host their own data and reduce reliance on foreign clouds.
  • Sustainable Power: Big tech is bypassing grid congestion by investing in Small Modular Reactors (SMRs) and fusion research to power AI data centers "behind-the-meter".

4. Applied "Physical AI"

AI is moving beyond the screen and into the material world through Physical AI.

  • Robotic Foundation Models: Large-scale models trained in simulation are enabling robots with human-like dexterity for manufacturing and elder care.
  • Augmented Sensory Tools: Updates to AI-powered glasses (like Meta's Hear Better
Not used but guided my thinking:

To lead a high-quality technical discussion on AI, it's best to pivot between the "internal" engineering challenges and the "external" societal impacts.

1. Architectural & Engineering Challenges

  • The Scalability Dial: Discuss the unresolved debate on whether simply increasing scale (compute and data) will continue to yield intelligence gains, or if we are hitting a "diminishing returns" wall.
  • Neural Cellular Automata: Explore niche but growing topics like self-organizing systems and self-assembling neural networks that mimic biological growth.
  • Explainability (XAI): Analyze the black box problem: how do we build transparency into models where even the creators cannot fully explain a specific decision?
  • Data Integration: Tackle the logistical reality that 80% of AI training is data preparation, including handling "garbage" data and integrating fragmented sources.

2. Emerging AI Paradigms

  • Agentic AI: Move the conversation from "chatbots" to "agents": systems that can autonomously execute multi-step tasks rather than just predicting the next word.
  • Generative Adversarial Networks (GANs): Discuss the competitive tension between the Generator (creating data) and the Discriminator (identifying fakes).
  • Organoid Intelligence: Introduce the cutting-edge research area of biological computing using brain-cell-derived structures.

3. Ethical & Operational Risks

  • Adversarial Attacks: Debate how simple physical alterations, like stickers on a stop sign, can catastrophically fool computer vision systems.
  • Value Alignment: Tackle the "grandma in the pharmacy" problem: how do we prevent AI from taking requests too literally and causing harm through efficiency (e.g., a car driving dangerously fast because it was told to get there "as fast as possible")?
  • Environmental Cost: Discuss the paradox that while AI can optimize energy, data centers are projected to consume 3–8% of global electricity by 2030.

4. Applied Industry Trends

  • AI in Code Auditing: Discuss using AI to find bugs or audit questions for fairness and compliance in technical hiring.
  • AI-Optimized Hardware: The shift toward GPUs and customized appliances designed specifically for high-performance computational workloads.

Here is a link kindly provided by David Burnham:

AI has supercharged scientists—but may have shrunk science https://www.science.org/content/article/ai-has-supercharged-scientists-may-have-shrunk-science?utm_source=sfmc&utm_medium=email&utm_content=alert&utm_campaign=SCIeToc&et_rid=325103282&et_cid=5855546

A recent article from CACM by Peter Denning