What is AI psychology?

AI psychology is the science of designing, measuring, and assuring AI systems that make decisions about human beings. It applies psychometric validation, harm analysis, and lifecycle monitoring to any AI system whose outputs affect people: hiring tools, wellbeing platforms, mental-health chatbots, performance management systems, and AI coaching. The discipline sits between AI engineering and the people the AI touches, and asks one repeated question: is this system valid, fair, and defensible for the human beings on the receiving end?

How does AI psychology differ from AI ethics?

AI ethics is a values discussion. AI psychology is a measurement discipline. AI ethics asks whether a system should exist, what principles it should honour, and what trade-offs are acceptable. AI psychology asks whether the system actually measures what it claims to, whether it produces equivalent results across populations, and whether it stays valid after deployment. The two are complementary. Most commercial AI systems pass an ethics review but fail a psychometric audit, which is where the harm enters.

What is psychological safety in AI systems?

Psychological safety in AI systems means the system does not produce psychological harm at deployment scale. Three concrete failure modes count: collapsing a complex human construct into a brittle proxy, eroding the user's autonomy or independent judgement, and creating dependence that the user cannot exit. Every AI system that touches human decisions or wellbeing should have a named owner accountable for each of these three.

How do you audit an AI assessment for psychological safety?

Run the AI-IARA audit. The audit scores the system on six capacities (Awareness, Interpretation, Intention, Action, Relational Agency, Autonomy), produces a prioritised list of design weaknesses, and ties each weakness to an evidence requirement. The full methodology is in the 2026 paper in The Journal of Positive Psychology. The interactive audit at /ai-iara-audit walks any product team through it in about 15 minutes and produces a sharable risk dashboard.

Are AI-driven personality assessments legal under the EU AI Act?

Defer legal interpretation to counsel. What can be said factually is that the EU AI Act treats AI systems used in employment, education, and access to essential services as high-risk. High-risk AI systems require documented evidence of validity, fairness, and post-deployment monitoring. An AI personality assessment that has no psychometric validation, no measurement-invariance testing, and no drift monitoring will not produce the documentation the Act asks for. The compliance path is the audit-evidence path. They are the same path.

What is construct drift in deployed AI?

Construct drift is the gradual shift in what an AI system is actually measuring after deployment, even when the model weights are frozen. The cause is usually feedback contamination: the system shapes the behaviour of its users, the users feed new data back to the system, and the construct quietly migrates away from what it was validated on. A wellbeing tool that started measuring flourishing can end up measuring engagement with the tool itself. Drift is detectable. Most teams just are not looking for it.

How do you detect bias in AI hiring tools?

Bias detection in hiring AI requires three measurements at minimum: differential validity (does the score predict performance equally well across demographic groups), differential prediction (does the score over- or under-predict performance for any group), and measurement invariance (does the test mean the same thing across groups). Adverse impact ratios alone are insufficient and can hide real bias. The AI psychology audit produces all three plus a named owner for each.

What does psychometric validity mean for AI?

Psychometric validity for AI has the same five components as for any psychological measurement: construct (does the system measure what it claims), content (does it sample the construct adequately), criterion (does the score predict the outcome it should), discriminant (is the score distinct from measures of constructs it should not overlap with), and consequential (do the decisions taken from the score produce just outcomes). Most AI products marketed as wellbeing or assessment tools have evidence on at most one or two of these. A defensible system has evidence on all five.

Who is responsible when AI makes decisions about people?

A named human owner for each decision class. AI psychology rejects the diffusion-of-responsibility pattern where engineering points at the model, the model points at the data, and the data points at the user. For every decision an AI system can take about a person, an audit-defensible deployment names the human who reviews it, the threshold at which they are notified, the escalation path, and the rollback authority. Procurement contracts that do not specify these are a red flag.
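The four items named above can be captured as a simple record that is checked for gaps before deployment. This is a sketch under assumptions: the class name DecisionOwnership, its field names, and the example values are hypothetical and are not taken from the AI-IARA paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DecisionOwnership:
    """Who answers for one class of decision an AI system can take."""
    decision_class: str
    reviewer: str
    notify_threshold: str
    escalation_path: str
    rollback_authority: str


def missing_fields(record):
    """Names of any fields left blank, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Hypothetical example with the rollback authority left unspecified.
shortlisting = DecisionOwnership(
    decision_class="candidate shortlisting",
    reviewer="Head of Talent Acquisition",
    notify_threshold="any automated rejection",
    escalation_path="HR director, then legal",
    rollback_authority="",
)

print(missing_fields(shortlisting))  # ['rollback_authority']
```

A check like this can run against a procurement contract or deployment checklist, so a decision class with no named owner is caught before launch.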

How is AI psychology different from behavioural science or persuasive technology?

Behavioural science studies how humans behave. Persuasive technology designs systems that change behaviour. AI psychology studies the measurement and decision systems that AI produces about humans, and assures those systems for validity, fairness, and contestability. The three fields touch but the audit object is different: behavioural science audits a human, persuasive technology audits a behaviour-change intervention, AI psychology audits a measurement-and-decision system. People-impact AI needs all three; AI psychology is the layer most often missing.

What is the difference between AI psychology and AI governance?

AI governance is the operating model that makes responsible AI possible: roles, policies, escalation paths, audit cadences. AI psychology is one of the audit disciplines that AI governance applies. A mature AI governance programme has psychometric validation, fairness testing, drift monitoring, and contestability mechanisms in its toolkit, and it calls those tools when the AI system in scope makes decisions about people. AI psychology is governance with measurement teeth.

Where can I learn more or get an audit?

Three places. The 2026 paper in The Journal of Positive Psychology defines the AI-IARA framework formally. The interactive audit at /ai-iara-audit walks a product team through their own system in about 15 minutes and produces a risk report. For a full audit, contact via the offerings page; engagements run four to twelve weeks depending on scope and produce an audit-ready evidence pack.

AI Psychology: The Science of AI That Decides About People

AI Psychology

AI psychology is the science of designing, measuring, and assuring AI systems that make decisions about human beings. It treats people-impact AI as a measurement problem first and a technology problem second. The work asks four questions before a system is allowed near a hiring panel, a wellbeing programme, or a clinical workflow: does the system perceive the right context, does it interpret that context correctly, does it act with proportionate authority, and does it preserve the user's autonomy and social agency? When any of those fail, the system harms people quietly and at scale. AI psychology names the failure modes, audits them, and turns the gaps into a defensible fix list.

AI psychology is not AI ethics, AI governance, or behavioural science. It is the measurement discipline behind all three when the decisions are about people.

The Method

AI-IARA. Six capacities every people-impact AI must demonstrate.

The AI-IARA framework (Awareness, Interpretation, Intention, Action, Relational Agency, Autonomy) is the canonical methodology this site teaches and applies. Each capacity surfaces a specific class of audit signal. Together they form the validity stack that determines whether an AI system that decides about people will hold up under scrutiny.


AI-IARA Framework: Awareness, Interpretation, Intention, Action, Relational Agency, Autonomy

The Validity Stack

Five layers of audit evidence

Every AI psychology audit produces evidence at five layers. A system that passes one layer but fails another is not deployable. The layers are sequential, not optional.

Construct: What does it claim to measure?

Calibration: Does the score mean the same thing across people?

Cohort: Does it work for the people it will touch?

Drift: How does it fail after launch?

Contestability: Can the person decided about push back?

Step	Title	Description
01	Construct	Define the construct the system claims to measure or decide on, in language that an independent psychometrician can review. Wellbeing, engagement, fit, risk, and burnout are not interchangeable. The construct must be named, scoped, and tied to a published theoretical model. Without this layer the rest of the audit has nothing to anchor on.
02	Calibration	Test whether the system's scores are equivalent across the populations who will be measured. A 70 in one demographic should mean what a 70 means in another, or the system is not measuring what it says. This is the layer where most commercial wellbeing and assessment tools fail and never get re-audited.
03	Cohort	Validate the system in samples that match the deployment population, not just the convenience sample it was trained on. Differential validity, measurement invariance, and floor and ceiling effects all live here. If the deployment population has not been tested, the deployment is uncontrolled.
04	Drift	Specify what proxy collapse, construct drift, and feedback-loop contamination look like for this system, with thresholds that trigger pause or rollback. Most people-impact AI degrades silently because no one is watching for it. The drift layer puts named owners and concrete signals on the watch.
05	Contestability	Specify how a human subject of the AI system can see, question, and appeal a decision. Contestability is the audit layer that converts measurement validity into procedural fairness. Without it the system is not deployable in a high-stakes setting in any jurisdiction with meaningful AI regulation.
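
The sequential reading of the five layers can be sketched as a simple gate. This is an illustration only: it assumes audit evidence is summarised as one pass/fail flag per layer, and the names LAYERS, first_failing_layer, and deployable are hypothetical.

```python
# Layer order follows the table above.
LAYERS = ["construct", "calibration", "cohort", "drift", "contestability"]


def first_failing_layer(evidence):
    """Return the first layer, in stack order, that lacks passing evidence.

    A layer missing from the evidence counts as failed. Returns None
    when every layer passes.
    """
    for layer in LAYERS:
        if not evidence.get(layer, False):
            return layer
    return None


def deployable(evidence):
    """A system is deployable only if all five layers pass."""
    return first_failing_layer(evidence) is None


# Hypothetical audit summary: calibration fails and drift was never assessed.
evidence = {
    "construct": True,
    "calibration": False,
    "cohort": True,
    "contestability": True,
}

print(first_failing_layer(evidence))  # calibration
print(deployable(evidence))           # False
```

Treating a missing layer as a failure reflects the rule that the layers are not optional: an unassessed layer blocks deployment just as a failed one does.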

Common questions about AI psychology

Proof Stack

The Authority Behind This Page

Every claim on this page is anchored in two or more independent proof types: peer-reviewed publications, third-party speaking engagements, formal standards, and named institutional roles.