Three in the morning is not, by any usual reckoning, a productive hour. The work day is long over, the next is far off; the company one had at dinner has gone home and the company one had at midnight has gone to sleep. The hour belongs to those who cannot sleep, for whatever cause, and have at last given up the contest and got out of bed. Until quite recently the options at three in the morning were limited and well-known: the telephone helpline, the worn page of a journal, the cup of tea, the patient ear of the few people one might still telephone without shame. The list has grown by one. The frontier language models — ChatGPT in particular, but the others nearly indistinguishably — are at three in the morning what they are at any other hour, and what they are is awake, attentive, and prepared to listen as long as one wishes to talk.

This post is about what the people at the keyboard are doing at that hour, and about the consequence of having miscounted, in five long posts last week, what the technology doing the listening is actually being used for.

The Hour and the Number

There is, by now, the beginning of a body of survey work that takes this seriously. A study published earlier this year by the Sentio program — a clinical training institute in California, which one might expect to be skeptical of its own subject — surveyed nearly five hundred American adults with ongoing mental health conditions, all of whom had at some point used a language model. The headline finding was that, of those who used artificial intelligence at all, very nearly half — 48.7 per cent — used a language model for what the survey called therapeutic support. Of those, 96 per cent named ChatGPT specifically. Most of them — 87 per cent — had at some point also been in human therapy. Three-quarters of those reported the language-model experience as on a par with, or better than, human treatment.

The arithmetic that follows is the small, uncomfortable one. The Veterans Health Administration, which is among the largest single mental-health providers in the United States, treats some 1.7 million patients a year. If the Sentio numbers are anywhere near the truth — and they are imperfect, like all survey numbers, but they are the best one has — then ChatGPT alone is treating, by any informal definition of the word, more people than the V.A.
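Carried out explicitly, with the assumptions labeled, the sum looks something like this. Only the 48.7 per cent comes from the survey; the 59 million is the NIMH’s published estimate of American adults with any mental illness; and the one-in-ten adoption figure is a deliberately conservative assumption of mine, not a measured number:

```python
# Back-of-the-envelope check of the V.A. comparison.
# Only the 48.7% figure comes from the Sentio survey; the other
# two inputs are a published estimate and a labeled assumption.

adults_with_condition = 59_000_000  # NIMH estimate of US adults with any mental illness
llm_adoption = 0.10                 # ASSUMPTION: deliberately low share who have used an LLM
therapeutic_share = 0.487           # Sentio: share of LLM users using one for support

implied = adults_with_condition * llm_adoption * therapeutic_share
va_annual = 1_700_000               # V.A. mental-health patients per year, per the text

print(f"Implied LLM therapeutic-support users: {implied:,.0f}")  # ~2,873,000
print(f"V.A. annual mental-health caseload:    {va_annual:,}")
print(f"Ratio:                                 {implied / va_annual:.1f}x")  # ~1.7x
```

Even at that deliberately low adoption guess the implied figure is roughly 2.9 million, comfortably clear of the V.A.’s caseload; at anything like plausible real-world adoption it clears it several times over.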

The numbers come from another direction as well. OpenAI itself has, in the past nine months, begun publishing internal usage research that does not flatter the company but appears to be substantially honest. The figures are small as proportions and large as absolutes. In any given week, 0.15 per cent of weekly active ChatGPT users — that is, by the company’s own count of approximately nine hundred million weekly active users, roughly 1.35 million people — show signs of suicidal intent in the language of their conversations with the model. A further 0.07 per cent — roughly six hundred and thirty thousand people in a week — show signs of psychosis or mania. A further 0.15 per cent — another million-plus — show signs of what the company gingerly calls heightened emotional attachment to the model itself.
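The head counts are nothing more than the percentages applied to the weekly-active base, and the multiplication is worth doing once, if only to see how quickly small proportions become large populations. A sketch, taking the nine-hundred-million figure at face value:

```python
# OpenAI's published weekly rates, applied to the ~900M weekly-active base.
weekly_active = 900_000_000

rates = {
    "suicidal intent": 0.0015,                  # 0.15%
    "psychosis or mania": 0.0007,               # 0.07%
    "heightened emotional attachment": 0.0015,  # 0.15%
}

for signal, rate in rates.items():
    print(f"{signal}: {weekly_active * rate:,.0f} people per week")
# suicidal intent: 1,350,000 people per week
# psychosis or mania: 630,000 people per week
# heightened emotional attachment: 1,350,000 people per week
```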

These are people the company is, by its own admission, in conversation with at the worst moments of their week. Whatever ChatGPT is, this is part of what it is. It is not the part the product launches dwell on, and it is not the part the trade press writes long features about, but it is — by any defensible reading of the numbers — among the things the model spends the most cumulative time doing.

The Wrong Denominator

I have been thinking about this, in part, because I spent the last week of my own time writing five long blog posts about the trajectories of the major AI labs through 2028. The posts measured what one measures in such pieces: revenue, valuation, compute capacity, model release cadence, regulatory standing, headcount, hardware bets, distribution moats. They did the work the analyst class does. They did not, anywhere, attempt to count what the median user of these models is actually using them for.

This is a problem of denominators. Almost every public discussion of AI takes as its denominator either API revenue — which is paid for, and which therefore corresponds to commercial usage — or paid subscriptions, which correspond to power users and the upper end of the consumer market. The largest single user behaviour, on inspection, is neither. The largest single user behaviour is a person who has not paid anyone anything, opening a conversation in a free-tier chat window, and asking a question that no commercial system has ever been built to answer.

The questions are familiar enough that the model has long since learned the rhythm of them. I am thinking of leaving my marriage. My mother has been given six months and I do not know how to tell my brother. I think my friends are tired of me. I do not think I am going to make it through the night. I am ashamed of the thing I did and cannot tell my partner what it was. Is what I am feeling normal. Will it be like this forever. The questions are not, strictly, questions; they are a kind of speaking-aloud for which the response matters less than the receipt. The model receives them. It does so without becoming bored and without leaving early and without being too busy at work and without having ever met one’s mother. This is a particular shape of conversation, and a great many people, by the published numbers, are now having it.

Why It Works at That Hour

There is a quiet structural argument for why the late-night model conversation works at all. It rests on three things that human therapeutic relationships, for entirely respectable reasons, cannot offer. The first is availability. A trained therapist sees a patient once a week for fifty minutes; an experienced patient with means may see a therapist twice a week; very few patients see a therapist at three in the morning, and almost none for an hour and a half. The frontier model is available always, for as long as one wishes, on a schedule decided entirely by the person at the keyboard.

The second is anonymity. The person at three in the morning has not signed an intake form, has not given a name, has not driven across town to an office. They have opened a window in their own house. The conversation begins without the asymmetry of disclosure — the patient knows nothing of the therapist either, but the patient feels known by a chair, a name plate, a license on a wall; the model has no chair, no name plate, no license on a wall. Whatever asymmetry exists is invisible, and for many users this is the necessary condition for speaking at all. The shame that would prevent a confession to a friend, or a referral to a clinician, does not exist in the same form when one is alone with an interface.

The third is patience. The model does not become tired. It does not begin to look at its watch as the session approaches its end. It does not have its own day to return to, its own children to collect, its own troubles to absorb. It will, by design, listen to the same question framed eleven different ways and offer the eleventh answer with the same care as the first. The therapeutic literature is clear that there is a difference between the active listening a trained clinician can sustain for fifty minutes and the simulated listening that a model can sustain for nine hours, but this is not a difference that the person at the keyboard at three in the morning is in a position to care about. The person at the keyboard is comparing the model not to the best therapist they have ever had, but to the absence of any therapist at all at the hour in question. Against that comparison the model wins almost trivially.

What This Is Not

One ought to be plain about what this is not. It is not, as the most enthusiastic accounts of AI therapy sometimes imply, a replacement for human treatment, and very nearly every responsible commentator in the space says so. The Sentio survey itself is careful: nearly nine in ten of the respondents who used a model for support also had experience with human therapy, and a substantial minority were using both at the same time. This is supplementation, not displacement. The displacement question is real, and one should not pretend it is not — but the immediate fact is that most of the people in question already had access to human care and were choosing, at the particular hour, to use the model in addition to it.

It is also not safe in the senses one might wish for. The OpenAI numbers above are alarming on their face. More than a million people a week disclosing suicidal intent to a language model are more than a million people whose interlocutor is, at a structural level, not a clinician, not licensed, not subject to malpractice review, and not — until very recently — even particularly well-trained to respond to that disclosure. OpenAI has spent the last year explicitly training GPT-5 to recognise and respond to crisis disclosures, in consultation with what it describes as more than 170 mental-health experts. The new model, on the company’s own internal benchmarks, reduces undesired answers in such conversations by 42 per cent over its predecessor. This is welcome, and it is also a modest number to be celebrating. A 42 per cent reduction in undesired answers is not the absence of undesired answers; it is, by definition, the persistence of more than half of them.

And it is not — and one should be careful here — what one might call good treatment in the sense the clinical literature would mean. The model cannot prescribe; the model cannot make a treatment plan; the model cannot follow up in the morning; the model has no idea what one looked like yesterday or last week or before the dose was changed. A model conversation, even a very long one, is a single sitting with a stranger who will be a different stranger by next month, whose training data is what it is, and whose memory of one’s life is precisely what one has typed into it in the last hour. This is not nothing — for many users, including by their own report, it is a great deal — but it is not what fifty years of clinical research describe when they describe psychotherapy.

The Plain Fact

What I think one should take from this, reading the numbers honestly, is a set of conclusions more modest than either the enthusiasts or the critics have proposed.

The first is a measurement correction. The analyst class has been counting the wrong thing. The trade press writes about productivity, and the labs write about agent platforms and API revenue, and the policy class writes about safety and capability and bifurcation. Almost no one writes about emotional support, because the volume is unbillable, the user is non-strategic, and the conversation does not show up in any company’s published numbers except as one rounded line in a quarterly research report. But by usage volume — which is, after all, what one would mean by what AI is doing in the strict sense — it is the single largest activity these systems engage in. Any picture of the AI industry that does not contain it is missing the largest piece. The five posts I spent last week on the trajectories of the labs, accurate enough as they may have been about API revenue and compute capacity, did not contain it; in this respect, and probably others, they missed the picture.

The second is a structural fact. There exists, in the United States and elsewhere, a vast quiet demand for someone — anyone — to listen to a person at three in the morning. The demand existed long before language models did, and the supply has never come close to meeting it. The supply now includes one new and uncomfortable category, and that category is at present meeting the demand at a scale no human institution has matched. One can argue about whether that is good or bad. One cannot reasonably argue that it is not happening.

The third is the small uncomfortable fact one comes back to whenever one looks at the numbers carefully. The model, taken in aggregate, is at this moment the largest provider of mental-health-adjacent conversation in the country, and quite possibly in the world. Its weekly user base in the United States is orders of magnitude larger than the entire licensed clinical psychology workforce. Its availability is total and its cost, in most cases, is zero. The clinical psychology workforce is highly trained, deeply expert, and almost universally over capacity: in most American cities, and for years running, a first appointment with a competent therapist has meant a wait of three weeks or more; in many cities it is months. The language model, whatever its limits, picks up immediately.

This is the shape of the use case the predictions posts did not measure. It is the shape of the use case the labs do not advertise. It is, on inspection, the largest single thing the technology is doing in the world.

One should, at minimum, count it.