The Universal Language of Facial Expressions
Darwin's Hypothesis and Ekman's Evidence
The idea that emotional expressions might be universal goes back to Darwin. In The Expression of the Emotions in Man and Animals (1872), Darwin argued that facial expressions — like other biological features — had evolutionary origins and would therefore be consistent across members of the same species. He gathered reports from correspondents across cultures, accounts from naturalists, and his own observations of infants. His conclusion: human emotional expressions were biologically grounded and broadly universal.
The scientific establishment largely ignored this for nearly a century, as behaviorism and social constructionism dominated the field. The dominant model by the mid-20th century held that emotional expression was a learned behavior, shaped by culture. Different cultures, different rules. No transcendent language of the face.
Paul Ekman set out to test this. In his early work in the 1960s, he showed photographs of emotional expressions to people in literate, Western-influenced cultures and found high cross-cultural agreement in emotion recognition. Critics dismissed this: people who had been exposed to American movies and television had learned American expressions. The finding proved nothing about biology.
So Ekman went to the Fore people of Papua New Guinea — a group who had been largely isolated from Western contact and media. He showed them photographs of emotional expressions and asked them to match the photographs to stories describing emotional situations. The results: Fore people matched expressions to emotional situations in the same pattern as Western samples, with one partial exception (they had difficulty distinguishing fear from surprise, which turned out to replicate in some other cross-cultural comparisons as well). He then filmed the Fore making expressions in response to stories and showed those films to American college students, who identified the emotions accurately.
The conclusion Ekman drew: six emotions — happiness, sadness, anger, fear, disgust, and surprise — were expressed and recognized universally, by people who had no shared cultural history of contact. The expressions were biologically determined, not culturally learned.
This research was enormously influential and became the basis for decades of applied work in forensics, clinical psychology, security, and animation.
The Neuroscience of Face Processing
The human visual system gives faces extraordinary priority. The fusiform face area (FFA), identified by Nancy Kanwisher and colleagues in the late 1990s using fMRI, is a region in the fusiform gyrus of the temporal lobe that responds selectively and robustly to faces. It activates for faces faster and more strongly than for comparably complex non-face images. Damage to this region produces prosopagnosia — the inability to recognize faces, even of loved ones — while leaving other visual recognition largely intact.
The specificity is remarkable. Infants as young as a few hours old preferentially orient toward face-like stimuli over matched non-face stimuli. The amygdala processes the emotional content of faces extremely rapidly — studies using backward masking (showing a face for milliseconds before masking it with another image so it never enters conscious awareness) find that the amygdala still registers fear faces subliminally. You respond to a fearful face before you know you've seen it.
This is not a learned response. It is wired in.
Beyond static recognition, humans are exquisitely sensitive to dynamic facial information. The Facial Action Coding System (FACS), which Ekman developed with Wallace Friesen and published in 1978, codes facial movements into Action Units (AUs) — each corresponding to the contraction of a specific muscle or muscle group. FACS comprises 44 Action Units plus additional codes for head movement and eye behavior. Researchers trained in FACS can code every visible muscular movement on the face with high inter-rater reliability, providing a language for describing facial expression that is more precise than any ordinary verbal description.
The utility of FACS for basic research is substantial — it allows researchers to describe expressions without categorizing them by emotion, which keeps the coding independent of theoretical assumptions. It has been used to document facial expressions in infants (who cannot report their emotional states), in clinical populations, in cross-cultural comparisons, and in real-time analysis of social interaction.
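The logic of FACS can be made concrete with a small sketch. The Action Unit numbers and names below (inner brow raiser, cheek raiser, lip corner puller, and so on) are standard, well-documented examples from the published FACS manual; the code itself is purely illustrative and is not part of any actual FACS software. It shows the key design idea the text describes: expressions are coded as sets of muscle movements first, and only then (optionally) mapped to interpretive labels such as the Duchenne smile (AU6 + AU12).

```python
# Illustrative sketch of FACS-style coding, assuming a few well-known
# Action Units. The AU numbers/names are standard; the code is a toy.

ACTION_UNITS = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    4: "brow lowerer",
    6: "cheek raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
}

# A few commonly cited AU combinations (illustrative, not exhaustive).
COMBINATIONS = {
    frozenset({6, 12}): "Duchenne (felt) smile",
    frozenset({12}): "social smile (lip corners only)",
    frozenset({1, 4, 15}): "sadness-related brow/mouth pattern",
}

def describe(observed_aus):
    """Map a set of coded AUs to a known label plus each AU's muscle action.

    The two-step structure mirrors the point in the text: the AU coding
    itself is theory-neutral; emotion labels are a separate, optional layer.
    """
    label = COMBINATIONS.get(frozenset(observed_aus), "uncoded combination")
    parts = [f"AU{n}: {ACTION_UNITS.get(n, 'unknown')}" for n in sorted(observed_aus)]
    return label, parts

label, parts = describe({6, 12})
print(label, "|", "; ".join(parts))
```

The separation between the AU dictionary and the combination table is the point: a coder can record AU6 + AU12 without committing to any claim about what the person is feeling.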
Micro-Expressions: The Involuntary Signal
One of Ekman's most practically consequential findings is the phenomenon of micro-expressions: brief, involuntary facial expressions that leak genuine emotional states before deliberate control can suppress them. Duration is typically 1/25th to 1/5th of a second — below the threshold of conscious control and, without training, below the threshold of conscious detection.
Micro-expressions occur when someone is trying to conceal or mask an emotional state. The genuine expression breaks through briefly before the performed expression (or neutral expression) reasserts itself. Ekman documented this first in footage of patients who were attempting to conceal distress during psychiatric interviews, and subsequently in studies of deception.
The practical implications are significant: micro-expressions are detectable with training. Ekman's Micro Expression Training Tool (METT) and Subtle Expression Training Tool (SETT) improve detection rates. Law enforcement agencies, including Customs and Border Protection, have used FACS-based training. However, the translation to applied deception detection is more complicated than Ekman's early claims suggested — the relationship between concealed emotion and deception is not one-to-one, and significant research has questioned the reliability of micro-expressions as lie-detection indicators in real-world forensic contexts. A concealed emotion could indicate deception, or discomfort, or grief, or dozens of other states.
What micro-expressions do reliably indicate: genuine emotional activation that is being managed. The face is harder to control than we think.
Barrett's Challenge: The Theory of Constructed Emotion
Lisa Feldman Barrett's research program, summarized in her 2017 book How Emotions Are Made, represents the most sophisticated challenge to Ekman's universality thesis.
Barrett argues that emotions are not discrete, natural categories with fixed expressions, but are constructed by the brain. The constructionist account runs roughly as follows: the brain receives a constant stream of afferent signals from the body and sensory input from the environment. It predicts what these signals mean based on prior experience and cultural learning — a process called predictive coding or active inference. What we call "emotions" are the brain's constructed predictions about the meaning of body states, built on categories learned through development and socialization.
On this view, there is no single "anger" expression, no single "fear" expression — what counts as the expression of an emotion depends on what the brain, drawing on culturally shaped categories, predicts is being expressed. Barrett has documented substantial variability in facial expressions even within single emotion categories, and argues that the recognition rates in cross-cultural studies are lower than Ekman claimed when methodological issues are controlled for — particularly the forced-choice paradigm, which artificially inflates agreement rates by limiting the options available to participants.
A 2019 review by Barrett and colleagues, alongside large cross-cultural datasets analyzed by researchers such as Cowen and Keltner, found substantial cross-cultural agreement in the mapping of facial expressions to emotion categories, but also substantial variation, and argued that the pattern is better characterized as a family of overlapping, context-sensitive signals than as a set of discrete, context-independent universals.
The debate is not resolved and is genuinely ongoing. The practical takeaway: the original Ekman thesis in its strongest form — discrete, context-independent, perfectly readable universal expressions — is probably too simple. The culturally relativist counter-thesis — all expression is culturally learned with no biological universals — is also too simple. The actual picture is: significant biological signal, shaped by cultural experience in its expression and interpretation, and requiring contextual information for accurate decoding.
For the purposes of this book, what matters most is the middle ground that both sides of this debate occupy: human faces are extraordinarily expressive, they carry substantial cross-cultural emotional information, and we are all, as a species, running on facial processing hardware that is oriented toward social reading of other humans.
What Faces Actually Do in Social Life
Beyond the academic debate, consider what faces do functionally in human social life.
Coordination. Faces synchronize between interaction partners. Research on facial mimicry shows that humans automatically and often unconsciously mirror the facial expressions of people they're interacting with — and this synchrony correlates with feelings of rapport and empathy. Zaki and colleagues (2009) showed that people who mimicked more were rated as more empathic. The face is not just broadcasting; it's part of a feedback loop.
Regulation. Faces help regulate emotional states in others. The face of a calm caregiver can literally soothe an infant's autonomic nervous system. The flip side is demonstrated by the still-face paradigm: when a normally responsive caregiver suddenly becomes expressionless, infants show significant distress. The still face communicates absence of connection, and infants respond to it with real physiological distress.
Status and dominance. Faces signal social status. Research by Todorov and colleagues (Princeton) has shown that people make rapid inferences about leadership potential, competence, and dominance from faces — inferences that predict election outcomes at rates better than chance. These inferences are often wrong, but they're fast, automatic, and socially consequential.
Trust. Facial expression influences trust decisions in economic games. People given trustworthy-seeming faces invest more, cooperate more, extend more credit. The face shapes the social ecology of a room.
Pain. Facial expression of pain has been documented to influence how much care others provide. Patients whose facial expressions of pain were judged as more intense received more analgesic medication. Your face can determine whether you are taken seriously as someone who is suffering.
The Cost of Face-Free Communication
The replacement of face-to-face interaction with text-mediated communication has happened at a speed that outpaced any ability to study it prospectively. We adopted smartphones, text messaging, and social media platforms without first asking what would be lost.
What we know now:
Misunderstanding rates are higher in text. Studies consistently find that the emotional tone of text messages and emails is misread more often than the sender expects. The sender, who knows their own intent, assumes it is legible in the text. It often isn't. The absence of vocal tone, facial expression, and real-time feedback removes most of the bandwidth through which intent is normally transmitted.
Social media use correlates with loneliness. Multiple large-scale longitudinal studies have found associations between heavy social media use and loneliness, particularly in adolescents. The causation is debated — lonely people may use social media more — but some experimental studies that randomly assigned participants to reduce social media use found reductions in loneliness and depression. The mediation is not fully understood, but the loss of face-based connection is a plausible mechanism.
Screen-mediated interaction changes the nature of conflict resolution. Research on online argumentation consistently finds that it escalates faster and de-escalates slower than equivalent in-person arguments. The absence of facial information removes many of the cues that signal genuine distress, vulnerability, or softening — cues that in person would naturally de-escalate conflict.
Children's emotional literacy development may be affected. There is emerging (though still preliminary) evidence that children who spend less time in face-to-face interaction and more time in screen-mediated interaction show differences in their ability to recognize facial expressions of emotion. A widely cited study by Uhls and colleagues (2014) showed that sixth-graders who spent five days at a camp without screens showed significant improvement in emotion recognition from faces and voices compared to controls. The mechanism: practice.
Video calls are not equivalent to in-person contact. Despite the explosion of video calling, research suggests it is experienced as more cognitively demanding and less socially satisfying than in-person contact. "Zoom fatigue" is documented in the literature — the effort of maintaining eye contact and reading degraded facial signals through a screen is taxing in a way that in-person contact is not.
None of this is an argument for technological abstinence. It is an argument for being honest about the trade-offs, and for being deliberate about protecting in-person face time, especially for children, especially in conflict, especially in anything that matters.
The Face as Evidence of Shared Humanity
Here is the philosophical point that grounds all the empirical material above.
When you look at a human face making an expression you recognize — grief, joy, fear, the flash of shame — you are receiving a signal that is older than language. Older than civilization. Older than agriculture or writing or the first city. You are receiving a biological broadcast from another member of your species, using hardware that evolution built specifically for that purpose.
The face is one of the most powerful arguments against the idea that other people are fundamentally different from you. Not because it proves they're having the same experience — they aren't, and experience is always particular, always filtered through context and history and body and culture. But because it shows you the emotional architecture underneath the experience. And that architecture is the same.
Ekman, late in his career, wrote about what he called the "evil imagination" — the psychological work required to sustain systematic cruelty toward others. One of the key mechanisms: you have to stop reading their faces. Torturers are trained, explicitly or implicitly, to stop responding to the facial signals of suffering. Prison guards who become abusive report developing a kind of perceptual shutoff — they stop seeing the face as a face and start seeing it as a task or an obstacle. Genocide perpetrators describe the same perceptual shift: the target group stops having faces that broadcast legible humanity.
The face is, quite literally, one of the primary biological mechanisms through which we know that someone is a person like us.
To be disconnected from that mechanism — through screen mediation, through dehumanizing ideology, through psychological armor built up around old wounds — is to be disconnected from the most basic available evidence of human unity.
Practice: Re-Engaging the Face
The 5-Second Face Practice. Once a day, with someone you'd normally be task-focused with — a colleague, a cashier, a family member at dinner — pause and actually look at their face for five seconds before speaking. Not staring, not analysis — just looking. Notice what's there. This sounds trivial. It isn't. Most of the time we're not actually looking at the faces in front of us.
The Slow Re-Watch. Take a video of a conversation you were part of — or use footage of any real human interaction. Watch it once at normal speed, then watch it again at half speed with the sound off. Notice what you see. The gap between what the face says and what the words say. The flash of something real before the composed response.
The Screen Replacement. Pick one conversation per week that you'd normally have by text or email — one that has any emotional content at all — and do it in person or by video call instead. Notice the difference in what you learn about the other person, and what they learn about you.
The Expression Inventory. Sit with a trusted person and each describe a recent emotional experience while the other watches your face, not to analyze it, but to reflect back what they see. Then switch. This is surprisingly vulnerable. It is also the clearest demonstration of what faces do that words don't.
Why This Is Law 1 Territory
We argue about almost everything. Race, religion, politics, economics, history, who owns what, who wronged whom. These arguments are often intractable in the abstract — in text, in theory, at a distance.
They become harder to sustain in close proximity to a face.
Not impossible. People commit atrocities looking into eyes. The capacity to override the face-reading signal is real and well-documented. But it requires active work. Active dehumanization. You have to do something to get there.
The default, with a face in front of you, is to see a person.
That default is biological. It is built into the hardware. And it is the most basic available evidence that we are not fundamentally separate from each other — that human unity is not an ideology or a belief system but a description of what we actually are.
Every face you look at and actually read is a small act of recognition. And recognition, at scale, is what changes the world.