The Big Five are word vectors

Mar 17, 2022

The deep connection between language models and personality models

18 Comments

Having read your interesting article here, you might find my work on the Adaptive Bifurcated Big-Five useful. It’s a different application of vector analysis to the topic than this, but I think it’s very promising. I think I’ve done a pretty good job modeling the cross-correlations in the Big Five Aspect Scale by applying the theory (a minor rotation aligning it closer to HEXACO, followed by a bifurcation into competing adaptive cognitive systems within a single evolutionary domain) to map each assessment question within a 5d vector-space, aggregate them into aspects and factors, and then use the dot-product to compute their similarity. As an engineer though rather than a scientist, I’ve taken the ideas about as far as I can on my own. I hope you’ll take a look. It has a lot of practical applications, as well as adding significant parsimony to the science.

https://abbf.quora.com/

Thanks,

Eric


Great, you've put a ton into this! What do you mean by Bifurcated? Bifactor as described in this?

https://www.frontiersin.org/articles/10.3389/fpsyg.2020.01357/full


I don’t entirely understand what I was reading there, but I don’t think it’s related. The bifurcation I’m referring to is the idea that you can define each of the five personality axes by a pair of competing but fundamentally different adaptive cognitive systems. For example, one axis I use is “Improvisation versus Strategizing” (by which I mean something technical with respect to imagination, closely related to Openness). Zero on the axis is the point of equal bias between the two systems, and it’s bounded by 100% bias in either direction.

Since each of the five axes represents an orthogonal evolutionary domain, you can construct a straightforward Cartesian coordinate system with a precisely defined origin - the point of equal bias between all 10 poles. Then you can map basically any personality construct you want within that coordinate system with a 5d vector, and make direct geometric comparisons between them.

It’s an important distinction from the traditional Big Five, which to my knowledge has no objective way to specify 100% Introversion, 100% Extraversion, or the precise median between them, for example. Those are simply artifacts of the particular assessment, or referenced to the average in a population. They don’t represent a fixed behavioral expression, so you can’t construct an objectively defined Cartesian coordinate system from them.
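The geometric comparison described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the item vectors are made-up numbers, each component is assumed to run from -1 (100% bias toward one pole) to +1 (100% bias toward the other), and the dot product is normalized to a cosine so that vector length does not affect the similarity. None of this is taken from the actual ABBF materials.

```python
import numpy as np

def cosine_similarity(a, b):
    """Normalized dot product of two 5d personality vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical assessment items placed in a 5d space whose origin is the
# point of equal bias on every axis. All numbers are invented.
items = np.array([
    [0.8, 0.1, 0.0, -0.2, 0.0],   # item leaning toward one pole of axis 1
    [0.6, 0.3, 0.1, 0.0, 0.0],    # a second item leaning the same way
    [-0.7, 0.0, 0.2, 0.1, 0.0],   # item leaning toward the opposite pole
])

# Aggregate items into an "aspect" by averaging their vectors (an assumption;
# the comment does not say how aggregation is done).
aspect = items[:2].mean(axis=0)

sim_same = cosine_similarity(items[0], items[1])     # positive: same direction
sim_opposed = cosine_similarity(items[0], items[2])  # negative: opposed
sim_aspect = cosine_similarity(aspect, items[2])
```

Because the origin has a fixed meaning here, the sign of the similarity is interpretable: a negative value means two constructs pull toward opposite poles.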


Hi, I know I'm late, but I'm very interested in this and had a question: Do you know of any studies that take a dynamic approach to this topic -- i.e., attempt to trace the change of personality components over time? I was inspired by this piece (https://arxiv.org/pdf/1711.08412.pdf) that looked at stereotypes of women and minorities using the Google Books corpus. I'm interested in the extent to which gender differences in the big 5 hold up historically, which of course raises the question about whether the entire construct holds up historically. I have strong priors that it does NOT, but I can be convinced. Thanks for your time.


Not late at all, the paper is just picking up steam! And there is absolutely an excellent line of research on the lexical hypothesis and gender differences in the Big Five. Just waiting for someone to do it!

Here is a paper that shows people respond to personality tests by comparing themselves to their own gender, which mitigates the measured gender differences in personality. I will say that I am punctual/sensitive/creative/tough *for a man*; the "ground truth" is relative. https://www.sciencedirect.com/science/article/abs/pii/S0092656613001165

>I'm interested in the extent to which gender differences in the big 5 hold up historically, which of course raises the question about whether the entire construct holds up historically.

The most powerful way to find structure in natural language is to extract word vectors from enormous pre-trained models. Check out my Deep Lexical Hypothesis paper and you'll see that the quality of the language model matters a lot: DeBERTa (Microsoft's model, released in 2020) finds the structure much better than BERT (Google's 2018 model). GPT-3 does even better.

The reason this is relevant to your question is that it seems you want to extract personality structure (and its relation to gender) from natural language at different time periods. But even with internet-scale text, larger models still do a better job finding personality structure. If we built an LLM on text from 1950 it would find different structure, but maybe in part because there is simply not enough text.

There might be ways around this, but that is a big if. My preferred way to ask if the Big Five are robust to changing culture is to do a comparative study: compare personality structure extracted from Chinese vs English LLMs. I do a bit of multi-lingual work here: https://www.vectorsofmind.com/p/personality-around-the-world

One project I have wanted to do is correlate the word vectors for "masculine" and "feminine" with the Big Five. Those two adjectives are even used to define the Big Five! I have actually done the experiments, just not written them up. "Feminine" loads much more on PC1, which is partly what gave me the idea for the Eve Theory of Consciousness. PC1 is the axis of self-domestication.

"Masculine" loads much more on PC3, which is about being future goal oriented. You can see that by looking at my code which I link in this piece.


Thank you for your response! So regarding the "is there enough data for a stable model over time" problem, what about google books? That's what this uses: https://www.pnas.org/doi/full/10.1073/pnas.2121798119

I haven't read the paper closely enough, but I don't think they rely on the Big 5 -- they're measuring the cosine similarity between 'man' and 'woman' and a buttload of adjectives.

>One project I have wanted to do is correlate the word vectors for "masculine" and "feminine" with the Big Five. Those two adjectives are even used to define the Big Five!

Yeah, that was kind of what I was thinking, except with a broader "gender" dimension that includes pairs like he/she, him/her, men/women etc. Kind of like what they do here: https://doi.org/10.1177/0003122419877135

Admittedly I don't know very much about the Big 5 because it's not my area. I am, however, quite interested in (and skeptical of) dimensionality reduction methods in general. For instance, I didn't know that "masculine" and "feminine" actually *defined* the big 5. If so, how are any claims to gender differences on these scales at all meaningful? It seems so tautological to me.

I'll definitely check out the multi-lingual work. I'm doing some projects now using word vectors and cross-linguistic comparisons, but the area is moving so darn fast it's hard to keep up!


>I haven't read the paper closely enough, but I don't think they rely on the Big 5 -- they're measuring the cosine similarity between 'man' and 'woman' and a buttload of adjectives.

The formula to derive the Big Five is to find the pairwise cosine similarity between every pair of adjectives to make an NxN matrix and then do PCA. This is basically looking at all of the comparisons in aggregate, and it seems that to stably produce this for modern language you need a TON of text. 1000x as much as you get in Google Books. So you could do it, but it's not a home run. A critic can say "well, you just didn't have enough to train a model as good as BERT or GPT."
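For readers who want to see the recipe above in code, here is a minimal sketch. It is one reading of "pairwise cosine similarity, then PCA" and not the code linked in the piece: the random vectors are a stand-in for real adjective embeddings, and centering the columns of the similarity matrix before the decomposition is an assumption.

```python
import numpy as np

def personality_components(vectors, n_components=5):
    """Build the N x N cosine-similarity matrix of the rows, then run PCA on it."""
    X = np.asarray(vectors, dtype=float)
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = unit @ unit.T                       # N x N cosine similarities

    # PCA via SVD: rows are adjectives, columns are "similarity to adjective j".
    centered = sim - sim.mean(axis=0, keepdims=True)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)

    scores = U[:, :n_components] * S[:n_components]  # each adjective's position on each PC
    explained = (S ** 2) / np.sum(S ** 2)            # share of variance per PC
    return sim, scores, explained[:n_components]

# Placeholder data: 50 "adjectives" with 16-dimensional random vectors.
# Real input would be embeddings taken from a language model.
rng = np.random.default_rng(0)
toy_vectors = rng.normal(size=(50, 16))

sim, scores, explained = personality_components(toy_vectors)
```

With real embeddings, the column `scores[:, 0]` would be each adjective's position on the unrotated first PC, and sorting adjectives by that column is the usual way to read off what a component means.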

I also don't think the results from 1950 would differ from modern ones any more than Big Five survey results differ across samples from the 1990s. It's already understood there is some variation from sample to sample; I don't think this method would demonstrate more variation than that (which is what it would take to undercut the foundation of the Big Five).

>If so, how are any claims to gender differences on these scales at all meaningful? It seems so tautological to me.

Yeah, there really is no ground truth. The Lexical Hypothesis is the closest we get to solving that problem. I actually take the word vectors for "feminine" and "masculine" to be something of a ground truth. They are a measure of societal expectations, but they are also a measure of societal observations. People notice men and women are different, and that shows up in language. I don't think it's just "bias." At any rate, even if someone disagrees on how to interpret the signal as expectations vs observations, it is interesting to see how much the vector for "feminine" changes over time. Obviously more change implies it is more about expectations.

What's your background?


>The formula to derive the Big Five is to find the pairwise cosine similarity between each adjective to make a NxN matrix and then do PCA.

It makes sense we might not have enough text to *derive* the big 5. But why not just take our *current* lexical definitions of the Big 5 and compute their correlation with (or projection on) a gender dimension for each decade? It seems to me we would have enough text to do that, right? Or am I missing something?
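One way to sketch the per-decade projection proposed here is shown below. Everything in it is illustrative: the random embeddings stand in for word vectors trained separately on each decade's text, `TRAIT_WORDS` is a hypothetical marker list rather than an established Big Five definition, and building the gender axis by averaging feminine-minus-masculine differences over a basket of word pairs is just one common choice.

```python
import numpy as np

def unit(v):
    """Scale a vector to length 1."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def gender_axis(emb, pairs):
    """Average the (feminine - masculine) difference over several word pairs."""
    diffs = [unit(emb[f]) - unit(emb[m]) for f, m in pairs]
    return unit(np.mean(diffs, axis=0))

def trait_projection(emb, trait_words, axis):
    """Cosine between the centroid of a trait's marker words and the axis."""
    centroid = unit(np.mean([unit(emb[w]) for w in trait_words], axis=0))
    return float(centroid @ axis)

PAIRS = [("she", "he"), ("her", "him"), ("women", "men")]
TRAIT_WORDS = ["kind", "warm"]  # hypothetical markers for one trait

# Placeholder embeddings: one random vocabulary per decade.
rng = np.random.default_rng(0)
vocab = [w for pair in PAIRS for w in pair] + TRAIT_WORDS
embeddings_by_decade = {
    decade: {w: rng.normal(size=32) for w in vocab}
    for decade in (1950, 1980, 2010)
}

# One score per decade: how strongly the trait leans toward the feminine pole.
scores = {
    decade: trait_projection(emb, TRAIT_WORDS, gender_axis(emb, PAIRS))
    for decade, emb in embeddings_by_decade.items()
}
```

Each score is computed inside its own decade's vector space, so this particular comparison does not require aligning the spaces across decades. Whether the scores are stable enough to interpret still depends on how much text each decade has.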

> At any rate, even if someone disagrees on how to interpret the signal as expectations vs observations, it is interesting to see how much the vector for "feminine" changes over time. Obviously more change implies it is more about expectations.

Interesting. Do you have any further reading on this?

> What's your background?

I'm a social science professor. Unfortunately I can't say much more than that because I'd like to stay anonymous :)

Thanks again!


Aug 18, 2023 (edited)

>It seems to me we would have enough text to do that, right? Or am I missing something?

That is actually a great idea that I hadn't considered. It would be more robust than comparing masculine/feminine to single adjectives, which would tend to move more.

I'm peeved that this paper didn't cite me (even though one of the authors reviewed my work), but here is another group that is interested in your idea: https://journals.sagepub.com/doi/abs/10.1177/09637214221149737

>Interesting. Do you have any further reading on this?

I put a lot of stock into the unrotated first PC, often called the GFP. There is a big debate in personality psychology on whether it represents response bias or real personality signal. Deriving PC1 from word vectors answers that question: it's personality signal, and the most important trait. Two interesting things about this trait: it's correlated r = 0.85 with EQ, and "feminine" loads about 1SD (compared to all words) more than "masculine." IMO lots of reasons to believe that women evolved more EQ, as they occupied a social niche (at least during times of dependency like pregnancy or with a small child).
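To make the "about 1SD (compared to all words)" statement concrete, the gap between two words' loadings can be expressed in units of the standard deviation of all loadings on that component. The sketch below uses made-up loadings and a hypothetical helper name; it shows only the arithmetic, not the actual result.

```python
import numpy as np

def loading_gap_in_sd(loadings, words, word_a, word_b):
    """Difference between two words' loadings, in SDs of all words' loadings."""
    loadings = np.asarray(loadings, dtype=float)
    index = {w: i for i, w in enumerate(words)}
    gap = loadings[index[word_a]] - loadings[index[word_b]]
    return float(gap / loadings.std())

# Invented PC1 loadings for a tiny vocabulary (real use: thousands of adjectives).
words = ["feminine", "masculine", "kind", "cruel", "calm"]
pc1 = [1.0, 0.0, 1.5, -1.5, 0.5]

gap = loading_gap_in_sd(pc1, words, "feminine", "masculine")
```

A positive `gap` means the first word sits higher on the component than the second, measured against the spread of the whole vocabulary.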

I plot a few words that define PCs 1 and 2 in this post, including masculine and feminine: https://www.vectorsofmind.com/i/130101130/visualizing-forms-from-language

Getting back to interpreting word vectors as social expectation vs observation, the (long) point I make in the article is that this axis is a measure of self-domestication; social expectations are what caused us to evolve into what we are today. Including sex differences. Less philosophically, one can look to twin studies to see how heritable GFP and EQ are. The more heritable, the less inclined I am to believe that men are simply not socialized for EQ. I mean, there are huge sex differences in facial recognition ability. (See the plot in this section: https://www.vectorsofmind.com/i/114650037/women-lead-the-way)

>Yeah, that was kind of what I was thinking, except with a broader "gender" dimension that includes pairs like he/she, him/her, men/women etc.

I had a similar approach defining Kin Altruism and Reciprocal Altruism: https://www.vectorsofmind.com/i/51210419/results

I'm not super happy with the method of using a basket of word vectors; it's definitely an area where progress can be made.

>I'm a social science professor.

Your idea is definitely worth publishing on. Best of luck!


Thanks again!


Aug 21, 2022 (edited)

I should probably write a blog post somewhere on this, but it really doesn't surprise me that these are vectors. I think what we call "personality" is just emotional responses to common situations.

In Plutchik's Wheel of Emotions model there are 8 emotions that can be split into 4 pillars.

Prediction👁️: Anticipation and Surprise

Threat🗡️: Anger (Dominion) and Fear (Submission)

Morality⚖️: Respect and Disgust

Affect😂: Joy and Sadness

From a biological perspective there is only really "the self" and the environment ("the world"). Personality is the emotional interaction between those two.

Openness/Intellect: Surprise between the Self and the World

Openness: "do you find surprise in the world unthreatening".

Intellect: "does surprise in the world produce joy".

Extroversion: Joy and Anger between the Self and the World

Enthusiasm is "do people bring you joy"

Assertiveness is "when people threaten you do you dominate or submit"

Conscientiousness (tribe relations): Morality between the Self and the World

Industriousness/Orderliness is "will coworkers find you respectable (Morality)"

Agreeableness (Affect and the World): Sadness and Fear in the World and the Self's response

Compassion/Politeness is "when people are sad/afraid, does that make you sad/afraid"

Neuroticism: Are you more governed by the Carrots or the Sticks of emotions? High Neuroticism means you react strongly to the sticks.


What are the three factors extracted in the unrotated model? Are they like the ones Eysenck first noticed or something else?


There's no Psychoticism in the three NLP factors. But Eysenck was working with traumatized soldiers, so it makes sense that factor would be more salient. The three I find are close to Affiliation, Dynamism and Order described here: https://journals.sagepub.com/doi/10.1002/per.1953


Comment removed


Lots of big ideas here and I heartily agree that academics are too siloed and personality structure may be different outside of WEIRD culture.

It may be of interest to you that someone from Henrich's lab reviewed my paper on personality structure and commented on generalizability, given that I only derived the structure from English words (see my article here, The Big Five are word vectors). Before I left academia, the plan was to use non-English word vectors to compare structure in an internet-sized sample of language. But...that's quite involved and requires speaking other languages, and I never got around to it. Still a good idea, if any grad students are reading this.

I also like that you combine psychometrics and life history with some Jaynesian ideas. It was by conceptualizing the GFP as an evolutionary force that I came to the Bicameral Mind.


Comment removed


Well, I just have a PhD, never any academic job after that. My training both undergrad and grad was in Electrical Engineering. Dissertation was on natural language processing and personality structure, hence the jump to these topics. I now work doing ML at a VR company where we measure reaction time and other neural processes.

I'm an autodidact in personality psychology and never took a course. However, I did make a theoretical contribution to the field with one paper.

You are obviously good at taking in information, but you haven't really addressed what I think is the main claim of the Eve Theory of Consciousness: that we became conscious and that caused the Neolithic revolution. This is a much more powerful idea than Jaynes explaining the Bronze Age Collapse, in my opinion.


Comment removed


I'm not aware of any mega-fauna die-off that is not preceded by a symbolic revolution. My argument is the human condition does not exist until we have evidence of abstract thought. And then we use abstract thought to kill the megafauna + invent agriculture. The timeline doesn't work for killing megafauna to be causal on the symbols.


Comment removed


Comment removed
