Stepping back from grand theories, this post revisits the mystery lexical factors. Given only the word loadings, can you describe the general principle that holds a factor together? This exercise gets at the ultimately qualitative nature of personality models. Whatever statistical methods produce the word loadings, a model eventually has to be communicated in words, not numbers. Those descriptions are then used to generate inventories (e.g. the BFI) that approximate each construct.
The data for the factors in question comes from two sources. One factor is from word vectors, and the other from the traditional survey approach. Both processes are described here. If you want to take a stab at naming the factors on an unlabeled plot, check out this post before reading below.
Mystery Factor 1
Top words: exacting, strict, decisive, stern, firm vs. meek, lax, wishy-washy, gullible, naive
Commenters described this as: conscientiousness, assertiveness, industriousness, and Attended Action
Good work, team! This basically covers the factor. It was derived as the unrotated third component of data from word vectors. In the past I have called this Order to distinguish it from the Big Five Conscientiousness. They are similar, but this has a bit of an edge (e.g. exacting, vengeful). Order is about accomplishing your own goals, come hell or high water. The Big Five takes variance from PC1 (the PFP) and mixes it with Order to produce Conscientiousness. Hence unforgiving is strongly associated with Order, but neutral on Conscientiousness.
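For readers who want to see the mechanics, here is a minimal sketch of how unrotated components and their top words can be pulled out of a words-by-dimensions matrix, and how blending two components changes a word's loading. It is not the pipeline used for this post: the function names (unrotated_loadings, poles, mix) are made up for illustration, the vectors are random placeholders rather than real embeddings, and the loadings for the "toy" word are hypothetical numbers chosen to show the effect.

```python
import numpy as np

def unrotated_loadings(vectors, n_components):
    """Word scores on the first n_components unrotated principal components."""
    X = np.asarray(vectors, dtype=float)
    X = X - X.mean(axis=0)                      # center each embedding dimension
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def poles(words, loadings, component, n_top=5):
    """Words at the high and low ends of one component (0-indexed)."""
    order = np.argsort(loadings[:, component])
    low = [words[i] for i in order[:n_top]]
    high = [words[i] for i in order[::-1][:n_top]]
    return high, low

def mix(loadings, i, j, theta):
    """Loading on a new axis that blends components i and j by angle theta (radians)."""
    return np.cos(theta) * loadings[:, i] + np.sin(theta) * loadings[:, j]

words = ["exacting", "strict", "decisive", "stern", "firm",
         "meek", "lax", "wishy-washy", "gullible", "naive"]

# Placeholder data: random numbers standing in for real word vectors.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(len(words), 8))

loadings = unrotated_loadings(vectors, n_components=3)
high, low = poles(words, loadings, component=2)  # third component
print(high, "vs.", low)

# Hypothetical word: negative on PC1, equally positive on the third component.
toy = np.array([[-0.5, 0.0, 0.5]])
blended = mix(toy, 0, 2, np.pi / 4)              # equal-weight blend of the two
print(blended)                                   # roughly zero: the two cancel
```

Two caveats: the sign of a principal component is arbitrary, so which pole counts as "high" is a labeling choice, and with random placeholder vectors the printed poles are meaningless. The last two lines are the point of the sketch: a word can load strongly on one component and still come out neutral on an axis that mixes that component with another.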
In my opinion, a construct that combines the ability to accomplish one’s own goals with the desire to accomplish society’s goals confuses the situation. I think IO research, for example, would be better off using Order than either Conscientiousness or grit. The correlates of interest (like promotions) are surely more directly related to Order. I have never met someone at the top who was purely a team player. There’s always a personal edge.
Mystery Factor 2
Top words: unimaginative, unsophisticated, moral, empathic, principled vs. crafty, cunning, sly, creative, clever
Commenters described this as: negative openness, industriousness, and Trust of Word.
This one is trickier because it is actually the 6th unrotated component of the data from a classic paper that was used to define the Big Five. THIS IS WHAT THEY ARE HIDING FROM YOU! This is what didn’t make the cut. It’s interesting that both the fifth and sixth factors in that data have to do with openness to experience. Sure, statistically there may be reasons to cut the model off at five factors. But qualitatively, the openness signal is split across the fifth and sixth factors, and only half of it was included. This is more of an oddity than anything else. I have no idea if it holds in other datasets.
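One common statistical input to that cut-off decision is how much variance each successive component explains. The sketch below computes that share from a ratings matrix. It is only an illustration: explained_variance_ratio is a hypothetical helper, the ratings are random placeholders rather than the data from the paper, and real analyses usually pair this with something like a scree plot or parallel analysis.

```python
import numpy as np

def explained_variance_ratio(data):
    """Share of total variance captured by each unrotated principal component."""
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                    # center each variable
    S = np.linalg.svd(X, compute_uv=False)    # singular values, largest first
    var = S ** 2
    return var / var.sum()

# Placeholder data: 200 made-up respondents rating 12 adjectives.
rng = np.random.default_rng(0)
ratings = rng.normal(size=(200, 12))

ratio = explained_variance_ratio(ratings)
for k, r in enumerate(ratio[:6], start=1):
    print(f"component {k}: {r:.3f}")
```

With real survey data the shares typically drop off quickly and then flatten, and the cut is placed somewhere near the bend. That rule says nothing about whether the next component down carries a psychologically meaningful signal, which is the qualitative point made above.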
Hope you enjoyed this little exercise. In grad school, when I was developing the method to extract word loadings from word vectors, I had no idea if the results were signal or noise. At the beginning, it was more often noise, and I spent many hours reading the tea leaves in word loadings, hoping to see a unifying psychological construct. Time was running out and only the first 2-3 NLP factors could be located in the Big Five, and even then you had to squint. The emotional release of discovering the Big Two, which I could recover perfectly, is part of my affinity for them. That is a much more compelling story to put into a dissertation than kind-of-sort-of recovering half of the Big Five.
I wonder if you've thought about trying to capture energy. Underrated individual difference, captured poorly by Big 5.
Some posts/threads on this:
1. https://stephenmalina.com/post/2021-07-01-energetic-aliens-among-us/
2. https://twitter.com/Willyintheworld/status/1385794190886461441