Does it even exist?
I would be interested to know your feelings about the importance of non-verbal communication. I understand the relevance and significance of the lexical hypothesis, and I largely agree with your postulations. My question is: What is the significance of non-verbal means of communication, such as facial expressions and body movements? Are these behaviors significant as far as determining a person's personality? I cannot help but feel that these behaviors are not captured by latent semantic analysis. Is it the case that any non-verbal behaviors can be accurately detailed through lexical analysis?
1. I strongly prefer fitting general factors through hierarchical factor analysis rather than by taking the first principal component. My issue with taking the first principal component is that it seems extremely sensitive to the universe of item content; if, e.g., a dimension happens to get 2x more items than the other dimensions, then PCA will tend to turn that dimension into PC1, whereas hierarchical factor analysis can still easily distinguish it from the general factor (if such a general factor exists).
Of course a big issue with this point is that it is questionable whether there even is an "objectively correct" way of selecting items.
I'm holding out hope for genomics, as I think it can completely cut through these issues, because (in PCA terminology) genetic variants are discrete and so give you a privileged basis.
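The PC1 sensitivity described in point 1 can be sketched with a small simulation. This is only an illustration under made-up assumptions, not anyone's actual analysis: three independent latent dimensions with no general factor, a uniform item loading of 0.7, and the first dimension given twice as many items as the others. The function name `pc1_share_by_dimension` and all parameter values are hypothetical. It covers only the PCA half of the argument and does not fit a hierarchical factor model.

```python
import numpy as np

def pc1_share_by_dimension(items_per_dim, loading=0.7, n_people=5000, seed=0):
    """Share of PC1's squared loadings that falls on each latent dimension.

    Assumption: the latent dimensions are independent, so no true general
    factor exists in the simulated data.
    """
    rng = np.random.default_rng(seed)
    n_dims = len(items_per_dim)
    latent = rng.standard_normal((n_people, n_dims))
    # dim_of_item[j] is the index of the latent dimension that item j measures.
    dim_of_item = np.repeat(np.arange(n_dims), items_per_dim)
    noise = rng.standard_normal((n_people, dim_of_item.size))
    # Each item is its latent dimension times the loading, plus unique noise,
    # scaled so that every item has unit variance.
    items = loading * latent[:, dim_of_item] + np.sqrt(1 - loading**2) * noise
    corr = np.corrcoef(items, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last column is PC1.
    _, eigvecs = np.linalg.eigh(corr)
    pc1 = eigvecs[:, -1]
    # PC1 is a unit vector, so these shares sum to 1.
    return np.bincount(dim_of_item, weights=pc1**2, minlength=n_dims)

# Dimension 0 gets 8 items; dimensions 1 and 2 get 4 each.
shares = pc1_share_by_dimension([8, 4, 4])
print(shares.round(2))
```

Under these assumptions, nearly all of PC1's loading mass lands on the over-represented dimension even though the data contain no general factor, which is the failure mode the comment worries about.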
2. What do you think about the Halo model by Anusic and Schimmack? https://psycnet.apa.org/record/2009-22579-009
Key point: "The most important finding was that the halo factors of different raters were unrelated to each other (r = .08, SE = .07) and that the 95% CI suggests that the true parameter is likely to be small, ranging from -.06 to .22 (see Tables 1 and 2)."
I.e. they find that different people don't agree on the Halo factor, which seems like what you would expect if it's an evaluative artifact.
(One complication is that your general factor is a mixture of the traditional alpha and their general factor, and they do find agreement on alpha.)
A thought I had. The general factor of personality has high positive loadings for terms like considerate, helpful, etc. You see this as a sort of golden rule.
In WEIRD societies, it may be virtuous to be helpful, kind, considerate, etc., generally of all humans, but in certain societies, people may value reciprocity to one's own tribe/ethnic group rather than the trait just generally. I wonder if this would be reflected in the language.
Ashton et al. (2015) also acquired data on Alpha. Ordered from highest to lowest loadings, the terms loading on Alpha are:
[Snip: Terms with absolute values below .300]
Thanks. I have always been well-disposed to a general factor of personality, if only because personality questionnaires are coy about some people being a real pain to work with, and adopt an "all types are necessary and welcome" stance when the reality is that the uncooperative are a social drag.
Are there any Big Three personality tests (Affiliation, Dynamism, Order) currently available to take or is it too early days for that yet? Would be cool.
You provide 30 words for each pole of "social regulation". Based on which metric are these the closest, and which word embedding are they from?
The reason I'm asking is that word2vec style embeddings use a noisy metric (an unmotivated weighted distance within a document) and even noisier reference data (e.g. a collection of MSNBC articles from 2012) from which the embedding is computed. As a result such embeddings in practice give (I claim) quite distorted approximations to the language models people have in their heads today.
Perhaps as a result of this, I also feel that your lists of close words contain two quite different strands, so they are not well matched to either a positive or negative phrasing of the Golden Rule, and also seem at odds with "Social Regulation". One strand is something like "nice"/"awful", the other is something like "pushover"/"activist". These may well form a useful aggregate in your analysis for statistical purposes, but when assessing someone's personality they are separate for me: many people want the charity they give to to be headed by nice people who are pushy, even if they prefer a friend to be gentle and obliging. This mismatch might be because you used a poor embedding, and a better embedding would yield different lists of words.
Very interesting stuff. Thank you for sharing.