An emotionally stable helpmeet (hold the meat)
I think it is best to ask it to provide the reasoning before the providing the score, since that allows the reasoning to affect the score. Also I think the assumptions behind persinality tests are better satisfied if each question is given in a separate session, since that makes the questions independent from each other.
I think we need to think more carefully about personality for machines. There's no reason to assume they have a single, stable personality, like we assume for people most of the time.
A language model is trained to be able to imitate any author on the Internet (more or less). One way to think about it is that it has all the personalities, and you can pick one by giving it the right prompt.
Reinforcement training sets a default, but the capability to imitate other personalities doesn't go away. You can still ask it to imitate someone else by asking in the right way, or with a suitable plot twist.
Also, if you ask it all the questions at once, it's going to pick a personality for the first question and change it a bit with each question, which might cause "drift" from the default personality.
Asking each question in a separate question will better get at the default personality if that's what you're interested in. It's what most people will see. But I think being aware that it's a default helps avoid confused conversations.
It might be interesting to see how much the personality can be changed with the right prompt. How well can this be fine-tuned? Could you get any answers you want on the survey with the right prompt?