Exposing biases, moods, personalities and abstract concepts hidden in large language models

By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they’re far from simple answer-generators; they can also express abstract concepts, such as certain tones, personalities, biases, and moods. However, it’s not obvious exactly how these models represent abstract concepts to begin with from the knowledge they contain.

This article is brought to you by this site.

Skip The Dishes Referral Code