

People Should Know About the 'Beliefs' LLMs Form About Them While Conversing

Jonathan L. Zittrain is a law/public policy/CS professor at Harvard (and also director of its Berkman Klein Center for Internet & Society).

He's also long-time Slashdot reader #628,028 — and writes in to share his new article in the Atlantic.
Following on Anthropic's bridge-obsessed Golden Gate Claude, colleagues at Harvard's Insight+Interaction Lab have produced a dashboard that shows what judgments Llama appears to be forming about a user's age, wealth, education level, and gender during a conversation. I wrote up how weird it is to see the dials turn while talking to it, and what some of the policy issues might be.

Llama has openly accessible parameters; so, using an "observability tool" from the nonprofit research lab Transluce, the researchers finally revealed "what we might anthropomorphize as the model's beliefs about its interlocutor," Zittrain's article notes:

If I prompt the model for a gift suggestion for a baby shower, it assumes that I am young and female and middle-class; it suggests diapers and wipes, or a gift certificate. If I add that the gathering is on the Upper East Side of Manhattan, the dashboard shows the LLM amending its gauge of my economic status to upper-class — the model accordingly suggests that I purchase "luxury baby products from high-end brands like aden + anais, Gucci Baby, or Cartier," or "a customized piece of art or a family heirloom that can be passed down." If I then clarify that it's my boss's baby and that I'll need extra time to take the subway to Manhattan from the Queens factory where I work, the gauge careens to working-class and male, and the model pivots to suggesting that I gift "a practical item like a baby blanket" or "a personalized thank-you note or card...."

Large language models not only contain relationships among words and concepts; they contain many stereotypes, both helpful and harmful, from the materials on which they've been trained, and they actively make use of them.
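
Neither the article nor this summary spells out how the dashboard computes its readouts, but the general idea behind probing an open-weights model is simple: collect hidden activations from conversation turns and train a small classifier to read an attribute off them. The sketch below is a hypothetical illustration only, not the Harvard lab's dashboard or Transluce's tool; the model name, example prompts, and labels are all assumptions made for the example.

# Hypothetical sketch of a linear probe over hidden states; not the actual dashboard code.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder; any open-weights chat model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def last_token_state(text, layer=-1):
    # Hidden state of the final token at the chosen layer, as a 1-D tensor.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0, -1]

# Toy labeled turns (assumptions for illustration): 1 = "upper-class" framing, 0 = not.
train_texts = [
    "I need a baby-shower gift for a gathering on the Upper East Side.",
    "Something from a high-end boutique would be perfect.",
    "I'll take the subway from the factory in Queens where I work.",
    "Something practical and affordable, please.",
]
train_labels = [1, 1, 0, 0]

X = torch.stack([last_token_state(t) for t in train_texts]).float().numpy()
probe = LogisticRegression(max_iter=1000).fit(X, train_labels)

# The "dial": probability the probe assigns to the attribute on a new conversation turn.
turn = "Could you suggest luxury baby products from high-end brands?"
x = last_token_state(turn).float().numpy().reshape(1, -1)
print(f"probe reading for 'upper-class' framing: {probe.predict_proba(x)[0, 1]:.2f}")

Run against real conversation turns, a probability like this is the kind of "dial" the article describes watching move as the conversational context changes.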

"An ability for users or their proxies to see how models behave differently depending on how the models stereotype them could place a helpful real-time spotlight on disparities that would otherwise go unnoticed," Zittrain's article argues.

Indeed, the field has been making progress — enough to raise a host of policy questions that were previously not on the table. If there's no way to know how these models work, it makes accepting the full spectrum of their behaviors (at least after humans' efforts at "fine-tuning" them) a sort of all-or-nothing proposition.

But in the end it's not just the traditional information that advertisers try to collect. "With LLMs, the information is being gathered even more directly — from the user's unguarded conversations rather than mere search queries — and still without any policy or practice oversight...."

Read more of this story at Slashdot.