Reckoning with generative AI's uncanny valley
Generative AI has the power to surprise in a way that few other technologies can. Sometimes that's a very good thing; other times, not so good. In theory, as generative AI improves, this issue should become less important. In reality, however, as generative AI becomes more human-like it can begin to turn sinister and unsettling, plunging us into what robotics has long described as the uncanny valley.

It might be tempting to dismiss this experience as something that can be corrected by bigger data sets or better training. But insofar as it speaks to a disturbance in our mental model of the technology ("I don't like what it did there"), it's something that needs to be acknowledged and addressed.

Mental models and antipatterns

Mental models are an important concept in UX and product design, but they need to be more readily embraced by the AI community. At one level, mental models often go unnoticed because they are routine patterns of assumption about an AI system. This is something we discussed at length while putting together the latest volume of the Thoughtworks Technology Radar, a biannual report based on our experiences working with clients all over the world.

For instance, we called out complacency with AI-generated code and replacing pair programming with generative AI as two practices we believe practitioners must avoid as the popularity of AI coding assistants continues to grow. Both emerge from poor mental models that fail to acknowledge how this technology actually works and where its limitations lie. The consequence is that the more convincing and human-like these tools become, the harder it is for us to recognize how the technology actually works and the limitations of the solutions it provides.

Of course, for those deploying generative AI into the world, the risks are similar, perhaps even more pronounced. While the intent behind such tools is usually to create something convincing and usable, if they mislead, trick, or even merely unsettle users, their value and worth evaporate. It's no surprise that legislation such as the EU AI Act, which requires deepfake creators to label content as AI-generated, is being passed to address these problems.

It's worth pointing out that this isn't just an issue for AI and robotics. Back in 2011, our colleague Martin Fowler wrote about how certain approaches to building cross-platform mobile applications can create an uncanny valley, where things work mostly like native controls but there are just enough tiny differences to throw users off.

Specifically, Fowler wrote something we think is instructive: "different platforms have different ways they expect you to use them that alter the entire experience design." The point here, applied to generative AI, is that different contexts and different use cases come with different sets of assumptions and mental models that change the point at which users might drop into the uncanny valley. These subtle differences change one's experience or perception of a large language model's (LLM) output.

For example, for the drug researcher who wants vast amounts of synthetic data, accuracy at a micro level may be unimportant; for the lawyer trying to grasp legal documentation, accuracy matters a lot. In fact, dropping into the uncanny valley might just be the signal to step back and reassess your expectations.

Shifting our perspective

The uncanny valley of generative AI might be troubling, even something we want to minimize, but it should also remind us of generative AI's limitations, and it should encourage us to rethink our perspective.

There have been some interesting attempts to do that across the industry. One that stands out is Ethan Mollick, a professor at the University of Pennsylvania, who argues that AI shouldn't be understood as good software but instead as "pretty good people."

Our expectations about what generative AI can do and where it's effective must therefore remain provisional and flexible. To a certain extent, this might be one way of overcoming the uncanny valley: by reflecting on our assumptions and expectations, we remove the technology's power to disturb or confound them.

However, simply calling for a mindset shift isn't enough. There are various practices and tools that can help. One example is a technique we identified in the latest Technology Radar: getting structured outputs from LLMs. This can be done either by instructing a model to respond in a particular format when prompting or through fine-tuning. Thanks to tools like Instructor, it is getting easier to do, and it creates greater alignment between expectations and what the LLM will output. While there's still a chance something unexpected or not quite right might slip through, this technique goes some way to addressing that; a brief sketch of what it can look like in practice follows.
There are other techniques too, including retrieval-augmented generation as a way of better controlling the context window. There are also frameworks and tools that can help evaluate and measure the success of such techniques, including Ragas and DeepEval, libraries that provide AI developers with metrics for faithfulness and relevance; a short evaluation sketch follows below.

Measurement is important, as are relevant guidelines and policies for LLMs, such as LLM guardrails. It's also important to take steps to better understand what's actually happening inside these models. Completely unpacking these black boxes might be impossible, but tools like Langfuse can help. Doing so may go a long way toward reorienting our relationship with this technology, shifting mental models, and removing the possibility of falling into the uncanny valley.
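As an illustration of what measuring faithfulness and relevance can look like, the sketch below scores a single, invented question/answer/context triple from a hypothetical RAG pipeline using Ragas. The data is made up for illustration, the exact API varies between Ragas versions, and the metrics rely on a judge LLM behind the scenes (an OpenAI key by default), so treat this as a sketch rather than a drop-in evaluation harness.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# A single, invented example of what a RAG pipeline might have produced:
# the user's question, the generated answer, and the retrieved context.
data = {
    "question": ["What does the EU AI Act require for deepfakes?"],
    "answer": ["It requires AI-generated deepfake content to be labelled as such."],
    "contexts": [[
        "The EU AI Act introduces transparency obligations, including labelling "
        "requirements for AI-generated or manipulated content such as deepfakes."
    ]],
}

# Score faithfulness (is the answer grounded in the retrieved context?)
# and answer relevancy (does the answer actually address the question?).
results = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(results)
```

Tracking scores like these over time is one way to notice when changes to prompts, retrieval, or models quietly shift a system's behaviour away from users' expectations.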
An opportunity, not a flaw

These tools, part of a Cambrian explosion of generative AI tooling, can help practitioners rethink generative AI and, hopefully, build better and more responsible products. For the wider world, however, this work will remain invisible. What's important is exploring how we can evolve toolchains to better control and understand generative AI. Existing mental models and conceptions of generative AI are a fundamental design problem, not a marginal issue we can choose to ignore.

Ken Mugrage is the principal technologist in the office of the CTO at Thoughtworks. Srinivasan Raguraman is a technical principal at Thoughtworks, based in Singapore.

This content was produced by Thoughtworks. It was not written by MIT Technology Review's editorial staff.