Can I use LLMs to translate or localize conversational AI experiences?
uxdesign.cc
How failing to engage a translator will negatively affect your users' experience, adversely impacting your ROI and CSAT.

A lot of companies want to expand their conversational AI experiences into multiple languages, and they want to do it fast. Thus, in this early era of LLMs, many companies are looking to use LLMs to accelerate the translation process. This is a natural assumption given the sheer language-generation capabilities of LLMs.

In the last 3 years I've advised 50+ brands on designing and optimizing their conversational experiences. Given that I'm multilingual and also have a PhD in linguistics, translation strategy is something I'm frequently asked about. I have seen various approaches to localization and translation.

Keep reading to learn why you shouldn't rely only on LLMs for translations, and to discover a best practice that lets you still use LLMs without compromising on quality.

What does translating a bot entail?

Translating a bot may seem like a relatively simple task. Look at the image below. At first glance, it seems that what needs translation is simply all the words you see in the image.

But take a closer look and you'll see a lot more that needs translation or localization:

1. The bot's name

The bot's name, Buddy, is a casual term for "friend". A good translation will capture the essence of the name in the other language.

2. Local/regional expressions

In the example above, the bot greets with a "Howdy" as opposed to a "Hello", and it uses a specific idiom, "be fixin' to do something". You will need to capture the persona and tone expressed by these word choices in the other language.

3. Buttons

Those options or buttons you see ("Buy me some boots", "Get help with my order", etc.) often have a character limit depending on the interface you use. For example, Facebook buttons are limited to 20 characters, and Viber buttons to 25. This means you'll need to translate them within the character limit so they render well.
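A character-limit check like this is easy to automate before a translated bot ships. The following is a minimal sketch, assuming the two limits quoted above; the function name and the Spanish labels are hypothetical illustrations, not taken from any platform's SDK, and real platforms may count characters differently from Python's `len()`.

```python
# Limits quoted in the article; verify them against each platform's current docs.
BUTTON_CHAR_LIMITS = {"facebook": 20, "viber": 25}


def find_overlong_labels(labels, channel):
    """Return (label, length) pairs that exceed the channel's button limit.

    Note: len() counts Unicode code points, which may differ from how a
    given platform counts characters, especially for non-Latin scripts.
    """
    limit = BUTTON_CHAR_LIMITS[channel]
    return [(label, len(label)) for label in labels if len(label) > limit]


# Hypothetical Spanish translations of two button labels (illustration only).
spanish_labels = ["Comprar botas", "Obtener ayuda con mi pedido"]

for label, length in find_overlong_labels(spanish_labels, "facebook"):
    print(f"Too long for Facebook ({length} chars): {label}")
```

A translator can then shorten the flagged labels while keeping the bot's tone, which is a judgment call the script cannot make.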
In general, different user interfaces have different constraints. Look for resources online to learn more about the constraints, and test them yourself.

4. Regex patterns and NLU setup

Taking it one step further, what happens when a user types something into the bot? To capture the different things a user might say, you'll need to add regex patterns and set up all your NLU (intents, entities/slots, and training phrases) to make sure you understand the user correctly.

Regex is very helpful in cases where the bot is looking for defined patterns such as email addresses, telephone numbers, flight numbers, etc. If the order number for a specific company has a regular pattern of 2 letters plus 6 digits (AB123456, XY987654, etc.), you can add the rule \b[a-zA-Z]{2}[0-9]{6}\b to capture an order number in a user's utterance such as "Can you check on RF345678?". This will help the bot understand the user better. While numbers are universal, how they are written differs in non-Latin scripts and in right-to-left writing systems. Regex can also be really helpful for setting up affirmations or negations: in addition to accepting "yes" in response to a question, you can add "sure", "yep", and more affirmation options using regex to make sure the bot understands the user. You will need to set up regex patterns for the language the bot speaks, and a translator can help you do that.

Setting up a strong NLU with intents, entities/slots, and training phrases helps the bot understand the user correctly. For the intent "track my order", you will need to think of all the ways a user might ask to track an order: "Where's my sandwich?", "When will my order arrive?", "Help track my food". All of these utterances can be used as training phrases to create the intent. In the utterance "Where's my sandwich?" specifically, depending on what the item is, the word "sandwich" could be replaced by "pizza" or "soup". You can create a category called "menu items" and add each of those as an entity so the NLU will capture "Where's my [menu item]?" instead of specifically "Where's my sandwich?".

Other than using regex and setting up a strong NLU, you'll also need to account for standard commands like "back", "repeat", etc. All of this needs to be handled in the language you are translating the bot into.

A quick note on voice bots before we go any further. If you're working on translating a voice bot, you'll need to do all of the above, plus make informed decisions about the voice the bot speaks with and the speech recognition system used to understand what the user is saying.

5. Linguistic and cultural differences

It goes without saying that there are numerous linguistic and cultural differences you need to be not just aware of but knowledgeable about to get the translations and localizations just right. Let me share two examples. The first example is illustrated below. In Tamil, there is an inclusive "we" and an exclusive "we". In English, if I said "We are going to the movies", the listener would not know from the words alone whether "we" means "you and I" or "someone else and I". They would need to figure that out from context.

In Tamil, if I said "We-inclusive (namba) are going to the movies", you would know that I mean "you and I", and if I said "We-exclusive (naangu) are going to the movies", it would be clear I mean "someone else and I". The second example: in American English, if you want to tell someone that you're on your way, you might say "I'm coming". In Spanish you would say "I'm going" (Voy).

And it doesn't end here. Once the translation is complete, you will need to carry out testing before launch, and after launch the bot needs to be maintained and iterated on at a regular cadence.
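Part of that pre-launch testing can be automated. Below is a minimal sketch, assuming the order-number rule from section 4 and per-language affirmation lists; the word lists, function names, and sample utterances are hypothetical, and a translator would supply the real ones.

```python
import re

# Order-number rule from section 4: 2 letters followed by 6 digits.
# [0-9] is used instead of \d because, for Python 3 str patterns,
# \d also matches non-ASCII digits such as Arabic-Indic numerals.
ORDER_NUMBER = re.compile(r"\b[a-zA-Z]{2}[0-9]{6}\b")

# Hypothetical affirmation lists (illustration only).
AFFIRMATIONS = {
    "en": ["yes", "yep", "sure"],
    "es": ["sí", "claro", "vale"],
}


def find_order_number(utterance):
    """Return the first order number found in the utterance, or None."""
    match = ORDER_NUMBER.search(utterance)
    return match.group(0) if match else None


def is_affirmation(utterance, lang):
    """Return True if the utterance contains an affirmation word for lang."""
    words = "|".join(re.escape(word) for word in AFFIRMATIONS[lang])
    pattern = re.compile(rf"\b(?:{words})\b", re.IGNORECASE)
    return pattern.search(utterance) is not None


print(find_order_number("Can you check on RF345678?"))  # RF345678
print(find_order_number("RF٣٤٥٦٧٨"))  # None: digits written in another script
print(is_affirmation("Sí, claro", "es"))  # True
print(is_affirmation("yep", "es"))  # False: English word, Spanish list
```

The second call shows why patterns cannot simply be copied across languages: the same order number typed with Arabic-Indic digits is missed by the ASCII-only rule. Word lists need the same care. In Spanish, for instance, unaccented "si" means "if", so adding it as a shortcut for "sí" would cause false positives.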
It is impossible to build and maintain a bot, or to analyze and assess its quality in the target language, without having a native or fluent speaker review the conversation transcripts.

Need more reasons not to rely on LLMs or machine translations?

What does the research say?

Several studies have shown that machine translations are not reliable; I'm linking one, two, and three here. While machines may be fast and could be more easily trained to be consistent in their lexical choices, they struggle to handle context and ambiguity. LLMs are also not very accurate in their translations; one study determined that accuracy varies greatly by language (Spanish 94%, Tagalog 90%, Korean 82.5%, Chinese 81.7%, Farsi 67.5%, Armenian 55%).

Look at the following examples of machine-only translations, taken from these studies:

(1) English to Armenian
English: You can take over the counter ibuprofen as needed for pain.
Translated: You may take anti-tank missile as much as you need for pain.

(2) English to Chinese
English: Your Coumadin level was too high today. Do not take any more Coumadin until your doctor reviews the results.
Translated: Your soybean level was too high today. Do not take anymore soybean until your doctor reviews the results.

(3) Arabic to English
Arabic:
Translated: what is the appropriate rate of success

Example (1) is such a bad translation that the user will probably realize it's not translated right, but example (2) could pass for a correct translation and might endanger the user's life, since they wouldn't stop taking Coumadin. In example (3), the issue is that the English translation is not a natural way of phrasing the statement in English, so it wouldn't be clear to the user what was actually meant.

Look at the following image. This bot, powered by GPT-3, is built in India, a multilingual country with 22 officially recognized languages.
When asked in multiple ways to speak Hindi, the bot doesn't switch to Hindi and even gaslights the user by saying "I'll continue to respond in English as requested." At the end, when it is asked a question in Hindi ("Do you serve food on the flight?"), it responds in English ("you can indulge in some munchies while you soar through the skies with us!"), making it clear that it does understand Hindi but refuses to speak it. There was obviously a business decision made about whether to allow this bot to speak Hindi, and perhaps it doesn't speak it for legal reasons. Personally, I think it would be more helpful if the bot let the user know where to get help in Hindi.

My point is that if you decide to use an LLM, you will need to tell it how to handle situations like these, in which a user asks the bot to speak another language. There's no getting out of it: you'll need a well-thought-out strategy. If you let your LLM speak multiple languages, you'll need to make sure it speaks those languages right.

Given all the above, don't compromise on a great customer experience by using just LLMs for translation. Maybe in time there could be more complex prompts or a reenvisioned UI that enables us to better translate all these nuances, but even that work needs to be completed by a human team and cannot be handed over to an AI team. There are no shortcuts to a great customer experience.

Best practice: How to get it right

The ideal practice for translating an existing bot into another language is to hire an entire team (conversation designer, bot tuner, devs, testers, conversation analysts) in the target language, that is, the language into which you are translating the bot.

And if the ideal practice is beyond reach (let's be honest, most often the reality is that companies can't afford the luxury of having a full team in the target language), here's the best workaround. Hire an expert translator; educate them on who the target users are, on conversation design, and on bot tuning.
Also educate and involve them in internal testing and user testing, and have them help you with post-launch analysis, iteration, and maintenance.

If you're still unsure about when to use an LLM, that's ok. Ask the translator. Even if they have no experience with LLMs or don't understand them, once they try one they will know how to assess the resulting translation. Most likely they will use it to create a first-draft translation and then polish the draft into a final version with knowledge only they possess. If you're building a Gen AI bot, the conversation designer and translator will need to work closely on the prompt design. Together they can determine whether to write the prompt in the target language, or keep it in the original language and add extra details about the target language so that the bot has rules for the translation. Basically, with the current technology, you cannot get good-quality translations from an LLM without human involvement.

But if this also seems like too much investment and you cannot do it right, then don't do it. If you don't do a thorough job, you aren't respecting your customers, nor are you being inclusive, which will most likely result in a net negative CX. Instead, revisit the reason you wanted to make the bot available in another language in the first place. Assess: Is there really a need to provide this experience in multiple languages? Will enough people use the experience to make it worth it? Will making the bot available in multiple languages improve the user experience significantly? Do you truly have the resources to make it worthwhile for your company and your customers? It is not worthwhile to treat translation as a box-checking activity.

Alternative solutions to handling other languages

Don't have the bandwidth or time to do the above, but have people coming to your bot expecting it to work in another language?
An alternative to translating the entire bot is to let the customers know where and how they can get help. So if you have many customers who speak only Spanish, it's ok to add a button in the bot so these customers can select "español" and connect directly to a Spanish-speaking agent, as shown in the image below.

Alternatively, if you do not have agents on that platform who can help, let the customer know where they can get the help they need in the language the customer uses. An example of how an LLM-powered bot could handle this elegantly is illustrated in the image below; all you need to do is make sure you design your prompt to handle multiple languages.

If you're building conversational experiences in multiple languages, go ahead and use LLMs, but don't skip engaging an expert translator to get it right. Remember, a mediocre customer experience can really impact a brand's reputation. It's worth the ROI, and it is ethical, inclusive, and humane to respect your customers and do translations right.

Many thanks to Meredith Schulz and Cathy Pearl for their valuable advice on drafts of this article. I originally presented a version of this article as a talk at the Unparsed 2024 conference.

"Can I use LLMs to translate or localize conversational AI experiences?" was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story.