9TO5MAC.COM
Apple details on-device Apple Intelligence training system using user data
Last month, Apple delayed the rollout of its more personal and powerful Siri features. As it looks to right the ship for future Apple Intelligence updates, Bloomberg highlights a shift that Apple is making in how it trains its artificial intelligence models.

The report highlights a blog post from Apple’s Machine Learning Research website, explaining how Apple generally uses synthetic data to train its AI models. There are limitations to this strategy, however, including the fact that it’s hard for synthetic data to “understand trends” in features like summarization or writing tools that operate on longer sentences or entire email messages.

To address this limitation, Apple highlights a new technology it will soon start using that compares the synthetic data to a small sample of recent user emails, but without compromising user privacy:

To improve our models we need to generate a set of many emails that cover topics that are most common in messages. To curate a representative set of synthetic emails, we start by creating a large set of synthetic messages on a variety of topics. For example, we might create a synthetic message, “Would you like to play tennis tomorrow at 11:30AM?”

This is done without any knowledge of individual user emails. We then derive a representation, called an embedding, of each synthetic message that captures some of the key dimensions of the message like language, topic, and length. These embeddings are then sent to a small number of user devices that have opted in to Device Analytics.

Participating devices then select a small sample of recent user emails and compute their embeddings. Each device then decides which of the synthetic embeddings is closest to these samples. Using differential privacy, Apple can then learn the most-frequently selected synthetic embeddings across all devices, without learning which synthetic embedding was selected on any given device.
These most-frequently selected synthetic embeddings can then be used to generate training or testing data, or we can run additional curation steps to further refine the dataset. For example, if the message about playing tennis is one of the top embeddings, a similar message replacing “tennis” with “soccer” or another sport could be generated and added to the set for the next round of curation (see Figure 1). This process allows us to improve the topics and language of our synthetic emails, which helps us train our models to create better text outputs in features like email summaries, while protecting privacy.

Apple explains that these techniques allow it to “understand overall trends, without learning information about any individual.”

Bloomberg says that Apple will roll out this new system in a future beta of iOS 18.5 and macOS 15.5. You can read Apple’s full blog post for more details.
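
For readers who want a concrete picture, the selection-and-aggregation step Apple describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Apple's implementation: the three-dimensional embeddings are made up, each simulated device reports on a single email, and the privacy mechanism is plain k-ary randomized response, a textbook local differential privacy technique swapped in here because the article does not say which mechanism Apple uses. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_synthetic(email_embedding, synthetic_embeddings):
    """On-device step: index of the synthetic embedding closest to one email."""
    return max(
        range(len(synthetic_embeddings)),
        key=lambda i: cosine(email_embedding, synthetic_embeddings[i]),
    )


def randomized_response(true_index, k, epsilon, rng):
    """Assumed mechanism: report the true index with probability p,
    otherwise a uniformly random different index."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return true_index
    other = rng.randrange(k - 1)
    return other if other < true_index else other + 1


def estimate_counts(reports, k, epsilon):
    """Server step: debias the noisy reports into estimated per-index counts."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1 - p) / (k - 1)
    observed = Counter(reports)
    return [(observed.get(i, 0) - n * q) / (p - q) for i in range(k)]


def simulate(n_devices=2000, epsilon=1.0, seed=0):
    """Toy end-to-end run with made-up embeddings and a 70/20/10 topic split."""
    rng = random.Random(seed)
    synthetic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    weights = [0.7, 0.2, 0.1]
    reports = []
    for _ in range(n_devices):
        # Each simulated device holds one email that resembles some topic.
        topic = rng.choices(range(len(synthetic)), weights=weights)[0]
        email = [v + rng.gauss(0, 0.1) for v in synthetic[topic]]
        choice = nearest_synthetic(email, synthetic)
        reports.append(randomized_response(choice, len(synthetic), epsilon, rng))
    return estimate_counts(reports, len(synthetic), epsilon)
```

Calling `simulate()` returns one estimated count per synthetic message. With the made-up 70/20/10 split, the first message should come out on top even though no single device's report can be taken at face value. A smaller `epsilon` adds more noise per device, so more devices are needed for the same accuracy.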