• ByteDance Researchers Introduce DetailFlow: A 1D Coarse-to-Fine Autoregressive Framework for Faster, Token-Efficient Image Generation

    Autoregressive image generation has been shaped by advances in sequential modeling, originally seen in natural language processing. This field focuses on generating images one token at a time, similar to how sentences are constructed in language models. The appeal of this approach lies in its ability to maintain structural coherence across the image while allowing for high levels of control during the generation process. As researchers began to apply these techniques to visual data, they found that structured prediction not only preserved spatial integrity but also supported tasks like image manipulation and multimodal translation effectively.
    Despite these benefits, generating high-resolution images remains computationally expensive and slow. A primary issue is the number of tokens needed to represent complex visuals. Raster-scan methods that flatten 2D images into linear sequences require thousands of tokens for detailed images, resulting in long inference times and high memory consumption. Models like Infinity need over 10,000 tokens for a 1024×1024 image. This becomes unsustainable for real-time applications or when scaling to more extensive datasets. Reducing the token burden while preserving or improving output quality has become a pressing challenge.

    Efforts to mitigate token inflation have led to innovations like next-scale prediction, seen in VAR and FlexVAR. These models create images by predicting progressively finer scales, which imitates the human tendency to sketch rough outlines before adding detail. However, they still rely on hundreds of tokens: 680 in the case of VAR and FlexVAR for 256×256 images. Separately, approaches like TiTok and FlexTok use 1D tokenization to compress spatial redundancy, but they often fail to scale efficiently. For example, FlexTok’s gFID (lower is better) rises from 1.9 at 32 tokens to 2.5 at 256 tokens, so output quality degrades as the token count grows.
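The token counts quoted above can be reproduced with simple arithmetic. The sketch below is illustrative only: the downsampling factor of 16 and the ten-step scale schedule are assumptions (the schedule is the one commonly cited for VAR at 256×256), not figures taken from this article.

```python
def raster_tokens(height, width, downsample):
    """Tokens for a 2D grid tokenizer: one token per downsample x downsample patch."""
    return (height // downsample) * (width // downsample)


def multiscale_tokens(side_lengths):
    """Tokens for next-scale prediction: every scale contributes a full s x s token map."""
    return sum(s * s for s in side_lengths)


# Assumed scale schedule (side length of the token map at each step).
ASSUMED_VAR_SCALES_256 = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)

print(raster_tokens(256, 256, 16))                # 256 tokens for a single 16x16 grid
print(multiscale_tokens(ASSUMED_VAR_SCALES_256))  # 680 tokens summed over all scales
print(raster_tokens(1024, 1024, 16))              # 4096 tokens; grows with the square of resolution
```

Under these assumptions the multi-scale total matches the 680 tokens cited for VAR and FlexVAR, and the last line shows why a single-grid raster scan becomes expensive at high resolution.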
    Researchers from ByteDance introduced DetailFlow, a 1D autoregressive image generation framework. This method arranges token sequences from global to fine detail using a process called next-detail prediction. Unlike traditional 2D raster-scan or scale-based techniques, DetailFlow employs a 1D tokenizer trained on progressively degraded images. This design allows the model to prioritize foundational image structures before refining visual details. By mapping tokens directly to resolution levels, DetailFlow significantly reduces token requirements, enabling images to be generated in a semantically ordered, coarse-to-fine manner.
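To make the idea of mapping token count to resolution concrete, here is a minimal sketch. The article does not give the paper's actual mapping, so the square-root rule, the function name `target_resolution`, and all default values below are hypothetical. The only property it is meant to show is that more tokens correspond to a higher target resolution.

```python
import math


def target_resolution(num_tokens, max_tokens=128, max_res=256, min_res=16):
    """Hypothetical mapping from a token-prefix length to a target image side length.

    Assumes pixel count grows roughly in proportion to token count,
    so the side length scales with sqrt(num_tokens / max_tokens).
    """
    if not 1 <= num_tokens <= max_tokens:
        raise ValueError("num_tokens must be between 1 and max_tokens")
    side = max_res * math.sqrt(num_tokens / max_tokens)
    return max(min_res, int(round(side)))


for n in (8, 32, 128):
    print(n, target_resolution(n))
```

During tokenizer training, a prefix of `n` tokens would then be decoded against the image downsampled to `target_resolution(n)`, which encourages early tokens to carry global structure.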

    The mechanism in DetailFlow centers on a 1D latent space where each token contributes incrementally more detail. Earlier tokens encode global features, while later tokens refine specific visual aspects. To train this, the researchers created a resolution mapping function that links token count to target resolution. During training, the model is exposed to images of varying quality levels and learns to predict progressively higher-resolution outputs as more tokens are introduced. It also implements parallel token prediction by grouping sequences and predicting entire sets at once. Since parallel prediction can introduce sampling errors, a self-correction mechanism was integrated. This system perturbs certain tokens during training and teaches subsequent tokens to compensate, ensuring that final images maintain structural and visual integrity.
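The grouping and self-correction steps described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the serial prefix length, the group size, and the uniform random replacement used as the perturbation are all assumptions.

```python
import random


def group_schedule(num_tokens, serial_prefix, group_size):
    """Return the number of tokens predicted in each forward pass.

    The first `serial_prefix` tokens are predicted one at a time;
    the rest are predicted in parallel groups of `group_size`.
    """
    sizes = [1] * min(serial_prefix, num_tokens)
    remaining = num_tokens - len(sizes)
    while remaining > 0:
        step = min(group_size, remaining)
        sizes.append(step)
        remaining -= step
    return sizes


def perturb_group(tokens, start, end, codebook_size, rate, rng):
    """Resample a random subset of tokens in [start, end) to imitate sampling errors.

    Returns the perturbed copy and the positions that were resampled.
    Later tokens would then be trained to compensate for these errors.
    """
    noisy = list(tokens)
    resampled = []
    for i in range(start, end):
        if rng.random() < rate:
            noisy[i] = rng.randrange(codebook_size)
            resampled.append(i)
    return noisy, resampled


sizes = group_schedule(num_tokens=128, serial_prefix=8, group_size=8)
print(len(sizes), "forward passes for", sum(sizes), "tokens")

noisy, resampled = perturb_group(list(range(16)), 4, 8, 1000, 0.5, random.Random(0))
print(resampled)
```

With these assumed settings, 128 tokens need 23 forward passes instead of 128, which is where the speedup from parallel prediction comes from.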
    On the ImageNet 256×256 benchmark, DetailFlow achieved a gFID of 2.96 using only 128 tokens, outperforming VAR at 3.3 and FlexVAR at 3.05, both of which used 680 tokens. The DetailFlow-64 variant reached a gFID of 2.62 using 512 tokens. In terms of speed, DetailFlow delivered nearly double the inference rate of VAR and FlexVAR. An ablation study found that self-correction training and semantic ordering of tokens each improved output quality. For example, enabling self-correction lowered the gFID from 4.11 to 3.68 in one setting. Together, these results indicate both higher quality and faster generation than the compared baselines.
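As a quick check on the token savings implied by these figures, the following sketch uses only the numbers reported above. The dictionary layout and function name are illustrative.

```python
# (tokens, gFID) as reported in the article for ImageNet 256x256.
REPORTED = {
    "DetailFlow (128 tokens)": (128, 2.96),
    "DetailFlow-64": (512, 2.62),
    "VAR": (680, 3.3),
    "FlexVAR": (680, 3.05),
}


def token_reduction(model, baseline):
    """Fraction of the baseline's tokens that the model avoids using."""
    return 1 - REPORTED[model][0] / REPORTED[baseline][0]


for name in ("DetailFlow (128 tokens)", "DetailFlow-64"):
    saved = token_reduction(name, "VAR")
    print(f"{name}: {saved:.1%} fewer tokens than VAR")
```

The 128-token configuration uses roughly one fifth of VAR's token budget while reporting a lower gFID.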

    By focusing on semantic structure and reducing redundancy, DetailFlow presents a viable solution to long-standing issues in autoregressive image generation. The method’s coarse-to-fine approach, efficient parallel decoding, and ability to self-correct highlight how architectural innovations can address performance and scalability limitations. Through their structured use of 1D tokens, the researchers from ByteDance have demonstrated a model that maintains high image fidelity while significantly reducing computational load, making it a valuable addition to image synthesis research.

    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.
    Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.
    WWW.MARKTECHPOST.COM
  • Manus has kick-started an AI agent boom in China

    Last year, China saw a boom in foundation models, the do-everything large language models that underpin the AI revolution. This year, the focus has shifted to AI agents—systems that are less about responding to users’ queries and more about autonomously accomplishing things for them. 

    There are now a host of Chinese startups building these general-purpose digital tools, which can answer emails, browse the internet to plan vacations, and even design an interactive website. Many of these have emerged in just the last two months, following in the footsteps of Manus—a general AI agent that sparked weeks of social media frenzy for invite codes after its limited-release launch in early March. 

    These emerging AI agents aren’t large language models themselves. Instead, they’re built on top of them, using a workflow-based structure designed to get things done. A lot of these systems also introduce a different way of interacting with AI. Rather than just chatting back and forth with users, they are optimized for managing and executing multistep tasks—booking flights, managing schedules, conducting research—by using external tools and remembering instructions. 
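The workflow-based structure described here can be pictured as a loop that walks through a plan, calls an external tool for each step, and records what happened. The sketch below is a toy illustration: the `Agent` class, the stub tools, and the hard-coded plan are all hypothetical, and in a real agent a language model would produce the plan and choose the tools.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    tools: Dict[str, Callable[[str], str]]
    memory: List[str] = field(default_factory=list)

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        """Execute each (tool_name, argument) step in order and remember the outcome."""
        results = []
        for tool_name, argument in plan:
            tool = self.tools.get(tool_name)
            if tool is None:
                outcome = f"no tool named {tool_name!r}"
            else:
                outcome = tool(argument)
            # Memory keeps a running log that later steps or tasks could consult.
            self.memory.append(f"{tool_name}({argument}) -> {outcome}")
            results.append(outcome)
        return results


# Stub tools standing in for real integrations such as search or booking APIs.
agent = Agent(tools={
    "search": lambda query: f"results for {query}",
    "book": lambda item: f"booked {item}",
})
print(agent.run([("search", "flights to Tokyo"), ("book", "cheapest flight")]))
print(agent.memory)
```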

    China could take the lead on building these kinds of agents. The country’s tightly integrated app ecosystems, rapid product cycles, and digitally fluent user base could provide a favorable environment for embedding AI into daily life. 

    For now, its leading AI agent startups are focusing their attention on the global market, because the best Western models don’t operate inside China’s firewalls. But that could change soon: Tech giants like ByteDance and Tencent are preparing their own AI agents that could bake automation directly into their native super-apps, pulling data from their vast ecosystem of programs that dominate many aspects of daily life in the country. 

    As the race to define what a useful AI agent looks like unfolds, a mix of ambitious startups and entrenched tech giants are now testing how these tools might actually work in practice—and for whom.

    Set the standard

    It’s been a whirlwind few months for Manus, which was developed by the Wuhan-based startup Butterfly Effect. The company raised million in a funding round led by the US venture capital firm Benchmark, took the product on an ambitious global roadshow, and hired dozens of new employees. 

    Even before registration opened to the public in May, Manus had become a reference point for what a broad, consumer‑oriented AI agent should accomplish. Rather than handling narrow chores for businesses, this “general” agent is designed to be able to help with everyday tasks like trip planning, stock comparison, or your kid’s school project. 

    Unlike previous AI agents, Manus uses a browser-based sandbox that lets users supervise the agent like an intern, watching in real time as it scrolls through web pages, reads articles, or codes actions. It also proactively asks clarifying questions and supports long-term memory that serves as context for future tasks.

    “Manus represents a promising product experience for AI agents,” says Ang Li, cofounder and CEO of Simular, a startup based in Palo Alto, California, that’s building computer use agents, AI agents that control a virtual computer. “I believe Chinese startups have a huge advantage when it comes to designing consumer products, thanks to cutthroat domestic competition that leads to fast execution and greater attention to product details.”

    In the case of Manus, the competition is moving fast. Two of the buzziest follow‑ups, Genspark and Flowith, are already boasting benchmark scores that match or edge past Manus’s. 

    Genspark, led by former Baidu executives Eric Jing and Kay Zhu, links many small “super agents” through what it calls multi‑component prompting. The agent can switch among several large language models, accepts both images and text, and carries out tasks from making slide decks to placing phone calls. Whereas Manus relies heavily on Browser Use, a popular open-source product that lets agents operate a web browser in a virtual window like a human, Genspark directly integrates with a wide array of tools and APIs. Launched in April, the company says that it already has over 5 million users and over million in yearly revenue.

    Flowith, the work of a young team that first grabbed public attention in April 2025 at a developer event hosted by the popular social media app Xiaohongshu, takes a different tack. Marketed as an “infinite agent,” it opens on a blank canvas where each question becomes a node on a branching map. Users can backtrack, take new branches, and store results in personal or sharable “knowledge gardens,” a design that feels more like project management software than a typical chat interface. Every inquiry or task builds its own mind-map-like graph, encouraging a more nonlinear and creative interaction with AI. Flowith’s core agent, NEO, runs in the cloud and can perform scheduled tasks like sending emails and compiling files. The founders want the app to be a “knowledge marketbase” and aim to tap into the social aspect of AI, with the aspiration of becoming “the OnlyFans of AI knowledge creators.”

    What they also share with Manus is the global ambition. Both Genspark and Flowith have stated that their primary focus is the international market.

    A global address

    Startups like Manus, Genspark, and Flowith—though founded by Chinese entrepreneurs—could blend seamlessly into the global tech scene and compete effectively abroad. Founders, investors, and analysts that MIT Technology Review has spoken to believe Chinese companies are moving fast, executing well, and quickly coming up with new products. 

    Money reinforces the pull to launch overseas. Customers there pay more, and there are plenty to go around. “You can price in USD, and with the exchange rate that’s a sevenfold multiplier,” Manus cofounder Xiao Hong quipped on a podcast. “Even if we’re only operating at 10% power because of cultural differences overseas, we’ll still make more than in China.”

    But creating the same functionality in China is a challenge. Major US AI companies including OpenAI and Anthropic have opted out of mainland China because of geopolitical risks and challenges with regulatory compliance. Their absence initially created a black market as users resorted to VPNs and third-party mirrors to access tools like ChatGPT and Claude. That vacuum has since been filled by a new wave of Chinese chatbots—DeepSeek, Doubao, Kimi—but the appetite for foreign models hasn’t gone away. 

    Manus, for example, uses Anthropic’s Claude Sonnet—widely considered the top model for agentic tasks. Manus cofounder Zhang Tao has repeatedly praised Claude’s ability to juggle tools, remember contexts, and hold multi‑round conversations—all crucial for turning chatty software into an effective executive assistant.

    But the company’s use of Sonnet has made its agent functionally unusable inside China without a VPN. If you open Manus from a mainland IP address, you’ll see a notice explaining that the team is “working on integrating Qwen’s model,” a special local version that is built on top of Alibaba’s open-source model. 

    An engineer overseeing ByteDance’s work on developing an agent, who spoke to MIT Technology Review anonymously to avoid sanction, said that the absence of Claude Sonnet models “limits everything we do in China.” DeepSeek’s open models, he added, still hallucinate too often and lack training on real‑world workflows. Developers we spoke with rank Alibaba’s Qwen series as the best domestic alternative, yet most say that switching to Qwen knocks performance down a notch.

    Jiaxin Pei, a postdoctoral researcher at Stanford’s Institute for Human‑Centered AI, thinks that gap will close: “Building agentic capabilities in base LLMs has become a key focus for many LLM builders, and once people realize the value of this, it will only be a matter of time.”

    For now, Manus is doubling down on audiences it can already serve. In a written response, the company said its “primary focus is overseas expansion,” noting that new offices in San Francisco, Singapore, and Tokyo have opened in the past month.

    A super‑app approach

    Although the concept of AI agents is still relatively new, the consumer-facing AI app market in China is already crowded with major tech players. DeepSeek remains the most widely used, while ByteDance’s Doubao and Moonshot’s Kimi have also become household names. However, most of these apps are still optimized for chat and entertainment rather than task execution. This gap in the local market has pushed China’s big tech firms to roll out their own user-facing agents, though early versions remain uneven in quality and rough around the edges. 

    ByteDance is testing Coze Space, an AI agent based on its own Doubao model family that lets users toggle between “plan” and “execute” modes, so they can either directly guide the agent’s actions or step back and watch it work autonomously. It connects up to 14 popular apps, including GitHub, Notion, and the company’s own Lark office suite. Early reviews say the tool can feel clunky and has a high failure rate, but it clearly aims to match what Manus offers.

    Meanwhile, Zhipu AI has released a free agent called AutoGLM Rumination, built on its proprietary ChatGLM models. Shanghai‑based Minimax has launched Minimax Agent. Both products look almost identical to Manus and demo basic tasks such as building a simple website, planning a trip, making a small Flash game, or running quick data analysis.

    Despite the limited usability of most general AI agents launched within China, big companies have plans to change that. During a May 15 earnings call, Tencent president Liu Zhiping teased an agent that would weave automation directly into China’s most ubiquitous app, WeChat. 

    Considered the original super-app, WeChat already handles messaging, mobile payments, news, and millions of mini‑programs that act like embedded apps. These programs give Tencent, its developer, access to data from millions of services that pervade everyday life in China, an advantage most competitors can only envy.

    Historically, China’s consumer internet has splintered into competing walled gardens—share a Taobao link in WeChat and it resolves as plaintext, not a preview card. Unlike the more interoperable Western internet, China’s tech giants have long resisted integration with one another, choosing to wage platform war at the expense of a seamless user experience.

    But the use of mini‑programs has given WeChat unprecedented reach across services that once resisted interoperability, from gym bookings to grocery orders. An agent able to roam that ecosystem could bypass the integration headaches dogging independent startups.

    Alibaba, the e-commerce giant behind the Qwen model series, has been a front-runner in China’s AI race but has been slower to release consumer-facing products. Even though Qwen was the most downloaded open-source model on Hugging Face in 2024, it didn’t power a dedicated chatbot app until early 2025. In March, Alibaba rebranded its cloud storage and search app Quark into an all-in-one AI search tool. By June, Quark had introduced DeepResearch—a new mode that marks its most agent-like effort to date. 

    ByteDance and Alibaba did not reply to MIT Technology Review’s request for comments.

    “Historically, Chinese tech products tend to pursue the all-in-one, super-app approach, and the latest Chinese AI agents reflect just that,” says Li of Simular, who previously worked at Google DeepMind on AI-enabled work automation. “In contrast, AI agents in the US are more focused on serving specific verticals.”

    Pei, the researcher at Stanford, says that existing tech giants could have a huge advantage in bringing the vision of general AI agents to life—especially those with built-in integration across services. “The customer-facing AI agent market is still very early, with tons of problems like authentication and liability,” he says. “But companies that already operate across a wide range of services have a natural advantage in deploying agents at scale.”
    WWW.TECHNOLOGYREVIEW.COM
  • Enigmata’s Multi-Stage and Mix-Training Reinforcement Learning Recipe Drives Breakthrough Performance in LLM Puzzle Reasoning

    Large Reasoning Models (LRMs), trained from LLMs using reinforcement learning, have demonstrated strong performance in complex reasoning tasks, including mathematics, STEM, and coding. However, existing LRMs still struggle with many puzzle tasks that require purely logical reasoning, even though these tasks are easy and obvious for humans. Current work on puzzles focuses only on designing evaluation benchmarks and lacks the training methods and resources that modern LLMs need to tackle this challenge. Existing puzzle datasets also lack diversity and scalability, covering a limited range of puzzle types with little control over generation or difficulty. Moreover, given the success of the “LLM+RLVR” paradigm, it has become crucial to obtain large, diverse, and challenging sets of verifiable puzzle prompts for training agents.
    Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a key method for improving models’ reasoning capabilities, removing the need for reward models by directly assigning rewards based on objectively verifiable answers. Puzzles are particularly well-suited for RLVR. However, most prior RLVR research has overlooked the potential of puzzles for delivering effective reward signals. In LLM puzzle reasoning, existing benchmarks evaluate different types of reasoning, including abstract, deductive, and compositional reasoning. A few benchmarks support scalable generation and difficulty control but lack puzzle diversity. Moreover, efforts to improve LLMs’ puzzle-solving abilities mainly fall into two categories: tool integration and RLVR.
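    To make the idea of a verifiable reward concrete, here is a minimal sketch of a rule-based reward for a toy puzzle with more than one valid answer. The puzzle (pick numbers that sum to a target), the function names, and the binary 1.0/0.0 reward are illustrative assumptions, not details taken from the Enigmata paper.

```python
from collections import Counter


def verify_subset_sum(numbers, target, answer):
    """Rule-based check: the answer may only use available numbers and must hit the target."""
    available = Counter(numbers)
    used = Counter(answer)
    # A number cannot be used more often than it appears in the puzzle.
    if any(used[n] > available[n] for n in used):
        return False
    return len(answer) > 0 and sum(answer) == target


def reward(numbers, target, answer):
    """Binary verifiable reward (an assumed shape): 1.0 for a valid solution, else 0.0."""
    return 1.0 if verify_subset_sum(numbers, target, answer) else 0.0


numbers, target = [3, 5, 8, 13], 16
print(reward(numbers, target, [3, 13]))    # 1.0, a valid solution
print(reward(numbers, target, [3, 5, 8]))  # 1.0, a different valid solution
print(reward(numbers, target, [8, 8]))     # 0.0, uses 8 twice
print(reward(numbers, target, [3, 5]))     # 0.0, wrong sum
```

    Because the check is a rule rather than a string match against one reference answer, any valid solution earns the reward, and no learned reward model is needed.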
    Researchers from ByteDance Seed, Fudan University, Tsinghua University, Nanjing University, and Shanghai Jiao Tong University have proposed Enigmata, the first comprehensive toolkit designed to improve the puzzle reasoning skills of LLMs. It contains 36 tasks across seven categories, each featuring a generator that produces unlimited examples with controllable difficulty and a rule-based verifier for automatic evaluation. The researchers further developed Enigmata-Eval as a rigorous benchmark and created optimized multi-task RLVR strategies. When used to train larger models such as Seed1.5-Thinking, puzzle data from Enigmata also improves state-of-the-art performance on advanced math and STEM reasoning tasks such as AIME, BeyondAIME, and GPQA, which shows the generalization benefits of Enigmata.
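    The generator-plus-verifier pairing described above can be sketched with a toy Caesar-cipher task. Everything here is hypothetical and chosen only for illustration: the word list, the three difficulty names, the choice of "number of words" as the difficulty knob, and the function names are not Enigmata's actual interface.

```python
import random
import string

WORDS = ["puzzle", "logic", "graph", "search", "grid", "cipher", "token"]
# Hypothetical difficulty knob: how many words must be decrypted at each level.
WORDS_PER_LEVEL = {"easy": 1, "medium": 3, "hard": 5}


def shift_text(text, shift):
    """Shift lowercase letters by `shift` positions; leave other characters unchanged."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def generate_puzzle(difficulty, seed):
    """Generator: a seeded RNG yields an unlimited stream of reproducible instances."""
    rng = random.Random(seed)
    count = WORDS_PER_LEVEL[difficulty]
    plaintext = " ".join(rng.choice(WORDS) for _ in range(count))
    shift = rng.randint(1, 25)  # never 0, so the ciphertext always differs
    return {
        "difficulty": difficulty,
        "shift": shift,
        "ciphertext": shift_text(plaintext, shift),
    }


def verify(puzzle, response):
    """Verifier: recompute the plaintext from the puzzle itself and compare."""
    expected = shift_text(puzzle["ciphertext"], -puzzle["shift"])
    return response.strip().lower() == expected


puzzle = generate_puzzle("medium", seed=42)
solution = shift_text(puzzle["ciphertext"], -puzzle["shift"])
print(puzzle["ciphertext"], "->", solution, verify(puzzle, solution))
```

    Changing the seed produces a new instance at the same difficulty, and the verifier needs no stored answer key because it derives the expected output from the puzzle parameters.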

    Enigmata-Data comprises 36 puzzle tasks organized into seven primary categories: Crypto, Arithmetic, Logic, Grid, Graph, Search, and Sequential Puzzle. This makes it the only dataset to combine multiple task categories with scalability, automatic verification, and public availability. Data construction follows a three-phase pipeline: Tasks Collection and Design, Auto-Generator and Verifier Development, and Sliding Difficulty Control. Enigmata-Eval is built by systematically sampling from the broader dataset, aiming to extract 50 instances per difficulty level for each task. The final evaluation set contains 4,758 puzzle instances rather than the theoretical maximum of 5,400 because some tasks generate fewer instances per difficulty level.
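    The evaluation-set figures above can be checked with a little arithmetic. The sketch below infers the number of difficulty levels per task from the stated maximum (the article does not state it directly) and shows one assumed way that capped per-bucket sampling produces a set smaller than the maximum. The pool contents and the `sample_eval_set` helper are hypothetical.

```python
import random

NUM_TASKS = 36          # from the article
PER_LEVEL = 50          # target instances per difficulty level per task
THEORETICAL_MAX = 5400  # from the article
ACTUAL_SIZE = 4758      # from the article

# Inferred, not stated: 5,400 / (36 * 50) difficulty levels per task.
levels_per_task, remainder = divmod(THEORETICAL_MAX, NUM_TASKS * PER_LEVEL)
shortfall = THEORETICAL_MAX - ACTUAL_SIZE


def sample_eval_set(pool, per_level=PER_LEVEL, seed=0):
    """Draw up to `per_level` instances from each (task, level) bucket."""
    rng = random.Random(seed)
    selected = []
    for key in sorted(pool):
        bucket = pool[key]
        k = min(per_level, len(bucket))  # small buckets contribute fewer items
        selected.extend((key, item) for item in rng.sample(bucket, k))
    return selected


# Hypothetical pool: the second bucket cannot supply the full quota of 50.
pool = {
    ("maze", "easy"): list(range(200)),
    ("maze", "hard"): list(range(20)),
}
eval_set = sample_eval_set(pool)
print(levels_per_task, remainder, shortfall, len(eval_set))  # 3 0 642 70
```

    Under this reading, each task has three difficulty levels, and the 642 missing instances come from buckets that could not supply all 50.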

    The proposed 32B-parameter model outperforms most public models on Enigmata-Eval, showing the effectiveness of the dataset and training recipe. It also stands out on the challenging ARC-AGI benchmark, surpassing strong reasoning models such as Gemini 2.5 Pro, o3-mini, and o1. Qwen2.5-32B-Enigmata performs especially well in structured reasoning categories, outperforming other models in Crypto, Arithmetic, and Logic tasks, which suggests effective development of rule-based reasoning capabilities. The model is competitive on search tasks that require strategic exploration and planning. Crypto and Arithmetic tasks tend to yield the highest accuracy, while spatial and sequential tasks remain more difficult.
    In this paper, the researchers introduced Enigmata, a comprehensive suite for equipping LLMs with advanced puzzle reasoning that integrates seamlessly with RL using verifiable rule-based rewards. The trained Enigmata-Model shows superior performance and robust generalization through RLVR training. Experiments reveal that when applied to larger models such as Seed1.5-Thinking, synthetic puzzle data brings additional gains over state-of-the-art models in other domains, including mathematics and STEM reasoning. Enigmata provides a solid foundation for the research community to advance reasoning model development, offering a unified framework that bridges logical puzzle-solving with broader reasoning capabilities in LLMs.

    Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a tech enthusiast, he delves into the practical applications of AI with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.
    #enigmatas #multistage #mixtraining #reinforcement #learning
    Enigmata’s Multi-Stage and Mix-Training Reinforcement Learning Recipe Drives Breakthrough Performance in LLM Puzzle Reasoning
    Large Reasoning Models, trained from LLMs using reinforcement learning, demonstrated great performance in complex reasoning tasks, including mathematics, STEM, and coding. However, existing LRMs face challenges in completing various puzzle tasks that require purely logical reasoning skills, which are easy and obvious for humans. Current methods targeting puzzles focus only on designing benchmarks for evaluation, lacking the training methods and resources for modern LLMs to tackle this challenge. Current puzzle datasets lack diversity and scalability, covering limited puzzle types with little control over generation or difficulty. Moreover, due to the success of the “LLM+RLVR” paradigm, it has become crucial to obtain large, diverse, and challenging sets of verifiable puzzle prompts for training agents. Reinforcement Learning with Verifiable Rewardshas emerged as a key method for improving models’ reasoning capabilities, removing the need for reward models by directly assigning rewards based on objectively verifiable answers. Puzzles are particularly well-suited for RLVR. However, most prior RLVR research has overlooked the puzzles’ potential for delivering effective reward signals. In puzzle reasoning of LLMs, existing benchmarks evaluate different types of reasoning, including abstract, deductive, and compositional reasoning. Few benchmarks support scalable generation and difficulty control but lack puzzle diversity. Moreover, the improvement of LLMs’ puzzle-solving abilities mainly falls into two categories: tool integration and RLVR. Researchers from ByteDance Seed, Fudan University, Tsinghua University, Nanjing University, and Shanghai Jiao Tong University have proposed Enigmata, the first comprehensive toolkit designed for improving LLMs with puzzle reasoning skills. It contains 36 tasks across seven categories, each featuring a generator that produces unlimited examples with controllable difficulty and a rule-based verifier for automatic evaluation. 
The researchers further developed Enigmata-Eval as a rigorous benchmark and created optimized multi-task RLVR strategies. Puzzle data from Enigmata enhances SoTA performance on advanced math and STEM reasoning tasks like AIME, BeyondAIME, and GPQA when trained on larger models like Seed1.5-Thinking. This shows the generalization benefits of Enigmata. The Enigmata-Data comprises 36 puzzle tasks organized into 7 primary categories, including Crypto, Arithmetic, Logic, Grid, Graph, Search, and Sequential Puzzle, making it the only dataset having multiple task categories with scalability, automatic verification, and public availability. The data construction follows a three-phase pipeline: Tasks Collection and Design, Auto-Generator and Verifier Development, and Sliding Difficulty Control. Moreover, the Enigmata-Eval is developed by systematically sampling from the broader dataset, aiming to extract 50 instances per difficulty level for each task. The final evaluation set contains 4,758 puzzle instances rather than the theoretical maximum of 5,400, due to inherent constraints, where some tasks generate fewer instances per difficulty level. The proposed model outperforms most public models on Enigmata-Eval with 32B parameters, showing the effectiveness of the dataset and training recipe. The model stands out on the challenging ARC-AGI benchmark, surpassing strong reasoning models such as Gemini 2.5 Pro, o3-mini, and o1. The Qwen2.5-32B-Enigmata shows outstanding performance in structured reasoning categories, outperforming in Crypto, Arithmetic, and Logic tasks, suggesting effective development of rule-based reasoning capabilities. The model shows competitive performance in search tasks that require strategic exploration and planning capabilities. Moreover, Crypto and Arithmetic tasks tend to provide the highest accuracy, while spatial and sequential tasks remain more difficult. 
    In this paper, researchers introduced Enigmata, a comprehensive suite for equipping LLMs with advanced puzzle reasoning that integrates seamlessly with RL using verifiable rule-based rewards. The trained Enigmata-Model shows superior performance and robust generalization skills through RLVR training. Experiments reveal that when applied to larger models such as Seed1.5-Thinking, synthetic puzzle data brings additional benefits in other domains, including mathematics and STEM reasoning over state-of-the-art models. Enigmata provides a solid foundation for the research community to advance reasoning model development, offering a unified framework that effectively bridges logical puzzle-solving with broader reasoning capabilities in LLMs.
    Check out the Paper, GitHub Page and Project Page. All credit for this research goes to the researchers of this project.
    Sajjad Ansari is a final year undergraduate from IIT Kharagpur. As a Tech enthusiast, he delves into the practical applications of AI with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.
    WWW.MARKTECHPOST.COM
    Enigmata’s Multi-Stage and Mix-Training Reinforcement Learning Recipe Drives Breakthrough Performance in LLM Puzzle Reasoning
  • Augmented World Expo 2025 will draw 400 speakers, 6K attendees and 300 global exhibitors

    Augmented World Expo 2025 will draw more than 6,000 attendees, 400 speakers and 300 global exhibitors to its event June 10 to June 12 in Long Beach, California.
    The speaker lineup includes Snap CEO Evan Spiegel, Atari cofounder Nolan Bushnell and Oculus/Anduril founder Palmer Luckey. If the show is any indication, the XR industry isn’t doing so bad. A variety of market researchers are forecasting fast growth for the industry through 2030. Ori Inbar, CEO of AWE, believes that the XR revolution is “ready to conquer the mainstream.” But to get there, he believes the industry still needs to create “head-turning content that must be experienced.”
    Of course, the red-hot days of the “metaverse,” inspired by Neal Stephenson’s 1992 sci-fi novel Snow Crash, are no longer driving the industry forward. With less focus on sci-fi, the industry is focused on practical uses for mixed reality technology in the enterprise and in consumer markets like gaming.
    But will XR and the metaverse be overrun by AI, or will AI carry them to the mass market?
    Much is riding on how committed Mark Zuckerberg’s Meta will be even as it reprioritizes some resources away from XR to AI. Meta, which acquired Luckey’s Oculus back in 2014, has invested billions every quarter in the technology, with no profits so far. But, in a very unexpected turnaround, Zuckerberg and Luckey buried the hatchet over their past differences and set up an alliance between Meta and Anduril, Luckey’s AI/drone defense company.
    Zuckerberg has new competition from his own nemesis, Apple, which launched the Apple Vision Pro in February 2024. However, Apple has slowed down its development of the next-generation XR headset, while Zuckerberg has put more emphasis on AR/AI glasses.
    Spiegel, the CEO of Snap, has focused on augmented reality glasses. His Spectacles are now in their fifth generation, powered by the Snap OS and authoring tool Lens Studio.
    Nolan Bushnell, founder of Atari and Chuck E. Cheese, will deliver a one-of-a-kind talk on the main stage with five of his children, who are continuing his pioneering vision in gaming through XR. Brent Bushnell, Nolan’s eldest son, recently debuted DreamPark, a new XR startup that turns any park or playground into a mixed reality theme park.
    Other speakers include Vicki Dobbs Beck – VP, Immersive Content Innovation, Lucasfilm & ILM Immersive; Ziad Asghar – SVP & GM XR, Qualcomm; Brian McClendon – Chief Technology Officer, Niantic Spatial, Inc.; Jason Rubin – VP, Metaverse Experiences, Meta; Hugo Swart – Senior Director of XR Ecosystem Strategy and Technology, Google; Jacqui Bransky – VP Web3 & Innovation, Warner Records; Chi Xu – CEO and Founder, XREAL; Helen Papagiannis – AR Pioneer and XR Hall of Famer; and Tom Furness – Grandfather of VR and Founder, Virtual World Society.
    AWE Builders Nexus will be a new program focused on startups this year. Startup founders, developers, designers, product managers, and business leaders alike will get the resources they need to build something extraordinary, get advice and funding, scale through partnerships, and win customers, Inbar said. The event will also feature the AWE Gaming Hub.
    I also interviewed some companies that are showcasing technology at the show. Here are some snippets of what they are going to show.
    Pico VR
    Pico started out in Beijing, China, in 2015 and is now hitting its 10th anniversary. It is making the standalone Pico XR headsets, and it was acquired by ByteDance, the owner of TikTok, in 2021. In September 2024, the company launched the Pico 4 Ultra Enterprise headset, filling out the high end of its product line in addition to its G3 and Neo 3 legacy headsets.
    Pico has also added a set of full-body motion trackers to its product offerings to allow for full-body and object tracking. That’s helping it with its focus on location-based entertainment (LBE) in markets such as China. It’s focused on Wi-Fi 7, hand tracking and motion tracking.
    Leland Hedges, head of enterprise business at Pico, said that the LBE market in China has grown by 1,000% in the last six to nine months. Pico has an app for PC streaming and another app for managing devices over a LAN. Pico can track play spaces with columns or cordoned-off areas. Hedges said the company will share 15 different user stories at AWE from public places such as zoos, museums, aquariums and planetariums.
    Convai
    Purnendu Mukherjee, CEO of Convai, showed me a bunch of demos at the Game Developers Conference where it has been able to create avatar-based demos of generative AI solutions with 3D animated people. These can be used to show off brands and greet people on web sites or as avatars in games.
    At AWE, Convai will also offer learning and training scenarios for education and enterprises through a variety of simulations. Convai can render high-fidelity avatars that are effectively streamed from the cloud. At GDC, Convai scanned me and captured my voice so that it could create a lifelike avatar of me. These avatars can be created quickly and can answer a variety of questions from website visitors. The idea is to enable non-technical people to create simulations without the need to code anything.
    In a demo, Convai’s avatar of me said, “I’ve been covering the games industry for many years now at GamesBeat. I’ve seen it evolve from the arcades to the massive global phenomenon it is today. I love digging into the business side of gaming, the technology, the culture, the whole shebang.” Convai will announce pricing for its self-serve platform as well as an enterprise subscription fee.
    Doublepoint
    Ohto Pentikäinen, CEO of Doublepoint, has a technology that detects the gestures you make with your hand. It captures that movement via a smartwatch and lets you control things on a TV interface or an XR device. With Android XR, Doublepoint is showing off demos where gesture control can unlock a more intuitive and comfortable augmented reality experience for those wearing AR glasses. Xreal is one of the glasses makers using the technology to control an AR user interface with gestures.
    “Our technology is able to fully control a XR system. A stat that we can update you on is that there’s 150,000 people who have downloaded the technology so far, and we have a developer community of over 2,000 people since January 2024,” Pentikäinen said.
    Now the company is starting its own Doublepoint developer program, which adds layers on top of its enterprise offering. The company can now provide technology to indie developers or startups that are building augmented reality or AI hardware experiences.
    “We’re empowering developers in AR robotics and AI hardware, and we’re providing everything that we’re providing the enterprise clients, but for a much reduced price,” Pentikäinen said.
    VENTUREBEAT.COM
  • Reinforcement Learning Makes LLMs Search-Savvy: Ant Group Researchers Introduce SEM to Optimize Tool Usage and Reasoning Efficiency

    Recent progress in LLMs has shown their potential in performing complex reasoning tasks and effectively using external tools like search engines. Despite this, teaching models to make smart decisions about when to rely on internal knowledge versus search remains a key challenge. While simple prompt-based methods can guide models to invoke tools, LLMs still struggle with more nuanced behaviors, such as recognizing when an initial search was incorrect and deciding to search again. RL has been explored to improve these behaviors by rewarding effective search usage. However, RL often leads to unnecessary tool use, with models executing redundant searches even for simple tasks, highlighting inefficiencies that must be addressed.
    Various RL strategies, including Proximal Policy Optimization (PPO), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO), have been used to align LLM behavior with human expectations. PPO helps balance exploration during learning with policy stability, while DPO simplifies alignment by directly optimizing model responses based on user preferences. GRPO introduces group-based evaluations to better capture subtle improvements in reasoning. Meanwhile, treating LLMs as autonomous agents that plan and execute multi-step reasoning tasks is gaining traction. Frameworks like AutoGPT and LangChain showcase how these agents can refine their outputs through iterative reasoning and search. Yet current agent systems often depend on fixed prompts or heuristic-based tool use, limiting their adaptability and efficiency.
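    The group-based evaluation that GRPO relies on can be sketched in a few lines: several responses are sampled for the same prompt, and each response's reward is scored relative to the others in its group. The function name `group_relative_advantages` is illustrative, and using the population standard deviation with a small `eps` is an assumption made here; implementations differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt: two earned a reward, two did not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])  # [1.0, -1.0, -1.0, 1.0]
```

    Responses that beat their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.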
    Researchers at Ant Group introduce SEM, a post-training reinforcement learning framework designed to teach LLMs when to use search tools and when to rely on internal knowledge. By training on a balanced dataset combining questions that do and do not require external retrieval, SEM guides the model to issue search requests only when necessary. Using a structured reasoning format and GRPO, the framework rewards accurate answers without search and penalizes unnecessary tool use. Results show that SEM improves response accuracy and efficiency, helping models better judge when external information is needed, thus enhancing reasoning in complex scenarios. 
    To integrate search tools into a model’s reasoning process, SEM uses reinforcement learning to teach models when and how to use search effectively. The training data combines MuSiQue and MMLU, helping models learn to judge when search is necessary. Using the GRPO framework, the model is rewarded for accurate, efficient answers, which discourages unnecessary searches and encourages them when internal knowledge falls short. A structured response format standardizes training and allows for precise reward assignment, improving both reasoning quality and search decision-making.
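    A reward of the kind described above might look like the following sketch. It is not the reward function from the SEM paper: the `<search>` and `<answer>` tags, the `needs_search` flag, the exact-match check, and the penalty value are all hypothetical, chosen only to show how a structured response makes it possible to reward correct answers while penalizing searches that were not needed.

```python
import re

SEARCH_PENALTY = 0.5  # hypothetical value, not taken from the paper


def search_aware_reward(response: str, gold: str, needs_search: bool) -> float:
    """Score one structured response; tags and scoring rules are illustrative."""
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match is None:
        return 0.0  # malformed output: no parsable answer, no reward
    used_search = "<search>" in response
    correct = match.group(1).strip().lower() == gold.strip().lower()
    if not correct:
        return 0.0
    if used_search and not needs_search:
        return 1.0 - SEARCH_PENALTY  # right answer, but the search was redundant
    return 1.0


direct = "<answer>Paris</answer>"
searched = "<search>capital of France</search><answer>Paris</answer>"
print(search_aware_reward(direct, "Paris", needs_search=False))    # 1.0
print(search_aware_reward(searched, "Paris", needs_search=False))  # 0.5
print(search_aware_reward(searched, "Paris", needs_search=True))   # 1.0
```

    In this sketch the `needs_search` flag stands in for whatever signal the training data provides about whether a question can be answered from internal knowledge.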
    The study evaluates a model trained to determine when to rely on its internal knowledge and when to use external search. It combines Musiqueand MMLUfor training and evaluates performance on datasets like HotpotQA, GSM8K, and MMLU. The proposed SEM method outperforms baselines like Naive RAG and ReSearch in answer accuracy and search efficiency. SEM reduces unnecessary searches on known questions while improving reasoning on unknown ones. Case studies and training curves confirm SEM’s stable learning and intelligent decision-making. Overall, SEM enhances retrieval decisions and internal reasoning in large language models. 

    In conclusion, SEM is a post-training reinforcement learning framework designed to improve how large language models use external search tools. The model is trained on a dataset combining MuSiQue and MMLU, helping it distinguish between questions it can answer internally and those that require external retrieval. SEM uses a structured reasoning approach and a reward function that penalizes unnecessary searches while promoting accurate and efficient retrieval. Experiments on benchmarks like HotpotQA, GSM8K, and MMLU show that SEM reduces redundant searches and improves accuracy. This approach enhances reasoning efficiency and intelligent use of external knowledge in LLMs. 

    All credit for this research goes to the researchers of this project.
    Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.

  • SWE-Bench Performance Reaches 50.8% Without Tool Use: A Case for Monolithic State-in-Context Agents

    Recent advancements in LM agents have shown promising potential for automating intricate real-world tasks. These agents typically operate by proposing and executing actions through APIs, supporting applications such as software engineering, robotics, and scientific experimentation. As these tasks become more complex, LM agent frameworks have evolved to include multiple agents, multi-step retrieval, and tailored scaffolding to optimize performance. A central challenge lies in effectively exploring and understanding the environment, which has prompted the development of engineered scaffolds using tools, memory mechanisms, and custom pipelines. However, most existing methods assume partial observability, requiring agents to collect observations incrementally. While this assumption holds in dynamic or unfamiliar environments, it is less applicable in fully observable settings like SWE-bench, where all relevant information is accessible from the start.
    In software engineering, research on LM agents has focused on two main strategies: agent-based frameworks and structured pipelines. Agent-based systems, such as SWE-Agent and OpenHands CodeAct, allow LMs to interact autonomously with codebases, often through custom interfaces and retrieval tools. Other systems like Moatless and AutoCodeRover enhance localization through search techniques, while SpecRover refines scaffolding design. Alternatively, structured pipelines such as Agentless and CodeMonkey decompose tasks into sequential phases like localization, repair, and validation. While these approaches depend on engineered components for performance, the current study proposes leveraging Long-Context LMs (LCLMs) to directly interpret the entire task environment. Advances in LCLM architecture and infrastructure now allow these models to outperform retrieval-augmented systems in many contexts, reducing reliance on complex external scaffolding.
    Researchers from Stanford, IBM, and the University of Toronto explored whether complex scaffolding is necessary for LM agents tackling tasks like SWE-bench. They show that simply using LCLMs, such as Gemini-1.5-Pro, with proper prompting and no scaffolding, can achieve competitive performance—reaching 38% on SWE-Bench-Verified. Gemini-2.5-Pro, using the same simple setup, reaches 50.8%. Their work suggests that many complex agentic designs could be replaced with a single powerful LCLM, simplifying architecture and training. Additionally, a hybrid two-stage approach using Gemini-1.5-Pro and Claude-3.7 achieves a 48.6% solve rate, further supporting this simplified direction. 
    Traditional LM agents rely on interactive exploration due to partial observability, but many tasks, like software debugging, allow full observability. The study proposes state-in-context agents that leverage LCLMs to directly process full or compressed environment states, bypassing the need for complex agentic scaffolding. For large codebases, a ranking-based compression selects relevant files to fit within context limits. Two methods are introduced: DIRECTSOLVE, where LCLMs solve tasks using the full context; and SELECTSOLVE, where LCLMs localize relevant files for short-context LMs (SCLMs) to solve. Both use targeted patch formats and validation to ensure accuracy and reduce hallucination.
    The experiments evaluate a simplified agent framework using LLMs on the SWE-bench Verified benchmark, which includes 500 real-world software engineering tasks. The proposed methods, DIRECTSOLVE and SELECTSOLVE, utilize LCLMs like Gemini-1.5-Pro and Gemini-2.5-Pro, and in SELECTSOLVE, an additional SCLM (Claude-3.7-Sonnet) for patch generation. Results show that DIRECTSOLVE outperforms complex agentic approaches like Agentless and CodeAct with minimal engineering. SELECTSOLVE further improves accuracy by leveraging stronger models for patching. Ablation studies highlight the importance of CoT prompting, code restatement, and token-efficient context design. Additionally, positioning relevant files at the start of the prompt improves performance, underscoring limitations in long-context processing.
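    The ranking-based compression step can be pictured as packing the highest-ranked files into a fixed token budget. The sketch below is one simple way to do that, assuming the files arrive already ranked and with known token counts; the function name, the greedy skip-if-too-large policy, and the sample numbers are illustrative assumptions, since the article does not say how the ranking is produced or how ties and overflows are handled.

```python
def select_files(ranked_files, token_budget):
    """Greedily keep top-ranked files that still fit in the context budget.

    ranked_files: list of (path, n_tokens), most relevant first (assumed
                  to come from some upstream ranking step).
    token_budget: maximum number of tokens available for code context.
    """
    selected, used = [], 0
    for path, n_tokens in ranked_files:
        if used + n_tokens > token_budget:
            continue  # too large for the remaining space; try later files
        selected.append(path)
        used += n_tokens
    return selected, used


# Hypothetical repository: three ranked files and a 1,000-token budget.
ranked = [("core/models.py", 600), ("core/query.py", 500), ("utils/text.py", 300)]
selected, used = select_files(ranked, token_budget=1000)
print(selected, used)
```

    Keeping the rank order in the output means the most relevant files land first in the prompt.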

    In conclusion, the cost of using LCLM-based methods is currently higher than existing approaches like Agentless and CodeAct, averaging $2.60 per instance compared to $0.25 and $0.87, respectively. However, rapid drops in inference costs and increasing context lengths make LCLMs more practical. Techniques like KV caching significantly lower costs after initial runs, reducing the per-instance cost to about $0.725. Although slight codebase changes still limit caching benefits, further improvements could help. The study also suggests that LCLMs can handle long interaction histories, reducing the need for complex memory and retrieval mechanisms. Notably, unscaffolded LCLMs can perform competitively on SWE-bench tasks.
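    For a sense of scale, the short calculation below compares the per-instance costs reported for this study: $2.60 for the LCLM approach, about $0.725 with KV caching, $0.25 for Agentless, and $0.87 for CodeAct. Only the dollar figures come from the article; the dictionary labels and variable names are made up for illustration.

```python
# Reported average cost per SWE-bench instance, in US dollars.
costs_usd = {
    "LCLM, no caching": 2.60,
    "LCLM, with KV caching": 0.725,
    "Agentless": 0.25,
    "CodeAct": 0.87,
}

baseline = costs_usd["LCLM, no caching"]

# How many times more the uncached LCLM run costs than each alternative.
ratios = {name: baseline / cost for name, cost in costs_usd.items()}

# Fraction of the uncached cost that KV caching removes.
cache_saving = 1 - costs_usd["LCLM, with KV caching"] / baseline

for name, ratio in ratios.items():
    print(f"{name}: uncached LCLM costs {ratio:.1f}x as much")
print(f"KV caching saving: {cache_saving:.0%}")
```

    On these figures the uncached LCLM run costs roughly ten times as much as Agentless and about three times as much as CodeAct, while caching brings it below CodeAct but still above Agentless.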

    All credit for this research goes to the researchers of this project.
    Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.

    WWW.MARKTECHPOST.COM
    SWE-Bench Performance Reaches 50.8% Without Tool Use: A Case for Monolithic State-in-Context Agents
    Recent advancements in LM agents have shown promising potential for automating intricate real-world tasks. These agents typically operate by proposing and executing actions through APIs, supporting applications such as software engineering, robotics, and scientific experimentation. As these tasks become more complex, LM agent frameworks have evolved to include multiple agents, multi-step retrieval, and tailored scaffolding to optimize performance. A central challenge lies in effectively exploring and understanding the environment, which has prompted the development of engineered scaffolds using tools, memory mechanisms, and custom pipelines. However, most existing methods assume partial observability, requiring agents to collect observations incrementally. While this assumption holds in dynamic or unfamiliar environments, it is less applicable in fully observable settings like SWE-bench, where all relevant information is accessible from the start. In software engineering, research on LM agents has focused on two main strategies: agent-based frameworks and structured pipelines. Agent-based systems, such as SWE-Agent and OpenHands CodeAct, allow LMs to interact autonomously with codebases, often through custom interfaces and retrieval tools. Other models like Moatless and AutoCodeRover enhance localization through search techniques, while SpecRover refines scaffolding design. Alternatively, structured pipelines—such as Agentless and CodeMonkey—decompose tasks into sequential phases like localization, repair, and validation. While these approaches depend on engineered components for performance, the current study proposes leveraging Long-Context LMs (LCLMs) to directly interpret the entire task environment. Advances in LCLM architecture and infrastructure now allow these models to outperform retrieval-augmented systems in many contexts, reducing reliance on complex external scaffolding.  
Researchers from Stanford, IBM, and the University of Toronto explored whether complex scaffolding is necessary for LM agents tackling tasks like SWE-bench. They show that simply using LCLMs, such as Gemini-1.5-Pro, with proper prompting and no scaffolding, can achieve competitive performance, reaching 38% on SWE-bench Verified. Gemini-2.5-Pro, using the same simple setup, reaches 50.8%. Their work suggests that many complex agentic designs could be replaced with a single powerful LCLM, simplifying architecture and training. Additionally, a hybrid two-stage approach using Gemini-1.5-Pro and Claude-3.7 achieves a 48.6% solve rate, further supporting this simplified direction.

Traditional LM agents rely on interactive exploration due to partial observability, but many tasks, like software debugging, allow full observability. The study proposes state-in-context agents that leverage LCLMs to directly process full or compressed environment states, bypassing the need for complex agentic scaffolding. For large codebases, a ranking-based compression selects relevant files to fit within context limits. Two methods are introduced: DIRECTSOLVE, where LCLMs solve tasks using the full context; and SELECTSOLVE, where LCLMs localize relevant files for short-context LMs (SCLMs) to solve. Both use targeted patch formats and validation to ensure accuracy and reduce hallucination.

The experiments evaluate a simplified agent framework using LLMs on the SWE-bench Verified benchmark, which includes 500 real-world software engineering tasks. The proposed methods, DIRECTSOLVE and SELECTSOLVE, utilize LCLMs like Gemini-1.5-Pro and Gemini-2.5-Pro, and in SELECTSOLVE, an additional SCLM (Claude-3.7-Sonnet) for patch generation. Results show that DIRECTSOLVE outperforms complex agentic approaches like Agentless and CodeAct with minimal engineering. SELECTSOLVE further improves accuracy by leveraging stronger models for patching. Ablation studies highlight the importance of CoT prompting, code restatement, and token-efficient context design. Additionally, positioning relevant files at the start of the prompt improves performance, underscoring limitations in long-context processing.

In conclusion, the cost of using LCLM-based methods is currently higher than existing approaches like Agentless and CodeAct, averaging $2.60 per instance compared to $0.25 and $0.87, respectively. However, rapid drops in inference costs and increasing context lengths make LCLMs more practical. Techniques like KV caching significantly lower costs after initial runs, reducing them to about $0.725 per instance. Although slight codebase changes still limit caching benefits, further improvements could help. The study also suggests that LCLMs can handle long interaction histories, reducing the need for complex memory and retrieval mechanisms. Notably, unscaffolded LCLMs can perform competitively on SWE-bench tasks.

All credit for this research goes to the researchers of this project.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.
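The ranking-based compression mentioned in the article above (select the most relevant files so the codebase fits within the context limit, and place them early in the prompt) can be illustrated with a rough sketch. This is not the researchers' implementation: the `SourceFile` type, the word-overlap `relevance` scorer, the whitespace-based `estimate_tokens`, and the choice to put the issue text after the files are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceFile:
    path: str
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (whitespace word count).
    return len(text.split())


def relevance(issue: str, file: SourceFile) -> int:
    # Hypothetical scorer: number of distinct issue words that appear in the file.
    issue_words = {w.lower() for w in issue.split()}
    file_words = {w.lower() for w in file.text.split()}
    return len(issue_words & file_words)


def select_files(issue: str, files: list[SourceFile], budget: int) -> list[SourceFile]:
    # Rank files by relevance (highest first), then greedily keep each file
    # that still fits in the remaining token budget.
    ranked = sorted(files, key=lambda f: relevance(issue, f), reverse=True)
    chosen: list[SourceFile] = []
    used = 0
    for f in ranked:
        cost = estimate_tokens(f.text)
        if used + cost <= budget:
            chosen.append(f)
            used += cost
    return chosen


def build_prompt(issue: str, files: list[SourceFile], budget: int) -> str:
    # Most relevant files go first, since the article reports that placing
    # relevant files at the start of the prompt helps. Putting the issue
    # last is an assumption of this sketch.
    chosen = select_files(issue, files, budget)
    parts = [f"### {f.path}\n{f.text}" for f in chosen]
    parts.append(f"### Issue\n{issue}")
    return "\n\n".join(parts)
```

In practice the scorer would more plausibly be a model-based ranker and the token count would come from the target model's tokenizer; the greedy packing step is the only part this sketch tries to show.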
  • Former Activision Boss Bobby Kotick Wants To Buy Tiktok: Report

    KOTAKU.COM
    TikTok is in an exceptionally tough spot these days. Despite everyone you know using it for hours on end, the video-sharing app is currently facing legislation that would force its ban in the U.S. pending a potential sale, and prospective buyers are lining up. One of these potential buyers is reportedly Bobby Kotick, the former boss of Activision Blizzard, according to the Wall Street Journal.

    TikTok has been scrutinized for years by U.S. lawmakers who have argued that its China-based parent company ByteDance may share data it collects with the Chinese government, or that the app could serve as a propaganda delivery tool. Despite tensions ramping up some time ago, leading many to believe that the app would be banned in the U.S., matters had seemingly cooled until a bill was pushed through the House Energy and Commerce Committee last week, ratcheting up the pressure on ByteDance. The bill is expected to be reviewed and approved by the House of Representatives this week before being sent to the Senate, and President Joe Biden has already claimed he would sign off on a ban if the bill made it through legislation.

    The bill requires that ByteDance “divest itself” of TikTok or see the app banned in the U.S., which has led to renewed interest from potential buyers, including Kotick. Kotick, according to WSJ’s sources, has floated the idea of a buy to ByteDance’s co-founder and is reportedly looking for partners, which could include Sam Altman of OpenAI. According to the Wall Street Journal, “OpenAI could use TikTok to help train its AI models if a partner such as Kotick could raise the capital for such an acquisition.” TikTok’s sale has been estimated to be in the range of “hundreds of billions of dollars.”

    Kotick departed from Activision Blizzard late last year after completing the publisher’s $68 billion sale to Microsoft. Kotick’s tenure at Activision Blizzard spanned decades and came under fire in 2021, when the state of California filed a now-dismissed lawsuit following an investigation into allegations of sexual harassment and discrimination. Ultimately, California’s Civil Rights Department withdrew all allegations and claims relating to harassment and settled with Activision Blizzard in December 2023 for $54 million to resolve unsubstantiated pay and promotions claims.

    The court-approved settlement included a statement that provided that: “[N]o court or any independent investigation has substantiated any allegations that there has been systemic or widespread sexual harassment at Activision Blizzard; that Activision Blizzard senior executives ignored, condoned, or tolerated a culture of systemic harassment, retaliation, or discrimination; or that Activision Blizzard’s Board of Directors including its Chief Executive Officer, Robert Kotick, acted improperly with regard to the handling of any instances of workplace misconduct.”

    In addition, the settlement noted that a former chair of the EEOC had conducted a review of the company’s policies, practices and certain complaint data and reported that there was no widespread harassment at the company. The company itself publicly released its Transparency Report, which further asserted that there has never been widespread or systemic harassment or gender pay inequity at Activision Blizzard. Kotick departed with a golden parachute estimated to be worth around $15 million.

    Updated: 04/01/2024, 2:00 p.m. ET: This article has been updated to include details of the CRD settlement, that Activision Blizzard denied any wrongdoing, and the settlement confirms CRD could not substantiate those claims.

    Updated: 05/17/2025, 12:10 p.m. ET: This article has been updated to include additional language from the CRD settlement, and to include that as part of the settlement, the CRD withdrew the claims related to harassment from its complaint.
  • The Best Free Software for 2025

    It's a mobile world, but we have not fully abandoned the desktop. The real work of computing requires a full personal computing system, and to get the most out of that, you need software.

    Software can be expensive, but free programs have been a mainstay of the desktop experience for decades, and today's offerings are pretty powerful. Software developers can adopt an ad-based model, donation-ware to keep things afloat, or a shareware/freemium model that charges for extra features.

    Something to always watch for: Crapware installers. To make ends meet, many creators of otherwise great free software, or the services that offer the programs for download, bundle in things you don't want. Worse, the installation routine obfuscates the steps, so you provide the unwanted program tacit permission to be installed. For more about how to spot and avoid this problem, see How to Rid a New PC of Crapware.

    A pro tip: Only download desktop software from the maker of the software directly. It's not foolproof (after all, developers want to eat, too), but it helps.

    Other Criteria:

    The software must be available directly from the developer/creator/original publisher.
    The software should have a Windows-based download. No browser extensions here, because we're not all on the same browser. However, we've included web-based apps that are as good as, or better than, most downloadable programs.
    If the software is on a tiered sales model, the free version cannot be trial-ware. It has to have at least a free-for-life option.
    Preferably the program had an update in the last year or two.
    The program should have little or no advertising to support it.
    Software for productivity is what this list is about; there are plenty of other places to find free PC games.

    For more free software, check out The 100 Best iPhone Apps and The 100 Best Android Apps.

    Did we miss any free programs you can't live without? Let us know in the comments.

    Best Free Audio-Editing Software

    Audacity

    4.0 Excellent

    Windows, macOS, Linux

    Open-source Audacity can record and edit audio files on more tracks than you can imagine. It then outputs exactly what you need. It is perfect for noobs and pros alike and works on any desktop OS.
    Audacity review

    Best Free Simple Video Editor

    CapCut

    4.0 Excellent

    Windows, macOS, iOS, Android, web

    While it seems like most video editing today takes place on phones, at least one mobile video editor has jumped to the desktop: ByteDance’s CapCut is on Windows; it's even in the Microsoft Store. In our review of the mobile version, we found it to be fast, easy, and powerful.
    CapCut review

    Best Free Advanced Video Editing

    DaVinci Resolve

    4.0 Excellent

    Windows, macOS, Linux

    How on earth does Blackmagic Design make DaVinci Resolve so capable as a video editor yet still offer a free version? The hope is that as users get better at making videos, they’ll buy the full suite for the extras. Meanwhile, the free version can handle almost any 8-bit format up to 3,840 by 2,160 pixels for editing, color correction, VFX, motion graphics, and audio.
    DaVinci Resolve review

    Best Free Video Converter

    HandBrake

    3.5 Good

    Windows, macOS, Linux

    No one would call HandBrake simple, but few video transcoders (software that converts almost any video format into another video format) can compete when it comes to power and comprehensiveness. It's been around for over two decades and remains open-source.

    Best Free Cartooning Tool

    Pencil2D

    Windows, macOS, Linux

    Open-source and multiplatform, the Pencil 2D Animation tool is what it sounds like: a way to quickly create two-dimensional animations by penciling in each frame. The site is full of video tutorials to help you get the gist.

    Best Free Video Editing

    Shotcut

    3.5 Good

    Windows, macOS, Linux

    While it lacks the slick interface found in most other video editors, Shotcut's got a lot of power. It offers a phenomenal number of features and gets frequent improvement updates. Just don't expect it to feel like an Adobe product.

    Best Free Game-Recording/Streaming Software

    Streamlabs OBS

    Windows, Web, iOS, Android

    Stream your video game sessions with Logitech's Streamlabs Desktop directly to YouTube, Twitch, or Facebook. You can switch between gameplay and your webcam, so you can show your face as you make commentary. There may be a learning curve, but you can find plenty of help online.

    Best Free Video Player

    VLC

    Windows, macOS, Linux, iOS, Android

    The premier way to watch just about any video, no matter the clip's weird codec. VLC media player can auto-rotate smartphone videos taken at the wrong orientation and resume playback from where you left off during a previous session. Seriously, VLC plays back anything on all desktop platforms, and it guarantees no ads, tracking, or spyware.

    Best Free Messaging Software

    Discord

    4.5 Excellent

    Windows, macOS, Linux, web, iOS, Android, Xbox, PlayStation

    Millions of people worldwide use Discord for text, voice chatting, and video chatting, mainly while kicking one another's arses in online games or watching gameplay streams on Twitch or Caffeine. You can pay a fee to go premium for better video and audio quality and to upload larger files.
    Discord review

    Best Free Secure Messaging

    Signal Private Messenger

    4.5 Excellent

    Windows, macOS, Linux, iOS, Android

    PCMag’s Editors’ Choice Award winner for secure messaging is Signal, which you may recall from a recent high-level scandal. It does it all: group chat, voice chat, and video chat, all with mandatory end-to-end encryption. You need Android or iOS to register to use Signal, which requires the mobile app, but it also works on your desktop OSes. Perhaps best of all, it’s owned by a nonprofit with no incentive to sell your data.
    Signal Private Messenger review

    Best Free Remote Access

    TeamViewer

    4.5 Excellent

    Windows, macOS, Linux, web, iOS, Android, ChromeOS

    PCMag's top pick for software that can control other computers is TeamViewer, which is only free for personal use. That version has everything you need: desktop sharing, file transfers, and chat with remote users. The setup couldn't be easier. Take control of a remote PC over an internet connection with the app, or use a browser with the TeamViewer extension. Just keep in mind that remote-access tools can be abused, so don't turn one on unless you're on the phone with the person you're allowing access to. And make sure to turn them off after you're done.
    TeamViewer review

    Best Free Friends and Family Messaging

    WhatsApp

    4.0 Excellent

    Windows, macOS, Linux, web, iOS, Android

    If you want to avoid the giant corporations that run messaging services, maybe WhatsApp isn’t for you. But it is a massive service with a loyal user base, an easy-to-use interface, and self-destructing messages and images. It even uses the Signal protocol, so the folks at Meta can’t read what you send. But then again, you could just use Signal. Still, you might opt for WhatsApp if you have an existing platoon of friends and family using it.
    WhatsApp review

    Best Free Freeform Drawing

    Adobe Fresco

    4.5 Excellent

    Windows, iOS

    You may think of Adobe Fresco, the company’s painting app, as strictly for mobile devices. But it is also available for Windows, whether you use it in tablet mode or not. The free version has its limits, but overall makes the feeling of drawing on a screen as close as you can get to doing so on paper.
    Adobe Fresco review

    Best Free AI

    ChatGPT

    4.0 Excellent

    Windows, macOS, iOS, Android

    Does ChatGPT hallucinate and make mistakes? You better believe it. But it's still the most advanced and mature generative AI available today, especially considering you can do a lot with it for free. It'll generate text and images and even let you use the Deep Research function five times per month. You can do quite a bit without an account, but signing up unlocks features like saved chat history. And if you don't want to use it on the web, you can download ChatGPT apps for the operating systems above.

    For more, read our full review and note this disclosure: Ziff Davis, PCMag's parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
    ChatGPT review

    Best Free Painting Software

    Krita

    Windows, macOS, Linux

    Krita is a powerful, full-fledged painting tool for digital artists. It does come with a bit of a learning curve, but the nonexistent price tag and the vibrant community behind it make it more than worth digging into, especially if you’ve got artistic skills but no desire to pick up paint and brushes IRL.

    Best Free Desktop Publishing Tool

    Scribus

    Windows, macOS, Linux

    Scribus is the open-source equivalent of Adobe InDesign for desktop publishing, or as close as you can get to it, with a history that goes back almost a quarter century. It has built-in color separation, color management, and a lot more, including its own wiki for documentation.

    Best Free World-Building Tool

    Shaxpir

    Windows, macOS

    Pronounced like the playwright, Shaxpir is essentially a simplistic version of our top-rated Scrivener, with an “everyone” free tier that is very useful. For no charge, you get the full manuscript builder, world-building notebook, progress tracker, offline use, and cloud backup. Still, pros might consider the monthly subscription with extra features a bargain after the 30-day trial.

    Best Free Screenwriting Tool

    Trelby

    Windows, Linux

    Do you fancy yourself a budding screenwriter but lack the funds for high-end tools like Final Draft? Trelby does a fine job of helping you format scripts correctly, remember character names, and import and export to formats used in Hollywood.

    Best Free Android Emulation

    BlueStacks 5

    Windows, macOS

    For a hot second, Windows 11 had an Android simulator that could play apps from the Amazon store, but that got shut down. The next best option is BlueStacks, which only takes up about 5GB of space and can access the Google Play Store. The emulator will help you map your mouse and keyboard to work with Android games. For more info, read Ways to Run Android Apps on Your PC for Free.

    Best Free Social Photo Sharing

    Instagram

    3.0 Good

    Windows, Web

    Social media apps don’t have to just be on your phone. Like TikTok, you can get to the 'Gram on your desktop with this app found on the Windows Store. It’ll show you all the amazing images shared by people and brands you follow, as well as the Reels they generate.

    Best Free Maps Software

    Google Earth

    Windows, macOS, Linux, Web, iOS, Android

    As if high-end software that lets you virtually fly across the globe isn't cool enough, Google Earth Pro for the desktop is totally free. It includes advanced features such as high-resolution printing, distance measuring, and global guided tours. Although it also comes in web and mobile versions, the desktop version is the only one that lets you view satellite images of the moon and Mars. Plus, it has star maps and will even let you go back in time.

    Best Free Writing Tool

    yWriter

    3.5 Good

    Windows, macOS, iOS, Android

    The highly structured interface of yWriter can help anyone, from budding to experienced novelists, get a real handle on their story and its characters. The program is full of stats on what you have written, providing you with a data-driven writing experience. It doesn't have the depth of Scrivener, but it's free.
    yWriter review

    Best Free Media Center

    Plex TV

    Windows, macOS, Linux, iOS, Android, Xbox, PlayStation, Smart TVs, media hubs, NAS devices

    If you don't know or care what a media server is, but you just want to stream your videos and music collection around the house, Plex could work well for you. Install it on all your devices, point it at some media, and those audio and video files become available on everything, even remotely. For more, read How to Set Up a Plex Server, How to Share Your Plex Libraries, How to Organize Your Plex Media Library, and The Expert's Guide to Managing Your Plex Server.

    Best Free File Viewer and Converter

    FastStone Image Viewer

    Windows only

    View, manage, and compare your images with this fast and intuitive freebie. FastStone Image Viewer supports a wide range of image formats, including unprocessed raw files from specific digital camera manufacturers. It also has companion apps for screenshots and photo resizing.

    Best Free Photoshop Replacement

    GNU Image Manipulation Program

    3.5 Good

    Windows, macOS, Linux

    GIMP is a stalwart of the open-source world. It's a full-featured Photoshop alternative with all the functions (including layers, filters, masking, and plug-ins) that image editors need. It may lack the polish and AI extras you get with Adobe’s product, but GIMP more than makes up for that by being really, truly free. You can get it for Windows in the Microsoft Store.
    GNU Image Manipulation Program review

    Best Free Graphics Software

    Inkscape

    3.0 Good

    Windows, macOS, Linux

    Adobe Illustrator is the high bar of vector image editing, but it has a premium price to match. You can still get cross-platform Scalable Vector Graphic image creation with the free Inkscape. You'll have to work a little harder to learn it, but it may be exactly what a talented artist needs.

    Best Free Graphics Software

    Paint.net

    Windows

    Is Paint.net a perfect replacement for Photoshop? Nothing is as powerful as Adobe's program, but at this price (free), Paint.net comes close. For any minor picture manipulation, it's fast, comprehensive, and easy to use.

    Best Free PDF Reader

    Foxit PDF Reader

    Windows, macOS

    Just about any browser can read a PDF. But Foxit PDF Reader is free, not just for reading but also for annotation and collaboration on files. The program allows you to send signed and edited PDF files to friends or coworkers and works seamlessly with the Foxit PDF Editor on mobile platforms. For more, read How to Convert PDFs to Word Documents and Image Files.

    Best Free Grammar Help

    Grammarly

    4.0 Excellent

    Windows, macOS, Web, iOS, Android

    If you use the internet, you’ve probably heard of Grammarly; the ads are everywhere. The free version provides plenty of insights and suggestions to improve all the words you put on the screen in almost any program. And, yes, it really can up your writing game.
    Grammarly review

    Best Cross-Platform Note Taker

    Joplin

    4.5 Excellent

    Windows, macOS, Linux, Web, iOS, Android

    Our review of Joplin calls it “the ideal note-taking app for users who value simplicity.” It lacks some advanced features, but the open-source tool works on all major platforms to do what you need most: store unlimited notes. You only pay if you want to get into sharing and collaboration. It even has a web clipper browser extension for grabbing notes as you traverse the internet.
    Joplin review

    Best Free Kanban Project Management

    Kanri

    Windows, macOS, Linux

    If you do any kind of project work or organizing that involves index cards, then you have probably embraced the Kanban board approach. Kanri is a great, free way to Kanban your desktop without signing in or creating an account; it doesn't even need you to be online. As a bonus, it can import boards from big-name products like Trello.

    Best Free Office Suite

    LibreOffice

    3.0 Good

    Windows, macOS, Linux

    There aren't many free office suites, and only one is a free, open-source download available for the major desktop operating systems. LibreOffice could be a bit more polished, lacks collaboration features, and sports an overstuffed toolbar interface that might remind you of Microsoft Office a decade ago. But it's powerful nevertheless, and it easily converts and imports files from other systems. It comes with a word processor, a spreadsheet component, a presentation program, a vector drawing program, and even a full database and math-formula editor.
    LibreOffice review

    Best Free Note-Taking App

    Microsoft OneNote

    4.5 Excellent

    Windows, macOS, iOS, Android, Web

    Once just a part of Microsoft Office, the sublime OneNote has become a free, standalone powerhouse for note-taking across all the major operating systems. It still works with Office, syncs data across all platforms, and has full online access via Office.com, with storage on OneDrive. That's why it's our Editors' Choice pick for note storage.
    Microsoft OneNote review

    Best Free Browser

    Firefox

    4.5 Excellent

    Windows, macOS, Linux, iOS, Android

    The venerable browser Firefox remains highly customizable and strong on security, privacy, and performance. It stays cutting-edge without the backing of Big Tech; in fact, the Firefox website brags that its parent, Mozilla, has been "billionaire-free for 20+ years." Mozilla also owns Pocket, so you can easily use Firefox to save what you see online to that read-it-later service. For more, read Which Browser Is Best? and Top Firefox Tips.

    Best Free Text Editor

    Notepad++

    Windows

    Notepad++ is nothing like the anemic Notepad that Windows users grew used to over the decades. This free download has tabs, color-coded nesting text, WYSIWYG printing, and support for macros. It's a must for hand-coders or any writer who wants a minimalist interface.

    Best Power-User Note Taker

    Obsidian

    4.0 Excellent

    Windows, macOS, Linux, iOS, Android

    Obsidian’s got a learning curve, but once mastered, it's the best note-taker for power users. The free version is available for personal use; it lacks only support and sync options, but you can get around the sync by storing your Obsidian Vault in a spot where a cloud service backs it up.
    Obsidian review

    Best Free Doc Viewer and Annotator

    Okular

    Windows, Linux

    If you seek a free and full-fledged PDF editor, Okular can do the job. It boasts annotations and highlights, even digital signature support. It will also read many other formats, including ePub books, comics formats, and many types of images.

    Best To-Do List for Everyone

    Todoist

    5.0 Outstanding

    Windows, macOS, Linux, iOS, Android, Web

    This is our favorite to-do list app, ever. We give the paid version a full five-star review, but even the free version is fantastic. The Todoist interface is simple perfection on all platforms—even wearables and via email. The free version gives you five projects with five collaborators on each, supports uploads of 5MB files, and keeps a one-week active history.
    Todoist review

    Best Programming Environment

    Visual Studio Code

    Windows, macOS, Linux, Web

    Need to write some code? Use VS Code from Microsoft. It has everything you’d want in a coding environment, from plug-ins to great organization. And it's easy to get started with this program, even though you have to do a little setup to tweak it to perfection.

    Best Free Antivirus

    Avast One Basic

    4.5 Excellent

    Windows, macOS, iOS, Android

    Our Editors' Choice award winner for free antivirus this year is Avast One Basic. It's a top scorer against malware in lab tests, and it did great in our hands-on tests, too. It offers more free protection than ever.
    Avast One Basic review

    Best Free Secure Browser

    Brave

    4.0 Excellent

    Windows, macOS, Linux, iOS, Android

    Do you want to stop the trackers watching you online dead in their tracks? Going incognito on a standard browser isn't enough. You need to use a full-on privacy browser, one that blocks cookies and prevents the fingerprinting of your whole browser and computer. Brave is one of a slew of them with a rating for strong protection from the Electronic Frontier Foundation. For details, read The Best Private Browsers.
    Brave review

    Best Free Desktop Authenticator

    Ente Auth

    Windows, macOS, Linux, iOS, Android, Web

    When it comes to multi-factor authentication, the downside to most authenticator apps is that they're mobile-only. If you don't have your phone close by when asked for the code, you're out of luck. So, it's very nice to have a desktop MFA authenticator. Authy had one but killed it. Ente Auth is here to take up the slack. Set up your MFA logins with it on the phone or tablet, and all the codes sync with the desktop versions. Plus, it's always previewing your next code, so you don't have to wait, and it lets you share codes with a team.

    Best Free Password Manager

    Proton Pass

    4.5 Excellent

    Windows, macOS, Linux, Android, iOS, multiple browser extensions

    Proton already has a great reputation. Its Proton Pass offers the most outstanding password management of the year while charging you nothing. It includes email alias options, dark web monitoring, and password hygiene, all while managing an unlimited number of passwords and credentials. You can pay for extra features like credit card storage and data breach monitoring. For more, read our guide to The Best Free Password Managers.
    Proton Pass review

    Best Clipping with Annotations

    ClipClip

    Windows

    ClipClip holds multiple copied items in the clipboard, lets you extract text from images to paste, syncs on cloud services, allows history searches, and even does on-the-fly translation. It also allows for full-screen and video captures, plus edits and annotations.

    Best Synchronization of Clipboards

    Ditto

    3.5 Good

    Windows

    The clipboard has come a long way, but you can take it further with a tool like Ditto. It’ll not only show you everything you’ve copied, but also handle searches, allow multiple ways to select, and keep the contents of multiple computers’ clipboards synchronized.

    Best Free Local Search Tool

    Everything

    3.0 Good

    Windows

    Everything has been around a long while and continues plugging along to help people find the things on their PC that built-in search can’t seem to fathom. It can even look inside files, though it won’t index them. If you name files and folders carefully, it will bring you results fast.

    Best Free Backup and Synchronization Software

    IDrive

    4.5 Excellent

    Windows, macOS, Linux, iOS, Android

    IDrive is a PCMag Editors' Choice award winner for cloud storage and file sharing. You get 10GB free from IDrive to back up files from all your devices, an upgrade from the original 5GB. If that's enough capacity for you, you'll find this service more than up to your needs. It'll even back up your photos and videos from Facebook. Bonus: At this price tier, you don't have to give the company a credit card.
    IDrive review

    Best Media Viewer and Annotator

    IrfanView

    Windows

    IrfanView has been letting people view, edit, and organize media and more on Windows for well over a quarter century now. The current version supports everything from Windows Vista all the way up to Windows 11. The list of file format types you can click on, view, and annotate instantly is long, and the program's ease of use is legendary. And it's utterly free for personal use.

    Best Free Screen Capture Editor

    Gemoo Snap

    Windows, macOS

    When it comes to screengrabs, if the Snipping Tool in Windows doesn’t do it for you, Gemoo Snap is an excellent alternative. It's available for the desktop or just as a Chrome extension if you only capture web pages. You can snap a screen, then annotate it, share it, pull out text, or even “beautify” it with edits and new backgrounds.

    Best Free File Compression for Archives

    NanaZip

    Windows

    A lot of people adore the 7-Zip archiving software. NanaZip is a fork of the original code, meant to make the archive experience feel more native to Windows 10 and 11 by working right in the context menu of File Explorer.

    Best Free File Manager for Windows

    OneCommander

    Windows

    If you find the Windows 10 and 11 way of dealing with files—via the built-in File Explorer—a chore, consider an upgrade to a third-party file manager. OneCommander has all the extras you'd want, including tab support, file previews, dual-pane browsing, dark and light themes, and a lot more. Best of all: It's fast. And free for home use.

    Best Free File Recovery and Deletion

    Recuva

    3.5 Good

    Windows

    Recuva is a must for any techie's tool belt: It's the key to helping recover a lost file. It's easy to understand, but note: Recuva should really be installed before you lose a file. It's a portable application, too, so you have the option to run it from a USB thumb drive.

    Best for Screen Video Capture

    ScreenPal

    4.5 Excellent

    Windows, macOS, Android, iOS

    Want to capture more than a still image? ScreenPal will do it. The free-to-use-forever tier will take still shots, record up to 15 minutes of video of your screen, and share to social, plus store as much as you want online. The mobile apps will sync your captured files. We gave it an Editors' Choice award. You can pay for a yearly plan if you want unlimited full-screen video recording sans watermarks.
    ScreenPal review

    Best Free Power Screen Grabber

    ShareX

    Windows

    What ShareX lacks in sexiness it makes up for in power, offering just about every option one could wish for in capturing a Windows screen. It supports image effects add-ons such as backgrounds and borders, optical character recognition, and pre-set actions for processing captures just the way you like them.

    Best Free Screen Capture

    Microsoft Snip

    Windows

    Even those with modest screen-capture needs would say the old Snipping Tool in Windows was...lacking. The new version of Snipping Tool merges it with the Windows Snip & Sketch, which was itself an evolutionary leap. Now it's more revolutionary, as it can also capture things like video and voice. Plus, you can annotate a screengrab. For more, read The Best Screen Capture Apps.

    Best Free Simple File Backup

    SyncBackFree

    Windows

    SyncBack dates way back and still rocks at synchronizing backups. That includes the free version, which can copy files in both directions to make a restore as easy as a backup.

    Best Free Social Media Software

    TikTok Windows

    Windows, Web, iOS, Android

    You probably think of TikTok as a mobile-only phenomenon. However, not only can you access the video wonderland on the desktop at TikTok.com, but there's also a well-done app for it right in the Windows Store. TikTok for Windows won't work with your webcam, but you can use it to upload videos you edit to perfection with desktop video tools. It's all free but has ads for support—just like on the mobile version, they show up looking like videos you might want to see.

    Best Free File Transfer Program

    TeraCopy

    Windows, macOS, Android

    Sure, Windows itself copies files between folders and drives just fine. But TeraCopy can take over that job and do it faster, and its interface for making copies is better-looking. Plus, it provides more information and feedback, and it can even recover from transfer errors.

    Best Free VPN

    Proton VPN

    5.0 Outstanding

    Windows, ChromeOS, macOS, Linux, iOS, Android

    You probably should pay for a VPN, but you can save cash with a tool like the PCMag Editors' Choice award winner Proton VPN, albeit with a few restrictions. It's not just our pick for the best free VPN; it's our best VPN overall. With the free Proton VPN, your bandwidth is not limited, and the focus is mainly on keeping you secure. For more, read The Best Free VPNs.
    Proton VPN review

    Best Free Video Conferencing

    Zoom Workplace

    4.5 Excellent

    Windows, macOS, Linux, Web, iOS, Android

    Want to host an online meeting for you and 100 of your closest friends? Zoom Workplace will let them all in for free, with a 40-minute time limit. They can join from any device, even a smartphone. Competitively priced premium plans with additional features are also available. Zoom is a PCMag Editors' Choice award winner for communications and productivity. Also, check out our top Zoom tips.
    Zoom Workplace review
    ME.PCMAG.COM
    The Best Free Software for 2025
    It's a mobile world, but we have not fully abandoned the desktop. The real work (and a lot of the play) of computing requires a full personal computing system, and to get the most out of that, you need software.

    Software can be expensive, but free programs have been a mainstay of the desktop experience for decades, and today's offerings are pretty powerful. Software developers can adopt an ad-based model, donation-ware to keep things afloat, or a shareware/freemium model that charges for extra features.

    Something to always watch for: Crapware installers. To make ends meet, many creators of otherwise great free software, or the services that offer the programs for download, bundle in things you don't want. Worse, the installation routine obfuscates the steps, so you provide the unwanted program tacit permission to be installed. For more about how to spot and avoid this problem, see How to Rid a New PC of Crapware.

    A pro tip: Only download desktop software from the maker of the software directly. It's not foolproof (after all, developers want to eat, too), but it helps.

    Other Criteria:
    - The software must be available directly from the developer/creator/original publisher.
    - The software should (typically) have a Windows-based download; no browser extensions here, because we're not all on the same browser. However, we've included web-based apps that are as good as, or better than, most downloadable programs.
    - If the software is on a tiered sales model, the free version cannot be trial-ware. It has to have at least a free-for-life option.
    - Preferably, the program had an update in the last year or two.
    - The program should have little or no advertising to support it.
    - Software for productivity is what this list is about; there are plenty of other places to find free PC games.

    For more free software, check out The 100 Best iPhone Apps and The 100 Best Android Apps. Did we miss any free programs you can't live without? Let us know in the comments.

Best Free Audio-Editing Software: Audacity (4.0 Excellent; Windows, macOS, Linux)
Open-source Audacity can record and edit audio files on more tracks than you can imagine. It then outputs exactly what you need. It is perfect for noobs and pros alike and works on any desktop OS.

Best Free Simple Video Editor: CapCut (4.0 Excellent; Windows, macOS, iOS, Android, web)
While it seems like most video editing today takes place on phones, at least one mobile video editor has jumped to the desktop: ByteDance’s CapCut is on Windows; it's even in the Microsoft Store. In our review of the mobile version, we found it to be fast, easy, and powerful.

Best Free Advanced Video Editing: DaVinci Resolve (4.0 Excellent; Windows, macOS, Linux)
How on earth does Blackmagic Design make DaVinci Resolve so capable as a video editor yet still offer a free version? The hope is that as users get better at making videos, they’ll buy the full suite for the extras, even if it costs $395. Meanwhile, the free version can handle almost any 8-bit format up to 3,840 by 2,160 pixels for editing, color correction, VFX, motion graphics, and audio.

Best Free Video Converter: HandBrake (3.5 Good; Windows, macOS, Linux)
No one would call HandBrake simple, but few video transcoders (software that converts almost any video format into another video format) can compete when it comes to power and comprehensiveness. It's been around for over two decades and remains open-source.

Best Free Cartooning Tool: Pencil2D (Windows, macOS, Linux)
Open-source and multiplatform, the Pencil2D animation tool is what it sounds like: a way to quickly create two-dimensional animations by penciling in each frame. The site is full of video tutorials to help you get the gist.

Best Free Video Editing: Shotcut (3.5 Good; Windows, macOS, Linux)
While it lacks the slick interface found in most other video editors, Shotcut's got a lot of power.
It offers a phenomenal number of features and gets frequent improvement updates. Just don't expect it to feel like an Adobe product.

Best Free Game-Recording/Streaming Software: Streamlabs OBS (Windows, Web, iOS, Android)
Stream your video game sessions with Logitech's Streamlabs Desktop directly to YouTube, Twitch, or Facebook. You can switch between gameplay and your webcam, so you can show your face as you make commentary. There may be a learning curve, but you can find plenty of help online.

Best Free Video Player: VLC (Windows, macOS, Linux, iOS, Android)
VLC is the premier way to watch just about any video, no matter the clip's weird codec. VLC media player can auto-rotate smartphone videos taken at the wrong orientation and resume playback from where you left off during a previous session. Seriously, VLC plays back anything on all desktop platforms, and it guarantees no ads, tracking, or spyware. (For more, read How to Play DVDs and Blu-ray Discs in Windows.)

Best Free Messaging Software: Discord (4.5 Excellent; Windows, macOS, Linux, web, iOS, Android, Xbox, PlayStation)
Millions of people worldwide use Discord for text, voice chatting, and video chatting, mainly while kicking one another's arses in online games or watching gameplay streams on Twitch or Caffeine. You can pay a fee (starting at $2.99 per month) to go premium for better video and audio quality and to upload larger files.

Best Free Secure Messaging: Signal Private Messenger (4.5 Excellent; Windows, macOS, Linux, iOS, Android)
PCMag’s Editors’ Choice Award winner for secure messaging (for mobile or desktop) is Signal, which you may recall from a recent high-level scandal. It does it all: group chat, voice chat, and video chat, all with mandatory end-to-end encryption. You need the Android or iOS mobile app to register for Signal, but it also works on desktop OSes. Perhaps best of all, it’s owned by a nonprofit with no incentive to sell your data.

Best Free Remote Access: TeamViewer (4.5 Excellent; Windows, macOS, Linux, web, iOS, Android, ChromeOS)
PCMag's top pick for software that can control other computers is TeamViewer, which is only free for personal use. That version has everything you need: desktop sharing, file transfers, and chat with remote users. The setup couldn't be easier. Take control of a remote PC over an internet connection with the app, or use a browser with the TeamViewer extension. Just keep in mind that remote-access tools can be abused, so don't turn one on unless you're on the phone with the person you're allowing access to. And make sure to turn it off after you're done.

Best Free Friends and Family Messaging: WhatsApp (4.0 Excellent; Windows, macOS, Linux, web, iOS, Android)
If you want to avoid the giant corporations that run messaging services, maybe WhatsApp (which is owned by Meta) isn’t for you. But it is a massive service with a loyal user base, an easy-to-use interface, and self-destructing messages and images. It even uses the Signal protocol, so the folks at Meta can’t read what you send. But then again, you could just use Signal. Still, you might opt for WhatsApp if you have an existing platoon of friends and family using it.

Best Free Freeform Drawing: Adobe Fresco (4.5 Excellent; Windows, iOS)
You may think of Adobe Fresco, the company’s painting app, as strictly for mobile devices. But it is also available for Windows, whether you use it in tablet mode or not. The free version has its limits, but overall it makes drawing on a screen feel as close as you can get to drawing on paper.

Best Free AI: ChatGPT (4.0 Excellent; Windows, macOS, iOS, Android)
Does ChatGPT hallucinate and make mistakes? You better believe it.
But it's still the most advanced and mature generative AI available today, especially considering you can do a lot with it for free (like get unlimited access to GPT-4o mini, the fastest model offered by parent company OpenAI). It'll generate text and images (a limited amount per day) and even let you use the Deep Research function five times per month. You can do quite a bit without an account, but signing up unlocks features like saved chat history. And if you don't want to use it on the web, you can download ChatGPT apps for the operating systems above. For more, read our full review and note this disclosure: Ziff Davis, PCMag's parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

Best Free Painting Software: Krita (Windows, macOS, Linux)
Krita is a powerful, full-fledged painting tool for digital artists. It does come with a bit of a learning curve, but the nonexistent price tag and the vibrant community behind it make it more than worth digging into, especially if you’ve got artistic skills but no desire to pick up paint and brushes IRL.

Best Free Desktop Publishing Tool: Scribus (Windows, macOS, Linux)
Scribus is the open-source equivalent of Adobe InDesign for desktop publishing, or as close as you can get to it, with a history that goes back almost a quarter century. It has built-in color separation, color management, and a lot more, including its own wiki for documentation.

Best Free World-Building Tool: Shaxpir (Windows, macOS)
Pronounced like the playwright, Shaxpir is essentially a simplified version of our top-rated Scrivener, with an “everyone” free tier that is very useful. For no charge, you get the full manuscript builder, world-building notebook, progress tracker, offline use, and cloud backup. Still, pros might consider the $7.99-a-month subscription with extra features a bargain after the 30-day trial.

Best Free Screenwriting Tool: Trelby (Windows, Linux)
Do you fancy yourself a budding screenwriter but lack the funds for high-end tools like Final Draft? Trelby does a fine job of helping you format scripts correctly, remember character names, and import and export to formats used in Hollywood.

Best Free Android Emulation: BlueStacks 5 (Windows, macOS)
For a hot second, Windows 11 had an Android simulator that could play apps from the Amazon store, but that got shut down. The next best option is BlueStacks, which only takes up about 5GB of space and can access the Google Play Store. The emulator will help you map your mouse and keyboard to work with Android games. For more info, read Ways to Run Android Apps on Your PC for Free.

Best Free Social Photo Sharing: Instagram (for Windows Phone) (3.0 Good; Windows, Web)
Social media apps don’t have to just be on your phone. As with TikTok, you can get to the 'Gram on your desktop with this app found in the Microsoft Store. It’ll show you all the amazing images shared by people and brands you follow, as well as the Reels they generate.

Best Free Maps Software: Google Earth (Windows, macOS, Linux, Web, iOS, Android)
As if high-end software that lets you virtually fly across the globe isn't cool enough, Google Earth Pro for the desktop is totally free. It includes advanced features such as high-resolution printing, distance measuring, and global guided tours. Although it also comes in web and mobile versions, the desktop version is the only one that lets you view satellite images of the moon and Mars. Plus, it has star maps and will even let you go back in time.

Best Free Writing Tool: yWriter (3.5 Good; Windows, macOS, iOS, Android)
The highly structured interface of yWriter can help anyone, from budding to experienced novelists, get a real handle on their story and its characters. The program is full of stats on what you have written, providing you with a data-driven writing experience.
It doesn't have the depth of Scrivener, but it's free (or you can make a donation).

Best Free Media Center: Plex TV (Windows, macOS, Linux, iOS, Android, Xbox, PlayStation, Smart TVs, media hubs, NAS devices)
If you don't know or care what a media server is, but you just want to stream your videos and music collection around the house, Plex could work well for you. Install it on all your devices, point it at some media, and those audio and video files become available on everything, even remotely. For more, read How to Set Up a Plex Server, How to Share Your Plex Libraries, How to Organize Your Plex Media Library, and The Expert's Guide to Managing Your Plex Server.

Best Free File Viewer and Converter: FastStone Image Viewer (Windows only)
View, manage, and compare your images with this fast and intuitive freebie. FastStone Image Viewer supports a wide range of image formats, including unprocessed raw files from specific digital camera manufacturers. (For more, read What Are Raw Camera Files and Why Should You Use Them?) It also has companion apps for screenshots and photo resizing.

Best Free Photoshop Replacement: GNU Image Manipulation Program (GIMP) (3.5 Good; Windows, macOS, Linux)
GIMP is a stalwart of the open-source world. It's a full-featured Photoshop alternative with all the functions (including layers, filters, masking, and plug-ins) that image editors need. It may lack the polish and AI extras you get with Adobe’s product, but GIMP more than makes up for that by being really, truly free. You can get it for Windows in the Microsoft Store.

Best Free Graphics Software (Vector Editing): Inkscape (3.0 Good; Windows, macOS, Linux)
Adobe Illustrator is the high bar of vector image editing, but it has a premium price to match. You can still get cross-platform Scalable Vector Graphics image creation with the free Inkscape.
You'll have to work a little harder to learn it, but it may be exactly what a talented (but cash-strapped or subscription-shy) artist needs.

Best Free Graphics Software (Bitmap Editing): Paint.net (Windows)
Is Paint.net a perfect replacement for Photoshop? Nothing is as powerful as Adobe's program, but at this price (free), Paint.net comes close. For any minor (and even some major) picture manipulation, it's fast, comprehensive, and easy to use.

Best Free PDF Reader: Foxit PDF Reader (Windows, macOS)
Just about any browser can read a PDF. But Foxit PDF Reader is free, not just for reading but also for annotation and collaboration on files. The program allows you to send signed and edited PDF files to friends or coworkers and works seamlessly with the Foxit PDF Editor on mobile platforms. For more, read How to Convert PDFs to Word Documents and Image Files.

Best Free Grammar Help: Grammarly (4.0 Excellent; Windows, macOS, Web, iOS, Android)
If you use the internet, you’ve probably heard of Grammarly; the ads are everywhere. The free version provides plenty of insights and suggestions to improve all the words you put on the screen in almost any program. And, yes, it really can up your writing game.

Best Cross-Platform Note Taker: Joplin (4.5 Excellent; Windows, macOS, Linux, Web, iOS, Android)
Our review of Joplin calls it “the ideal note-taking app for users who value simplicity.” It lacks some advanced features, but the open-source tool works on all major platforms to do what you need most: store unlimited notes. You only pay if you want to get into sharing and collaboration. It even has a web clipper browser extension for grabbing notes as you traverse the internet.

Best Free Kanban Project Management: Kanri (Windows, macOS, Linux)
If you do any kind of project work or organizing that involves index cards, then you have probably embraced the Kanban board approach.
Kanri is a great, free way to Kanban your desktop without signing in or creating an account; it doesn't even need you to be online. As a bonus, it can import boards from big-name products like Trello.

Best Free Office Suite: LibreOffice (3.0 Good; Windows, macOS, Linux)
There aren't many free office suites, and only one is a free, open-source download available for the major desktop operating systems. LibreOffice could be a bit more polished, lacks collaboration features, and sports an overstuffed toolbar interface that might remind you of Microsoft Office a decade ago. But it's powerful nevertheless, and it easily converts and imports files from other systems. It comes with a word processor (Writer), a spreadsheet component (Calc), a presentation program (Impress), a vector drawing program (Draw), and even a full database (Base) and math-formula editor (Math).

Best Free Note-Taking App: Microsoft OneNote (4.5 Excellent; Windows, macOS, iOS, Android, Web)
Once just a part of Microsoft Office, the sublime OneNote has become a free, standalone powerhouse for note-taking across all the major operating systems. It still works with Office, syncs data across all platforms, and has full online access via Office.com, with storage on OneDrive. That's why it's our Editors' Choice pick for note storage.

Best Free Browser: Firefox (4.5 Excellent; Windows, macOS, Linux, iOS, Android)
The venerable browser Firefox remains highly customizable and strong on security, privacy, and performance. It stays cutting-edge without the backing of Big Tech; in fact, the Firefox website brags that its parent, Mozilla, has been "billionaire-free for 20+ years." Mozilla also owns Pocket, so you can easily use Firefox to save what you see online to that read-it-later service. For more, read Which Browser Is Best? and Top Firefox Tips.

Best Free Text Editor: Notepad++ (Windows)
Notepad++ is nothing like the anemic Notepad that Windows users grew used to over the decades.
This free download has tabs, color-coded nesting text, WYSIWYG printing, and support for macros. It's a must for hand-coders or any writer who wants a minimalist interface.

Best Power-User Note Taker: Obsidian (4.0 Excellent; Windows, macOS, Linux, iOS, Android)
Obsidian’s got a learning curve, but once mastered, it's the best note-taker for power users. The free version is available for personal use; it lacks only support and sync options, but you can get around the sync by storing your Obsidian Vault in a spot where a cloud service backs it up.

Best Free Doc Viewer and Annotator: Okular (Windows, Linux)
If you seek a free and full-fledged PDF editor, Okular can do the job (on Windows, where it's in the Microsoft Store, and on Linux). It boasts annotations and highlights, even digital signature support. It will also read many other formats, including ePub books, comics formats, and many types of images.

Best To-Do List for Everyone: Todoist (5.0 Outstanding; Windows, macOS, Linux, iOS, Android, Web)
This is our favorite to-do list app, ever. We give the paid version a full five-star review, but even the free version is fantastic. The Todoist interface is simple perfection on all platforms, even wearables and via email (where you can turn messages into tasks). The free version gives you five projects with five collaborators on each (working across 300 possible tasks), supports uploads of 5MB files, and keeps a one-week active history.

Best Programming Environment: Visual Studio Code (Windows, macOS, Linux, web)
Need to write some code? Use VS Code from Microsoft. It has everything you’d want in a coding environment, from plug-ins to great organization. And it's easy to get started with this program, even though you have to do a little setup to tweak it to perfection.

Best Free Antivirus: Avast One Basic (4.5 Excellent; Windows, macOS, iOS, Android)
Our Editors' Choice award winner for free antivirus this year is Avast One Basic.
It's a top scorer against malware in lab tests, and it did great in our hands-on tests, too. It offers more free protection than ever.

Best Free Secure Browser: Brave (4.0 Excellent; Windows, macOS, Linux, iOS, Android)
Do you want to stop online trackers dead? Going incognito on a standard browser isn't enough. You need to use a full-on privacy browser, one that blocks cookies and prevents the fingerprinting of your whole browser and computer. Brave is one of a slew of them with a rating for strong protection from the Electronic Frontier Foundation. For details, read The Best Private Browsers.

Best Free Desktop Authenticator: Ente Auth (Windows, macOS, Linux, iOS, Android, web)
When it comes to multi-factor authentication, the downside to most authenticator apps is that they're mobile-only. If you don't have your phone close by when asked for the code, you're out of luck. So, it's very nice to have a desktop MFA authenticator. Authy had one but killed it. Ente Auth is here to take up the slack. Set up your MFA logins with it on the phone or tablet, and all the codes sync with the desktop versions. Plus, it's always previewing your next code, so you don't have to wait, and it lets you share codes with a team.

Best Free Password Manager: Proton Pass (4.5 Excellent; Windows, macOS, Linux, Android, iOS, multiple browser extensions)
Proton already has a great reputation. Its Proton Pass offers the most outstanding password management of the year while charging you nothing. It includes email alias options, dark web monitoring, and password hygiene (it'll tell you when you have reused or weak passwords that need updating, pronto), all while managing an unlimited number of passwords and credentials. You can pay for extra features like credit card storage and data breach monitoring. For more, read our guide to The Best Free Password Managers.

Best Clipping with Annotations: ClipClip (Windows)
ClipClip holds multiple copied items in the clipboard, lets you extract text from images to paste, syncs on cloud services, allows history searches, and even does on-the-fly translation. It also allows for full-screen and video captures, plus edits and annotations.

Best Synchronization of Clipboards: Ditto (Windows)
The clipboard has come a long way, but you can take it further with a tool like Ditto. It’ll not only show you everything you’ve copied, but also handle searches, allow multiple ways to select, and keep the contents of multiple computers’ clipboards synchronized.

Best Free Local Search Tool: Everything (3.0 Good; Windows)
Everything has been around a long while and continues plugging along to help people find the things on their PC that built-in search can’t seem to fathom. It can even look inside files, though it won’t index them. If you name files and folders carefully, it will bring you results fast.

Best Free Backup and Synchronization Software: IDrive (4.5 Excellent; Windows, macOS, Linux, iOS, Android)
IDrive is a PCMag Editors' Choice award winner for cloud storage and file sharing. You get 10GB free from IDrive to back up files from all your devices, an upgrade from the original 5GB. If that's enough capacity for you, you'll find this service more than up to your needs. It'll even back up your photos and videos from Facebook. Bonus: At this price tier, you don't have to give the company a credit card.

Best Media Viewer and Annotator: IrfanView (Windows)
IrfanView has been letting people view, edit, and organize media and more on Windows for well over a quarter century now. The current version supports Vista all the way up to 11. The list of file format types you can click on, view, and annotate instantly is long, and the program's ease of use is legendary. And it's utterly free for personal use.

Best Free Screen Capture Editor: Gemoo Snap (Windows, macOS)
When it comes to screengrabs, if the Snipping Tool in Windows doesn’t do it for you, Gemoo Snap is an excellent alternative. It's available for the desktop (including on macOS) or just as a Chrome extension if you only capture web pages. You can snap a screen, then annotate it, share it, pull out text, or even “beautify” it with edits and new backgrounds.

Best Free File Compression for Archives: NanaZip (Windows)
A lot of people adore the 7-zip archiving software. NanaZip is a fork of the original code, meant to make the archive experience feel more native to Windows 10 and 11 by working right in the context menu of File Explorer.

Best Free File Manager for Windows: OneCommander (Windows)
If you find the Windows 10 and 11 way of dealing with files (via the built-in File Explorer) a chore, consider an upgrade to a third-party file manager. OneCommander has all the extras you'd want, including tab support, file previews, dual-pane browsing, dark and light themes, and a lot more. Best of all: It's fast. And free for home use.

Best Free File Recovery and Deletion: Recuva (3.5 Good; Windows)
Recuva (say it out loud) is a must for any techie's tool belt: It's the key to helping recover a lost file. It's easy to understand, but note: Recuva should really be installed before you lose a file. It's a portable application, too, so you have the option to run it from a USB thumb drive.

Best for Screen Video Capture: ScreenPal (4.5 Excellent; Windows, macOS, Android, iOS)
Want to capture more than a still image? ScreenPal (previously called Screencast-O-Matic) will do it. The free-to-use-forever tier will take still shots, up to 15 minutes of video of your screen (with a watermark), and share to social, plus store as much as you want online. The mobile apps will sync your captured files. We gave it an Editors' Choice award. You can pay $48 a year if you want unlimited full-screen video recording sans watermarks.

Best Free Power Screen Grabber: ShareX (Windows)
What ShareX lacks in sexiness it makes up for in power, offering just about every option one could wish for in capturing a Windows screen (including video screen recording and GIF exports). It supports image effects add-ons such as backgrounds and borders, optical character recognition, and pre-set actions for processing captures just the way you like them.

Best Free Screen Capture: Microsoft Snip (Windows)
Even those with modest screen-capture needs would say the old Snipping Tool in Windows was...lacking. The new version of Snipping Tool merges it with the Windows Snip & Sketch, which was itself an evolutionary leap. Now it's more revolutionary, as it can also capture things like video and voice. Plus, you can annotate a screengrab. For more, read The Best Screen Capture Apps.

Best Free Simple File Backup: SyncBackFree (Windows)
SyncBack dates way back and still rocks at synchronizing backups. That includes the free version, which can copy files in both directions to make a restore as easy as a backup.

Best Free Social Media Software: TikTok (Windows, Web, iOS, Android)
You probably think of TikTok as a mobile-only phenomenon. However, not only can you access the video wonderland on the desktop at TikTok.com, but there's also a well-done app for it right in the Microsoft Store. TikTok for Windows won't work with your webcam, but you can use it to upload videos you edit to perfection with desktop video tools. It's all free but has ads for support; just like on the mobile version, they show up looking like videos you might want to see.

Best Free File Transfer Program: TeraCopy (Windows, macOS, Android)
Sure, Windows itself copies files between folders and drives just fine. But TeraCopy can take over that job and do it faster, and its interface for making copies is better-looking. Plus, it provides more information and feedback, and it can even recover from transfer errors.

Best Free VPN: Proton VPN (Windows) (5.0 Outstanding; Windows, ChromeOS, macOS, Linux, iOS, Android)
You probably should pay for a VPN, but you can save cash with a tool like the PCMag Editors' Choice award winner Proton VPN, albeit with a few restrictions. It's not just our pick for the best free VPN; it's our best VPN overall. With the free Proton VPN, your bandwidth is not limited, and the focus is mainly on keeping you secure. For more, read The Best Free VPNs.

Best Free Video Conferencing: Zoom Workplace (4.5 Excellent; Windows, macOS, Linux, web, iOS, Android)
Want to host an online meeting for you and 100 of your closest friends? Zoom Workplace will let them all in for free, with a 40-minute time limit. They can join from any device, even a smartphone. Competitively priced premium plans with additional features are also available. Zoom is a PCMag Editors' Choice award winner for communications (with end-to-end encryption) and productivity (even the free version has team chat and whiteboards). Also, check out our top Zoom tips.