Agent-Based Debugging Gets a Cost-Effective Alternative: Salesforce AI Presents SWERank for Accurate and Scalable Software Issue Localization
Identifying the exact location of a software issue—such as a bug or feature request—remains one of the most labor-intensive tasks in the development lifecycle.
Despite advances in automated patch generation and code assistants, the process of pinpointing where in the codebase a change is needed often consumes more time than determining how to fix it.
Agent-based approaches powered by large language models (LLMs) have made headway by simulating developer workflows through iterative tool use and reasoning.
However, these systems are typically slow, brittle, and expensive to operate, especially when built on closed-source models.
In parallel, existing code retrieval models—while faster—are not optimized for the verbosity and behavioral focus of real-world issue descriptions.
This misalignment between natural language inputs and code search capability presents a fundamental challenge for scalable automated debugging.
SWERank — A Practical Framework for Precise Localization
To address these limitations, Salesforce AI has introduced SWERank, a lightweight and effective retrieve-and-rerank framework tailored for software issue localization.
SWERank is designed to bridge the gap between efficiency and precision by reframing localization as a code ranking task.
The framework consists of two key components:
SWERankEmbed, a bi-encoder retrieval model that encodes GitHub issues and code snippets into a shared embedding space for efficient similarity-based retrieval.
SWERankLLM, a listwise reranker built on instruction-tuned LLMs that refines the ranking of retrieved candidates using contextual understanding.
To train this system, the research team curated SWELOC, a large-scale dataset extracted from public GitHub repositories, linking real-world issue reports with corresponding code changes.
SWELOC introduces contrastive training examples using consistency filtering and hard-negative mining to ensure data quality and relevance.
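The two data-quality steps named above can be pictured with a small sketch. It assumes similarity scores between an issue and each candidate function have already been computed by some baseline embedding model. The function name, the `top_m` cutoff, and `num_negatives` are hypothetical illustrations, not values taken from the SWELOC pipeline.

```python
from typing import Dict, List, Optional, Tuple


def build_training_example(
    scores: Dict[str, float],  # function id -> similarity to the issue (assumed precomputed)
    positive_id: str,          # function changed by the fix linked to the issue
    top_m: int = 3,            # hypothetical consistency cutoff
    num_negatives: int = 2,    # hypothetical number of hard negatives to keep
) -> Optional[Tuple[str, List[str]]]:
    """Return (positive, hard_negatives), or None if the pair is filtered out."""
    # Rank all candidate functions by similarity to the issue, best first.
    ranked = sorted(scores, key=scores.get, reverse=True)

    # Consistency filtering (assumed form): keep the pair only if the
    # positive already ranks among the top_m candidates.
    if positive_id not in ranked[:top_m]:
        return None

    # Hard negatives: the most similar functions that are not the positive.
    negatives = [fid for fid in ranked if fid != positive_id][:num_negatives]
    return positive_id, negatives
```

With toy scores `{"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}`, a positive of `"b"` is kept with `"a"` and `"c"` as hard negatives, while a positive of `"d"` is discarded because it falls outside the top three.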
Architecture and Methodological Contributions
At its core, SWERank follows a two-stage pipeline.
First, SWERankEmbed maps a given issue description and candidate functions into dense vector representations.
Using a contrastive InfoNCE loss, the retriever is trained to increase the similarity between an issue and its true associated function while reducing its similarity to unrelated code snippets.
Notably, the model benefits from carefully mined hard negatives—code functions that are semantically similar but not relevant—which improve the model’s discriminative capability.
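A minimal sketch of an InfoNCE-style loss for a single issue follows, written in plain Python for readability. The temperature value, the use of cosine similarity, and the toy vectors are assumptions for illustration, and the actual SWERankEmbed training code may differ.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(
    issue: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.05,  # assumed value, not taken from the paper
) -> float:
    """Negative log-probability of the positive among positive + negatives."""
    logits = [cosine(issue, positive) / temperature]
    logits += [cosine(issue, neg) / temperature for neg in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


issue_vec = [1.0, 0.0]
positive_vec = [0.9, 0.1]
easy_loss = info_nce_loss(issue_vec, positive_vec, [[0.0, 1.0]])
hard_loss = info_nce_loss(issue_vec, positive_vec, [[0.8, 0.2]])
```

In this toy setup the hard negative sits close to the positive in embedding space, so `hard_loss` is larger than `easy_loss`. That larger loss is the extra training signal hard negatives are meant to provide.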
The reranking stage leverages SWERankLLM, a listwise LLM-based reranker that processes an issue description along with top-k code candidates and generates a ranked list where the relevant code appears at the top.
Importantly, the training objective is adapted to settings where only the true positive is known.
The model is trained to output the identifier of the relevant code snippet, maintaining compatibility with listwise inference while simplifying the supervision process.
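The supervision scheme described above might look roughly like the following sketch. The prompt wording, the `[n]` identifier format, and the fallback for identifiers the model omits are all assumptions made for illustration and are not the actual SWERankLLM prompt or decoding logic.

```python
import re
from typing import List, Tuple


def build_rerank_example(
    issue: str, candidates: List[str], positive_index: int
) -> Tuple[str, str]:
    """Build a (prompt, target) pair where the target is only the positive's identifier."""
    lines = ["Issue:", issue, "", "Candidates:"]
    for i, code in enumerate(candidates, start=1):
        lines.append(f"[{i}] {code}")
    lines += ["", "Rank the candidates by relevance to the issue, most relevant first."]
    prompt = "\n".join(lines)
    # Only the true positive is known, so the training target is its identifier.
    target = f"[{positive_index + 1}]"
    return prompt, target


def parse_ranking(output: str, num_candidates: int) -> List[int]:
    """Turn model output such as '[3] > [1]' into a full ranking of 0-based indices."""
    ranked: List[int] = []
    for match in re.findall(r"\[(\d+)\]", output):
        idx = int(match) - 1
        if 0 <= idx < num_candidates and idx not in ranked:
            ranked.append(idx)
    # Candidates the model did not mention keep their retrieval order at the end.
    missing = [i for i in range(num_candidates) if i not in ranked]
    return ranked + missing
```

Because the parser accepts either a single identifier or a full ordered list, a model trained only to emit the positive's identifier can still be used to produce a complete ranking at inference time.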
Together, these components allow SWERank to offer high performance without requiring multiple rounds of interaction or costly agent orchestration.
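Putting the two stages together, a single-pass retrieve-and-rerank flow could be sketched as below. The embeddings are toy vectors and the reranker is a stand-in callable. In the real system these would be produced by SWERankEmbed and SWERankLLM, whose interfaces are not described in this article, so every name here is hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def retrieve(issue_vec: Vector, code_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1: return the ids of the k functions most similar to the issue."""
    ranked = sorted(
        code_vecs, key=lambda fid: cosine(issue_vec, code_vecs[fid]), reverse=True
    )
    return ranked[:k]


def localize(
    issue_vec: Vector,
    code_vecs: Dict[str, Vector],
    rerank: Callable[[List[str]], List[str]],  # stand-in for an LLM reranker
    k: int = 2,
) -> List[str]:
    """Stage 2: rerank the retrieved candidates in one pass, with no agent loop."""
    candidates = retrieve(issue_vec, code_vecs, k)
    return rerank(candidates)


toy_index = {"parse": [1.0, 0.0], "render": [0.7, 0.7], "save": [0.0, 1.0]}
top_two = retrieve([1.0, 0.1], toy_index, k=2)
final = localize([1.0, 0.1], toy_index, rerank=lambda ids: list(reversed(ids)))
```

The reranker only ever sees the `k` retrieved candidates, which is what keeps the expensive LLM call bounded regardless of repository size.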
Insights
Evaluations on SWE-Bench-Lite and LocBench—two standard benchmarks for software localization—demonstrate that SWERank achieves state-of-the-art results across file, module, and function levels.
On SWE-Bench-Lite, SWERankEmbed-Large (7B) attained a function-level accuracy@10 of 82.12%, outperforming even LocAgent running with Claude-3.5.
When coupled with SWERankLLM-Large (32B), function-level accuracy@10 improved further to 88.69%, setting a new state of the art for this task.
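For readers unfamiliar with the metric, accuracy@k can be computed as in the sketch below. It assumes the stricter definition, in which an example counts as a hit only if every ground-truth location appears in the top k. The article does not spell out the definition, so treat this as one plausible reading.

```python
from typing import List, Set


def accuracy_at_k(rankings: List[List[str]], gold: List[Set[str]], k: int) -> float:
    """Fraction of examples whose gold locations all appear in the top-k ranking."""
    if not rankings:
        return 0.0
    hits = 0
    for ranked, relevant in zip(rankings, gold):
        # Assumed definition: all relevant locations must be within the top k.
        if relevant <= set(ranked[:k]):
            hits += 1
    return hits / len(rankings)


toy_rankings = [["a", "b", "c"], ["x", "y", "z"]]
toy_gold = [{"b"}, {"x", "z"}]
acc_at_2 = accuracy_at_k(toy_rankings, toy_gold, k=2)
acc_at_3 = accuracy_at_k(toy_rankings, toy_gold, k=3)
```

In the toy data, the second example needs both `"x"` and `"z"`, so it only counts as a hit once k reaches 3.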
In addition to performance gains, SWERank offers substantial cost benefits.
Compared to Claude-powered agents, which average around $0.66 per example, SWERankLLM’s inference cost is $0.011 for the 7B model and $0.015 for the 32B variant, delivering up to a 6x better accuracy-to-cost ratio.
Moreover, the 137M-parameter SWERankEmbed-Small model achieves competitive results, demonstrating the framework’s scalability and efficiency even on lightweight architectures.
Beyond benchmark performance, experiments also show that SWELOC data improves a broad class of embedding and reranking models.
Models pre-trained for general-purpose retrieval exhibited significant accuracy gains when fine-tuned with SWELOC, validating its utility as a training resource for issue localization tasks.
Conclusion
SWERank introduces a compelling alternative to traditional agent-based localization approaches by modeling software issue localization as a ranking problem.
Through its retrieve-and-rerank architecture, SWERank delivers state-of-the-art accuracy while maintaining low inference cost and minimal latency.
The accompanying SWELOC dataset provides a high-quality training foundation, enabling robust generalization across various codebases and issue types.
By decoupling localization from agentic multi-step reasoning and grounding it in efficient neural retrieval, Salesforce AI demonstrates that practical, scalable solutions for debugging and code maintenance are not only possible—but well within reach using open-source tools.
SWERank sets a new bar for accuracy, efficiency, and deployability in automated software engineering.
Check out the Paper and Project Page. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc.
As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good.
His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience.
The platform boasts of over 2 million monthly views, illustrating its popularity among audiences.
Source: https://www.marktechpost.com/2025/05/13/agent-based-debugging-gets-a-cost-effective-alternative-salesforce-ai-presents-swerank-for-accurate-and-scalable-software-issue-localization/