Research Focus: Week of January 27, 2025
www.microsoft.com
In this edition:

We introduce FLAVARS, a multimodal foundation language and vision alignment model for remote sensing; managed-retention memory, a new class of memory that is better optimized to store key data structures for AI inference workloads; and enhanced detection of macular telangiectasia type 2 (MacTel 2) using self-supervised learning and ensemble models.

We present a new approach to generalizing symbolic automata, which brings together a variety of classic automata and logics in a unified framework with all the necessary ingredients to support symbolic model checking modulo A.

And we invite you to join an upcoming workshop: LLM4Eval@WSDM 2025: Large Language Models for Evaluation in Information Retrieval. Using LLMs for evaluation (LLM4Eval) is a promising technique in the areas of automated judgments, natural language generation, and retrieval augmented generation (RAG) systems. Researchers from Microsoft and experts from industry and academia will explore this technique at an interactive workshop on Friday, March 14, in Hanover, Germany.

NEW RESEARCH

In the field of remote sensing, imagery is generally dense with objects and visual content that can vary regionally across the globe. This creates a need for vision-language datasets to describe imagery in great detail, and for pretraining to better balance visual task performance while retaining the ability to perform zero-shot classification and image-text retrieval.

One strategy is to combine paired satellite images and text captions for pretraining performant encoders for downstream tasks.
However, while contrastive image-text methods like CLIP enable vision-language alignment and zero-shot classification, CLIP's vision-only downstream performance tends to degrade compared to image-only pretraining methods such as Masked Autoencoders (MAE).

To better approach multimodal pretraining for remote sensing, researchers from Microsoft propose a pretraining method that combines the best of both contrastive learning and masked modeling, along with geospatial alignment via contrastive location encoding, in the recent paper FLAVARS: A Multimodal Foundational Language and Vision Alignment Model for Remote Sensing. The research shows that FLAVARS significantly outperforms a SkyCLIP baseline on vision-only tasks such as KNN classification and semantic segmentation (+6% mIOU on SpaceNet1), while retaining the ability to perform zero-shot classification, unlike MAE-pretrained methods.

NEW RESEARCH

AI clusters today are one of the major uses of high bandwidth memory (HBM), a high-performance type of computer memory. However, HBM is suboptimal for AI inference workloads for several reasons. Analysis shows that HBM is overprovisioned on write performance, underprovisioned on density and read bandwidth, and has significant energy-per-bit overhead. It is also expensive, with lower yield than DRAM due to manufacturing complexity.

In a recent paper, Managed-Retention Memory: A New Class of Memory for the AI Era, researchers from Microsoft propose managed-retention memory (MRM), a memory class that is better optimized to store key data structures for AI inference workloads. The paper makes the case that MRM may finally provide a path to viability for technologies that were originally proposed to support storage class memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance.
MRM makes different trade-offs: by understanding the workload IO patterns, MRM forgoes long-term data retention and write performance in exchange for better potential performance on the metrics important for AI inference.

NEW RESEARCH

Macular telangiectasia type 2 (MacTel) is a retinal disease that is challenging to diagnose. While increased awareness has led to improved diagnostic outcomes, MacTel diagnosis relies significantly upon a multimodal image set and the expertise of clinicians familiar with the disease. Optical coherence tomography (OCT) imaging has emerged as a valuable tool for the diagnosis and monitoring of various retinal diseases.

With the increasing integration of OCT into clinical practice, deep learning models may be able to achieve accurate MacTel prediction comparable to that of retinal specialists, even when working with limited data.

Researchers from Microsoft and external colleagues address this challenge in a recent paper, Enhanced Macular Telangiectasia Type 2 Detection: Leveraging Self-Supervised Learning and Ensemble Models. Published in the journal Ophthalmology Science, the paper focuses on the accurate classification of macular telangiectasia type 2 using OCT images, with the overarching goal of facilitating early and precise detection of this neurodegenerative disease.

The researchers present results leveraging self-supervised learning and ensemble models, showing that their approach improves both MacTel classification accuracy and interpretability when compared to the use of individual models.
Ensemble models exhibited superior agreement with the assessments of the most experienced individual human experts, as well as the ensemble of human experts.

Microsoft Research Podcast

Collaborators: Silica in space with Richard Black and Dexter Greene

College freshman Dexter Greene and Microsoft research manager Richard Black discuss how technology that stores data in glass is supporting students as they expand earlier efforts to communicate what it means to be human to extraterrestrials.

NEW RESEARCH

Symbolic automata are finite state automata that support potentially infinite alphabets, such as the set of rational numbers, generally applied to regular expressions and languages over finite words. In symbolic automata (or automata modulo A), an alphabet is represented by an effective Boolean algebra A, supported by a decision procedure for satisfiability. Regular languages over infinite words (so-called ω-regular languages) have a rich history paralleling that of regular languages over finite words, with well-known applications to model checking via Büchi automata and temporal logics.

In a recent paper, Symbolic Automata: Omega-Regularity Modulo Theories, researchers from Microsoft generalize symbolic automata to support ω-regular languages via transition terms and symbolic derivatives. This brings together a variety of classic automata and logics in a unified framework that provides all the necessary ingredients to support symbolic model checking modulo A.

EVENT

LLMs have shown increasing task-solving abilities not present in smaller models.
Using LLMs for automated evaluation (LLM4Eval) is a promising technique in the areas of automated judgments, natural language generation, and retrieval augmented generation (RAG) systems.

Join researchers from Microsoft and experts from industry and academia for a discussion on using LLMs for evaluation in information retrieval at the LLM4Eval Workshop at WSDM 2025, March 14, 2025, in Hanover, Germany.

This interactive workshop will cover automated judgments, RAG pipeline evaluation, altering human evaluation, robustness, and trustworthiness of LLMs for evaluation, in addition to their impact on real-world applications. The organizers believe that the information retrieval community can significantly contribute to this growing research area by designing, implementing, analyzing, and evaluating various aspects of LLMs with applications to LLM4Eval tasks.

Microsoft Research | In case you missed it

Microsoft Team Uses Diffusion Model For Materials Science
January 21, 2025
"Finding a new material for a target application is like finding a needle in a haystack," write the authors of a blog post at Microsoft, where they have been working on just such a program, something called, aptly, MatterGen.

Microsoft AutoGen v0.4: A turning point toward more intelligent AI agents for enterprise developers
January 18, 2025
The world of AI agents is undergoing a revolution, and Microsoft's release of AutoGen v0.4 this week marked a significant leap forward in this journey. Positioned as a robust, scalable and extensible framework, AutoGen represents Microsoft's latest attempt to address the challenges of building multi-agent systems for enterprise applications.

2 AI breakthroughs unlock new potential for health and science
January 17, 2025
Two new research papers published this week in scientific journals, one in Nature and one in Nature Machine Intelligence, show how generative AI foundation models can exponentially speed up scientific discovery of new materials and help doctors access and analyze radiology results faster.

ChatGPT gets proactive with 'Tasks'
January 15, 2025
Good morning, AI enthusiasts. OpenAI's AI agent era just got its unofficial start with ChatGPT gaining the ability to schedule and manage daily tasks. With Tasks rolling out and mysterious Operator whispers in the air, is OpenAI finally ready to move from chatbots to full-on autonomous assistants?

Mayo Clinic and Microsoft partner to advance generative AI in radiology
January 15, 2025
The Mayo Clinic is seeking to advance the use of generative artificial intelligence in imaging through a new collaboration with Microsoft Research. The duo made the announcement during the 43rd Annual J.P. Morgan Healthcare Conference taking place now in San Francisco.