NVIDIA
NVIDIA
This is the Official NVIDIA Page
11 people like this
268 Posts
2 Photos
0 Videos
0 Reviews
Recent Updates
  • How Scaling Laws Drive Smarter, More Powerful AI
    blogs.nvidia.com
    Just as there are widely understood empirical laws of nature for example, what goes up must come down, or every action has an equal and opposite reaction the field of AI was long defined by a single idea: that more compute, more training data and more parameters makes a better AI model.However, AI has since grown to need three distinct laws that describe how applying compute resources in different ways impacts model performance. Together, these AI scaling laws pretraining scaling, post-training scaling and test-time scaling, also called long thinking reflect how the field has evolved with techniques to use additional compute in a wide variety of increasingly complex AI use cases.The recent rise of test-time scaling applying more compute at inference time to improve accuracy has enabled AI reasoning models, a new class of large language models (LLMs) that perform multiple inference passes to work through complex problems, while describing the steps required to solve a task. Test-time scaling requires intensive amounts of computational resources to support AI reasoning, which will drive further demand for accelerated computing.What Is Pretraining Scaling?Pretraining scaling is the original law of AI development. It demonstrated that by increasing training dataset size, model parameter count and computational resources, developers could expect predictable improvements in model intelligence and accuracy.Each of these three elements data, model size, compute is interrelated. Per the pretraining scaling law, outlined in this research paper, when larger models are fed with more data, the overall performance of the models improves. To make this feasible, developers must scale up their compute creating the need for powerful accelerated computing resources to run those larger training workloads.This principle of pretraining scaling led to large models that achieved groundbreaking capabilities. It also spurred major innovations in model architecture, including the rise of billion- and trillion-parameter transformer models, mixture of experts models and new distributed training techniques all demanding significant compute.And the relevance of the pretraining scaling law continues as humans continue to produce growing amounts of multimodal data, this trove of text, images, audio, video and sensor information will be used to train powerful future AI models.Pretraining scaling is the foundational principle of AI development, linking the size of models, datasets and compute to AI gains. Mixture of experts, depicted above, is a popular model architecture for AI training.What Is Post-Training Scaling?Pretraining a large foundation model isnt for everyone it takes significant investment, skilled experts and datasets. But once an organization pretrains and releases a model, they lower the barrier to AI adoption by enabling others to use their pretrained model as a foundation to adapt for their own applications.This post-training process drives additional cumulative demand for accelerated computing across enterprises and the broader developer community. Popular open-source models can have hundreds or thousands of derivative models, trained across numerous domains.Developing this ecosystem of derivative models for a variety of use cases could take around 30x more compute than pretraining the original foundation model.Developing this ecosystem of derivative models for a variety of use cases could take around 30x more compute than pretraining the original foundation model.Post-training techniques can further improve a models specificity and relevance for an organizations desired use case. While pretraining is like sending an AI model to school to learn foundational skills, post-training enhances the model with skills applicable to its intended job. An LLM, for example, could be post-trained to tackle a task like sentiment analysis or translation or understand the jargon of a specific domain, like healthcare or law.The post-training scaling law posits that a pretrained models performance can further improve in computational efficiency, accuracy or domain specificity using techniques including fine-tuning, pruning, quantization, distillation, reinforcement learning and synthetic data augmentation.Fine-tuning uses additional training data to tailor an AI model for specific domains and applications. This can be done using an organizations internal datasets, or with pairs of sample model input and outputs.Distillation requires a pair of AI models: a large, complex teacher model and a lightweight student model. In the most common distillation technique, called offline distillation, the student model learns to mimic the outputs of a pretrained teacher model.Reinforcement learning, or RL, is a machine learning technique that uses a reward model to train an agent to make decisions that align with a specific use case. The agent aims to make decisions that maximize cumulative rewards over time as it interacts with an environment for example, a chatbot LLM that is positively reinforced by thumbs up reactions from users. This technique is known as reinforcement learning from human feedback (RLHF). Another, newer technique, reinforcement learning from AI feedback (RLAIF), instead uses feedback from AI models to guide the learning process, streamlining post-training efforts.Best-of-n sampling generates multiple outputs from a language model and selects the one with the highest reward score based on a reward model. Its often used to improve an AIs outputs without modifying model parameters, offering an alternative to fine-tuning with reinforcement learning.Search methods explore a range of potential decision paths before selecting a final output. This post-training technique can iteratively improve the models responses.To support post-training, developers can use synthetic data to augment or complement their fine-tuning dataset. Supplementing real-world datasets with AI-generated data can help models improve their ability to handle edge cases that are underrepresented or missing in the original training data.Post-training scaling refines pretrained models using techniques like fine-tuning, pruning and distillation to enhance efficiency and task relevance.What Is Test-Time Scaling?LLMs generate quick responses to input prompts. While this process is well suited for getting the right answers to simple questions, it may not work as well when a user poses complex queries. Answering complex questions an essential capability for agentic AI workloads requires the LLM to reason through the question before coming up with an answer.Its similar to the way most humans think when asked to add two plus two, they provide an instant answer, without needing to talk through the fundamentals of addition or integers. But if asked on the spot to develop a business plan that could grow a companys profits by 10%, a person will likely reason through various options and provide a multistep answer.Test-time scaling, also known as long thinking, takes place during inference. Instead of traditional AI models that rapidly generate a one-shot answer to a user prompt, models using this technique allocate extra computational effort during inference, allowing them to reason through multiple potential responses before arriving at the best answer.On tasks like generating complex, customized code for developers, this AI reasoning process can take multiple minutes, or even hours and can easily require over 100x compute for challenging queries compared to a single inference pass on a traditional LLM, which would be highly unlikely to produce a correct answer in response to a complex problem on the first try.This AI reasoning process can take multiple minutes, or even hours and can easily require over 100x compute for challenging queries compared to a single inference pass on a traditional LLM.This test-time compute capability enables AI models to explore different solutions to a problem and break down complex requests into multiple steps in many cases, showing their work to the user as they reason. Studies have found that test-time scaling results in higher-quality responses when AI models are given open-ended prompts that require several reasoning and planning steps.The test-time compute methodology has many approaches, including:Chain-of-thought prompting: Breaking down complex problems into a series of simpler steps.Sampling with majority voting: Generating multiple responses to the same prompt, then selecting the most frequently recurring answer as the final output.Search: Exploring and evaluating multiple paths present in a tree-like structure of responses.Post-training methods like best-of-n sampling can also be used for long thinking during inference to optimize responses in alignment with human preferences or other objectives.Test-time scaling enhances inference by allocating extra compute to improve AI reasoning, enabling models to tackle complex, multi-step problems effectively.How Test-Time Scaling Enables AI ReasoningThe rise of test-time compute unlocks the ability for AI to offer well-reasoned, helpful and more accurate responses to complex, open-ended user queries. These capabilities will be critical for the detailed, multistep reasoning tasks expected of autonomous agentic AI and physical AI applications. Across industries, they could boost efficiency and productivity by providing users with highly capable assistants to accelerate their work.In healthcare, models could use test-time scaling to analyze vast amounts of data and infer how a disease will progress, as well as predict potential complications that could stem from new treatments based on the chemical structure of a drug molecule. Or, it could comb through a database of clinical trials to suggest options that match an individuals disease profile, sharing its reasoning process about the pros and cons of different studies.In retail and supply chain logistics, long thinking can help with the complex decision-making required to address near-term operational challenges and long-term strategic goals. Reasoning techniques can help businesses reduce risk and address scalability challenges by predicting and evaluating multiple scenarios simultaneously which could enable more accurate demand forecasting, streamlined supply chain travel routes, and sourcing decisions that align with an organizations sustainability initiatives.And for global enterprises, this technique could be applied to draft detailed business plans, generate complex code to debug software, or optimize travel routes for delivery trucks, warehouse robots and robotaxis.AI reasoning models are rapidly evolving. OpenAI o1-mini and o3-mini, DeepSeek R1, and Google DeepMinds Gemini 2.0 Flash Thinking were all introduced in the last few weeks, and additional new models are expected to follow soon.Models like these require considerably more compute to reason during inference and generate correct answers to complex questions which means that enterprises need to scale their accelerated computing resources to deliver the next generation of AI reasoning tools that can support complex problem-solving, coding and multistep planning.Learn about the benefits of NVIDIA AI for accelerated inference.
    0 Comments ·0 Shares ·16 Views
  • Safety First: Leading Partners Adopt NVIDIA Cybersecurity AI to Safeguard Critical Infrastructure
    blogs.nvidia.com
    The rapid evolution of generative AI has created countless opportunities for innovation across industry and research. As is often the case with state-of-the-art technology, this evolution has also shifted the landscape of cybersecurity threats, creating new security requirements. Critical infrastructure cybersecurity is advancing to thwart the next wave of emerging threats in the AI era.Leading operational technology (OT) providers today showcased at the S4 conference for industrial control systems (ICS) and OT cybersecurity how theyre adopting the NVIDIA cybersecurity AI platform to deliver real-time threat detection and critical infrastructure protection.Armis, Check Point, CrowdStrike, Deloitte and World Wide Technology (WWT) are integrating the platform to help customers bolster critical infrastructure, such as energy, utilities and manufacturing facilities, against cyber threats.Critical infrastructure operates in highly complex environments, where the convergence of IT and OT, often accelerated by digital transformation, creates a perfect storm of vulnerabilities. Traditional cybersecurity measures are no longer sufficient to address these emerging threats.By harnessing NVIDIAs cybersecurity AI platform, these partners can provide exceptional visibility into critical infrastructure environments, achieving robust and adaptive security while delivering operational continuity.The platform integrates NVIDIAs accelerated computing and AI, featuring NVIDIA BlueField-3 DPUs, NVIDIA DOCA and the NVIDIA Morpheus AI cybersecurity framework, part of the NVIDIA AI Enterprise. This combination enables real-time threat detection, empowering cybersecurity professionals to respond swiftly at the edge and across networks.Unlike conventional solutions that depend on intrusive methods or software agents, BlueField-3 DPUs function as a virtual security overlay. They inspect network traffic and safeguard host integrity without disrupting operations. Acting as embedded sensors within each server, they stream telemetry data to NVIDIA Morpheus, enabling detailed monitoring of host activities, network traffic and application behaviors seamlessly and without operational impact.Driving Cybersecurity Innovation Across IndustriesIntegrating Armis Centrix, Armis AI-powered cyber exposure management platform, with NVIDIA cybersecurity AI helps secure critical infrastructure like energy, manufacturing, healthcare and transportation. OT environments are increasingly targeted by sophisticated cyber threats, requiring robust solutions that ensure both security and operational continuity, said Nadir Izrael, chief technology officer and cofounder of Armis. Combining Armis unmatched platform for OT security and cyber exposure management with NVIDIA BlueField-3 DPUs enables organizations to comprehensively protect cyber-physical systems without disrupting operations.CrowdStrike is helping secure critical infrastructure such as ICS and OT by deploying its CrowdStrike Falcon security agent on BlueField-3 DPUs to boost real-time AI-powered threat detection and response.OT environments are under increasing threat, demanding AI-powered security that adapts in real time, said Raj Rajamani, head of products at CrowdStrike. By integrating NVIDIA BlueField-3 DPUs with the CrowdStrike Falcon platform, were extending industry-leading protection to critical infrastructure without disrupting operations delivering unified protection at the edge and helping organizations stay ahead of modern threats.Deloitte is driving customers digital transformation, enabled by NVIDIAs cybersecurity AI platform, to help meet the demands of breakthrough technologies that require real-time, granular visibility into data center networks to defend against increasingly sophisticated threats.Protecting OT and ICS systems is becoming increasingly challenging as organizations embrace digital transformation and interconnected technologies, said Dmitry Dudorov, an AI security leader at Deloitte U.K. Harnessing NVIDIAs cybersecurity AI platform can enable organizations to determine threat detection, enhance resilience and safeguard their infrastructure to accelerate their efforts. A Safer Future, Powered by AINVIDIAs cybersecurity AI platform, combined with the expertise of ecosystem partners, offers a powerful and scalable solution to protect critical infrastructure environments against evolving threats. Bringing NVIDIA AI and accelerated computing to the forefront of OT security can help organizations protect what matters most now and in the future.Learn more by attending the NVIDIA GTC global AI conference, running March 17-21, where Armis, Check Point and CrowdStrike cybersecurity leaders will host sessions about their collaborations with NVIDIA.
    0 Comments ·0 Shares ·28 Views
  • What Are Foundation Models?
    blogs.nvidia.com
    Editors note: This article, originally published on March 13, 2023, has been updated.The mics were live and tape was rolling in the studio where the Miles Davis Quintet was recording dozens of tunes in 1956 for Prestige Records.When an engineer asked for the next songs title, Davis shot back, Ill play it, and tell you what it is later.Like the prolific jazz trumpeter and composer, researchers have been generating AI models at a feverish pace, exploring new architectures and use cases. According to the 2024 AI Index report from the Stanford Institute for Human-Centered Artificial Intelligence, 149 foundation models were published in 2023, more than double the number released in 2022.In a 2021 paper, researchers reported that foundation models are finding a wide array of uses.They said transformer models, large language models (LLMs), vision language models (VLMs) and other neural networks still being built are part of an important new category they dubbed foundation models.Foundation Models DefinedA foundation model is an AI neural network trained on mountains of raw data, generally with unsupervised learning that can be adapted to accomplish a broad range of tasks.Two important concepts help define this umbrella category: Data gathering is easier, and opportunities are as wide as the horizon.No Labels, Lots of OpportunityFoundation models generally learn from unlabeled datasets, saving the time and expense of manually describing each item in massive collections.Earlier neural networks were narrowly tuned for specific tasks. With a little fine-tuning, foundation models can handle jobs from translating text to analyzing medical images to performing agent-based behaviors.I think weve uncovered a very small fraction of the capabilities of existing foundation models, let alone future ones, said Percy Liang, the centers director, in the opening talk of the first workshop on foundation models.AIs Emergence and HomogenizationIn that talk, Liang coined two terms to describe foundation models:Emergence refers to AI features still being discovered, such as the many nascent skills in foundation models. He calls the blending of AI algorithms and model architectures homogenization, a trend that helped form foundation models. (See chart below.)The field continues to move fast.A year after the group defined foundation models, other tech watchers coined a related term generative AI. Its an umbrella term for transformers, large language models, diffusion models and other neural networks capturing peoples imaginations because they can create text, images, music, software, videos and more.Generative AI has the potential to yield trillions of dollars of economic value, said executives from the venture firm Sequoia Capital who shared their views in a recent AI Podcast.A Brief History of Foundation ModelsWe are in a time where simple methods like neural networks are giving us an explosion of new capabilities, said Ashish Vaswani, an entrepreneur and former senior staff research scientist at Google Brain who led work on the seminal 2017 paper on transformers.That work inspired researchers who created BERT and other large language models, making 2018 a watershed moment for natural language processing, a report on AI said at the end of that year.Google released BERT as open-source software, spawning a family of follow-ons and setting off a race to build ever larger, more powerful LLMs. Then it applied the technology to its search engine so users could ask questions in simple sentences.In 2020, researchers at OpenAI announced another landmark transformer, GPT-3. Within weeks, people were using it to create poems, programs, songs, websites and more.Language models have a wide range of beneficial applications for society, the researchers wrote.Their work also showed how large and compute-intensive these models can be. GPT-3 was trained on a dataset with nearly a trillion words, and it sports a whopping 175 billion parameters, a key measure of the power and complexity of neural networks. In 2024, Google released Gemini Ultra, a state-of-the-art foundation model that requires 50 billion petaflops.This chart highlights the exponential growth in training compute requirements for notable machine learning models since 2012. (Source: Artificial Intelligence Index Report 2024)I just remember being kind of blown away by the things that it could do, said Liang, speaking of GPT-3 in a podcast.The latest iteration, ChatGPT trained on 10,000 NVIDIA GPUs is even more engaging, attracting over 100 million users in just two months. Its release has been called the iPhone moment for AI because it helped so many people see how they could use the technology.One timeline describes the path from early AI research to ChatGPT. (Source: blog.bytebytego.com)Going MultimodalFoundation models have also expanded to process and generate multiple data types, or modalities, such as text, images, audio and video. VLMs are one type of multimodal models that can understand video, image and text inputs while producing text or visual output.Trained on 355,000 videos and 2.8 million images,Cosmos Nemotron 34B is a leading VLM that enables the ability to query and summarize images and video from the physical or virtual world.From Text to ImagesAbout the same time ChatGPT debuted, another class of neural networks, called diffusion models, made a splash. Their ability to turn text descriptions into artistic images attracted casual users to create amazing images that went viral on social media.The first paper to describe a diffusion model arrived with little fanfare in 2015. But like transformers, the new technique soon caught fire.In a tweet, Midjourney CEO David Holz revealed that his diffusion-based, text-to-image service has more than 4.4 million users. Serving them requires more than 10,000 NVIDIA GPUs mainly for AI inference, he said in an interview (subscription required).Toward Models That Understand the Physical WorldThe next frontier of artificial intelligence is physical AI, which enables autonomous machines like robots and self-driving cars to interact with the real world.AI performance for autonomous vehicles or robots requires extensive training and testing. To ensure physical AI systems are safe, developers need to train and test their systems on massive amounts of data, which can be costly and time-consuming.World foundation models, which can simulate real-world environments and predict accurate outcomes based on text, image, or video input, offer a promising solution.Physical AI development teams are using NVIDIA Cosmos world foundation models, a suite of pre-trained autoregressive and diffusion models trained on 20 million hours of driving and robotics data, with the NVIDIA Omniverse platform to generate massive amounts of controllable, physics-based synthetic data for physical AI. Awarded the Best AI And Best Overall Awards at CES 2025, Cosmos world foundation models are open models that can be customized for downstream use cases or improve precision on a specific task using use case-specific data.Dozens of Models in UseHundreds of foundation models are now available. One paper catalogs and classifies more than 50 major transformer models alone (see chart below).The Stanford group benchmarked 30 foundation models, noting the field is moving so fast they did not review some new and prominent ones.Startup NLP Cloud, a member of the NVIDIA Inception program that nurtures cutting-edge startups, says it uses about 25 large language models in a commercial offering that serves airlines, pharmacies and other users. Experts expect that a growing share of the models will be made open source on sites like Hugging Faces model hub.Experts note a rising trend toward releasing foundation models as open source.Foundation models keep getting larger and more complex, too.Thats why rather than building new models from scratch many businesses are already customizing pretrained foundation models to turbocharge their journeys into AI, using online services like NVIDIA AI Foundation Models.The accuracy and reliability of generative AI is increasing thanks to techniques like retrieval-augmented generation, aka RAG, that lets foundation models tap into external resources like a corporate knowledge base.AI Foundations for BusinessAnother new framework, the NVIDIA NeMo framework, aims to let any business create its own billion- or trillion-parameter transformers to power custom chatbots, personal assistants and other AI applications.It created the 530-billion parameter Megatron-Turing Natural Language Generation model (MT-NLG) that powers TJ, the Toy Jensen avatar that gave part of the keynote at NVIDIA GTC last year.Foundation models connected to 3D platforms like NVIDIA Omniverse will be key to simplifying development of the metaverse, the 3D evolution of the internet. These models will power applications and assets for entertainment and industrial users.Factories and warehouses are already applying foundation models inside digital twins, realistic simulations that help find more efficient ways to work.Foundation models can ease the job of training autonomous vehicles and robots that assist humans on factory floors and logistics centers. They also help train autonomous vehicles by creating realistic environments like the one below.New uses for foundation models are emerging daily, as are challenges in applying them.Several papers on foundation and generative AI models describing risks such as:amplifying bias implicit in the massive datasets used to train models,introducing inaccurate or misleading information in images or videos, andviolating intellectual property rights of existing works.Given that future AI systems will likely rely heavily on foundation models, it is imperative that we, as a community, come together to develop more rigorous principles for foundation models and guidance for their responsible development and deployment, said the Stanford paper on foundation models.Current ideas for safeguards include filtering prompts and their outputs, recalibrating models on the fly and scrubbing massive datasets.These are issues were working on as a research community, said Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. For these models to be truly widely deployed, we have to invest a lot in safety.Its one more field AI researchers and developers are plowing as they create the future.
    0 Comments ·0 Shares ·35 Views
  • NVIDIA CEO Awarded for Advancing Precision Medicine With Accelerated Computing, AI
    blogs.nvidia.com
    NVIDIAs contributions to accelerating medical imaging, genomics, computational chemistry and AI-powered robotics were honored Friday at the Precision Medicine World Conference in Santa Clara, California, where NVIDIA founder and CEO Jensen Huang received a Luminary award.The Precision Medicine World Conference brings together healthcare leaders, top global researchers and innovators across biotechnology. Its Luminary award recognizes people transforming healthcare by advancing precision medicine in the clinic.For nearly two decades, NVIDIA has advanced computing in healthcare working with researchers and industry leaders to build instruments that enable scientists to better understand life sciences, medical imaging and genomics.We built, if you will, a computational instrument. Not a gene sequencer and all the incredible scientific instruments that you all talk about here in our case, it was a programmable scientific instrument, Huang said in his acceptance speech. We built it in service of researchers and scientists as you strive to better understand life in our universe.The first use of accelerated computing in life sciences was in the 2000s and the introduction of the NVIDIA CUDA parallel computing platform in 2006 paved the path for researchers to demonstrate how NVIDIA GPUs could be used in medical imaging applications like CT reconstruction.NVIDIA developed and continues to develop GPUs that are at the heart of AI and machine learning that are changing the world, including precision medicine, said Dr. Gad Getz, an internationally acclaimed leader in cancer genomics and the director of bioinformatics at the Massachusetts General Hospital, as he presented the award.Today, NVIDIA AI and accelerated computing is impacting analysis, interpretation and translation of sequencing data, new sequencing technologies, imaging data, spatial technologies, single-cell genomics, proteomics, molecular dynamics and drug development, as well as the large language models that can be used by doctors, patients, students and teachers to learn this field, Getz said.Advancing Precision Medicine With Accelerated ComputingHuang spoke about the ways AI will support the work of doctors, scientists and researchers advancing medicine. By investing in AI, he explained, research organizations and businesses can set up a powerful flywheel that continuously improves in accuracy, efficiency and insights by integrating additional data and feedback from every expert who interacts with it over time.Even though people say you want humans in the loop with AI, in fact, the opposite is true. You want AI in the loop with humans, Huang said. The reason for that is because when the AI is in the loop with humans, it codifies our life experience. If theres an AI in the loop with every single researcher, scientist, engineer and marketer every single employee in your company that AI in the loop codifies that life experience and keeps it in the company.Looking ahead, Huang said that in the coming years, AI will advance with incredible speed and revolutionize the healthcare industry. AI will help doctors predict, diagnose and treat disease in ways we never thought possible. It will scan a patients genome in seconds, identifying risks before symptoms even appear. AI will build a digital twin of us and model how a tumor evolves, predicting which treatments will work best.I wouldnt be surprised if before 2030, within this decade, were representing basically all cells, said Huang. We have a representation of it, we understand the language of it, and we can predict what happens.Huang predicts that surgical robots will perform minimally invasive procedures with unparalleled precision, robotic caregivers will assist nurses and other healthcare professionals, and robotic labs will run experiments around the clock, accelerating drug discovery. AI assistants, he said, will let doctors focus on what matters most to them: patients.In his talk, Huang also thanked the medical research community and highlighted how great breakthroughs come from partnerships between technology companies, researchers, biotech firms and healthcare leaders. Over 4,000 healthcare companies are part of the NVIDIA Inception program designed to help startups evolve faster.Learn more about accelerated computing in healthcare at NVIDIA GTC, a global AI conference taking place March 17-21 in San Jose, California.
    0 Comments ·0 Shares ·57 Views
  • Technovation Empowers Girls in AI, Making AI Education More Inclusive and Engaging
    blogs.nvidia.com
    Tara Chklovski has spent much of her career inspiring young women to take on some of the worlds biggest challenges using technology.The founder and CEO of education nonprofit Technovation joined the AI Podcast in 2019 to discuss the AI Family Challenge. Now, she returns to explain how inclusive AI makes the world a better and, crucially, less boringplace.In this episode of the NVIDIA AI Podcast, Chklovski and Anshita Saini, a Technovation alumna and member of the technical staff at OpenAI, explore how the nonprofit empowers girls worldwide through technology education.They discuss the organizations growth from its early days to its current focus on AI education and real-world problem-solving.Anshita Saini speaking at the Technovation World Summit event.In addition, Saini shares her journey from creating an app that helped combat a vaping crisis at her high school, to her first exposure to AI, through to her current role working on ChatGPT. She also talks about Wiser AI, an initiative she recently founded to support women leaders and other underrepresented voices in artificial intelligence.The AI Podcast Tara Chklovksi, Anshita Saini on Technovation Pioneering AI Education for Innovation Episode 245Technovation is preparing the next generation of female leaders in AI and technology. Learn about the opportunity to mentor a team of girls for the 2025 season.And learn more about the latest technological advancements by registering for NVIDIA GTC, the conference for the era of AI, taking place March 17-21.Time Stamps2:21 Recognizing AIs revolutionary potential in 2016.5:39 Technovations pioneering approach to incorporating ChatGPT in education.12:17 Saini builds an app through Technovation that addressed a real problem at her high school.29:12 The importance of having women represented on software development teams.You Might Also LikeNVIDIAs Louis Stewart on How AI Is Shaping Workforce DevelopmentLouis Stewart, head of strategic initiatives for NVIDIAs global developer ecosystem, discusses why workforce development is crucial for maximizing AI benefits. He emphasizes the importance of AI education, inclusivity and public-private partnerships in preparing the global workforce for the future. Engaging with AI tools and understanding their impact on the workforce landscape is vital to ensuring these changes benefit everyone.Currents of Change: ITIFs Daniel Castro on Energy-Efficient AI and Climate ChangeAI is everywhere. So, too, are concerns about advanced technologys environmental impact. Daniel Castro, vice president of the Information Technology and Innovation Foundation and director of its Center for Data Innovation, discusses his AI energy use report that addresses misconceptions about AIs energy consumption. He also talks about the need for continued development of energy-efficient technology.How AI Can Enhance Disability Inclusion and EducationU.S. Special Advisor on International Disability Rights at the U.S. Department of State Sara Minkara and Timothy Shriver, chairman of the board of Special Olympics, discuss AIs potential to enhance disability inclusion and education. They discuss the need to hear voices from the disability community in conversations about AI development and policy. They also cover why building an inclusive future is good for societys collective cultural, financial and social well-being.Subscribe to the AI PodcastGet the AI Podcast through Amazon Music, Apple Podcasts, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, SoundCloud, Spotify, Stitcher and TuneIn.
    0 Comments ·0 Shares ·50 Views
  • AI-Designed Proteins Take on Deadly Snake Venom
    blogs.nvidia.com
    Every year, venomous snakes kill over 100,000 people and leave 300,000 more with devastating injuries amputations, paralysis and permanent disabilities. The victims are often farmers, herders and children in rural communities across sub-Saharan Africa, South Asia and Latin America. For them, a snakebite isnt just a medical crisis its an economic catastrophe.Treatment hasnt changed in over a century. Antivenoms derived from the blood of immunized animals are expensive, difficult to manufacture and often ineffective against the deadliest toxins. Worse, they require refrigeration and trained medical staff, making them unreachable for many who need them most.Now, a team led by Susana Vzquez Torres, a computational biologist working in Nobel Prize winner David Bakers renowned protein design lab at the University of Washington, has used AI to create entirely new proteins that neutralize lethal snake venom in laboratory tests faster, cheaper and more effectively than traditional antivenoms. Their research, published in Nature, introduces a new class of synthetic proteins that successfully protect animals from otherwise lethal doses of snake venom toxins.Susana Vazquez Torres conducts drug-development research. Credit: Ian C. Haydon, UW Medicine Institute for Protein DesignHow AI Cracked the Code on VenomFor over a century, antivenom production has relied on animal immunization, requiring thousands of snake milkings and plasma extractions. Torres and her team hope to replace this with AI-driven protein design, compressing years of work into weeks.Using NVIDIA Ampere and L40 GPUs, the Baker Lab used its deep learning models, including RFdiffusion and ProteinMPNN, to generate millions of potential antitoxin structures in silico, or in computer simulations. Instead of screening a vast number of these proteins in a lab, they used AI tools to predict how the designer proteins would interact with snake venom toxins, rapidly homing in on the most promising designs.The results were remarkable:Newly designed proteins bound tightly to three-finger toxins (3FTx), the deadliest components of elapid venom, effectively neutralizing their toxic effects.Lab tests confirmed their high stability and neutralization capability.Mouse studies showed an 80-100% survival rate following exposure to lethal neurotoxins.The AI-designed proteins were small, heat-resistant and easy to manufacture no cold storage required.A Lifeline for the Most Neglected VictimsUnlike traditional antivenoms, which cost hundreds of dollars per dose, it may be possible to mass-produce these AI-designed proteins at low cost, making life-saving treatment available where its needed most.Many snakebite victims cant afford antivenom or delay seeking care due to cost and accessibility barriers. In some cases, the financial burden of treatment can push entire families deeper into poverty. With an accessible, affordable and shelf-stable antidote, millions of lives and livelihoods could be saved.Beyond Snakebites: The Future of AI-Designed MedicineThis research isnt just about snakebites. The same AI-driven approach could be used to design precision treatments for viral infections, autoimmune diseases and other hard-to-treat conditions, according to the researchers.By replacing trial-and-error drug development with algorithmic precision, researchers using AI to design proteins are working to make life-saving medicines more affordable and accessible worldwide.Torres and her collaborators including researchers from the Technical University of Denmark, University of Northern Colorado and Liverpool School of Tropical Medicine are now focused on preparing these venom-neutralizing proteins for clinical testing and large-scale production.If successful, this AI-driven advancement could save lives, and uplift families and communities around the world.
    0 Comments ·0 Shares ·82 Views
  • When the Earth Talks, AI Listens
    blogs.nvidia.com
    AI built for speech is now decoding the language of earthquakes.A team of researchers from the Earth and environmental sciences division at Los Alamos National Laboratory repurposed Metas Wav2Vec-2.0, an AI model designed for speech recognition, to analyze seismic signals from Hawaiis 2018 Klauea volcano collapse.Their findings, published in Nature Communications, suggest that faults emit distinct signals as they shift patterns that AI can now track in real time. While this doesnt mean AI can predict earthquakes, the study marks an important step toward understanding how faults behave before a slip event.Seismic records are acoustic measurements of waves passing through the solid Earth, said Christopher Johnson, one of the studys lead researchers. From a signal processing perspective, many similar techniques are applied for both audio and seismic waveform analysis.The AI model was tested using data from the 2018 collapse of Hawaiis Klauea caldera, which triggered months of earthquakes and reshaped the volcanic landscape. The lava lake in Halemaumau during the 2020-2021 eruption (USGS/F. Trusdell) is a striking reminder of Klaueas ongoing activity.Big earthquakes dont just shake the ground they upend economies. In the past five years, quakes in Japan, Turkey and California have caused tens of billions of dollars in damage and displaced millions of people.Thats where AI comes in. Led by Johnson, along with Kun Wang and Paul Johnson, the Los Alamos team tested whether speech-recognition AI could make sense of fault movements deciphering the tremors like words in a sentence.To test their approach, the team used data from the dramatic 2018 collapse of Hawaiis Klauea caldera, which triggered a series of earthquakes over three months.The AI analyzed seismic waveforms and mapped them to real-time ground movement, revealing that faults might speak in patterns resembling human speech.Speech recognition models like Wav2Vec-2.0 are well-suited for this task because they excel at identifying complex, time-series data patterns whether involving human speech or the Earths tremors.The AI model outperformed traditional methods, such as gradient-boosted trees, which struggle with the unpredictable nature of seismic signals. Gradient-boosted trees build multiple decision trees in sequence, refining predictions by correcting previous errors at each step.However, these models struggle with highly variable, continuous signals like seismic waveforms. In contrast, deep learning models like Wav2Vec-2.0 excel at identifying underlying patterns.How AI Was Trained to Listen to the EarthUnlike previous machine learning models that required manually labeled training data, the researchers used a self-supervised learning approach to train Wav2Vec-2.0. The model was pretrained on continuous seismic waveforms and then fine-tuned using real-world data from Klaueas collapse sequence.NVIDIA accelerated computing played a crucial role in processing vast amounts of seismic waveform data in parallel. High-performance NVIDIA GPUs accelerated training, enabling the AI to efficiently extract meaningful patterns from continuous seismic signals.Whats Still Missing: Can AI Predict Earthquakes?While the AI showed promise in tracking real-time fault shifts, it was less effective at forecasting future displacement. Attempts to train the model for near-future predictions essentially, asking it to anticipate a slip event before it happens yielded inconclusive results.We need to expand the training data to include continuous data from other seismic networks that contain more variations in naturally occurring and anthropogenic signals, he explained.A Step Toward Smarter Seismic MonitoringDespite the challenges in forecasting, the results mark an intriguing advancement in earthquake research. This study suggests that AI models designed for speech recognition may be uniquely suited to interpreting the intricate, shifting signals faults generate over time.This research, as applied to tectonic fault systems, is still in its infancy, Johnson. The study is more analogous to data from laboratory experiments than large earthquake fault zones, which have much longer recurrence intervals. Extending these efforts to real-world forecasting will require further model development with physics-based constraints.So, no, speech-based AI models arent predicting earthquakes yet. But this research suggests they could one day if scientists can teach it to listen more carefully.Read the full paper, Automatic Speech Recognition Predicts Contemporaneous Earthquake Fault Displacement, to dive deeper into the science behind this groundbreaking research.
    0 Comments ·0 Shares ·97 Views
  • Building More Builders: Gooey.AI Makes AI More Accessible Across Communities
    blogs.nvidia.com
    When non-technical users can create and deploy reliable AI workflows, organizations can do more to serve their clientelePlatforms for developing no- and low-code solutions are bridging the gap between powerful AI models and everyone whod like to harness them.Gooey.AI, a member of the NVIDIA Inception program for cutting-edge startups, offers one such platform, enabling teams to tap into multiple AI tools to improve productivity for frontline workers across the globe. Cofounders Sean Blagsvedt and Archana Prasad join the NVIDIA AI Podcast to discuss how the startups platform is making AI development accessible to developers and non-coders alike.The founders detail Gooey.AIs evolution from a British Council-funded arts project to a comprehensive, open-source, cloud-hosted platform serving over 1 million users in diverse industries like agriculture, healthcare and frontline services. The companys vision centers on democratizing AI development through shareable AI recipes, as well as helping ensure responsible implementation and representation of historically underserved communities in AI model-building.Prasad and Blagsvedt discuss unique applications, such as multilingual chatbots that support African farmers via messaging apps and AI assistants that help heating, ventilation, and air conditioning technicians access technical documentation.Given the rapid adoption of low-code AI platforms is helping organizations of all sizes and charters overcome technical barriers while improving access to expertise, Blagsvedt noted, You cant [create] good technology that changes the world just by focusing on the technology you have to find the problem worth solving.The AI Podcast AI for Everyone: How Gooey.AI Empowers Global Frontline Workers with Low Code Workflows Episode 244Learn more about the latest advancements in AI by registering for NVIDIA GTC, the conference for the era of AI, taking place March 17-21.Time Stamps00:31 How a development platform began life as a British Council arts project called Dara.network.17:53 Working with the Gates Foundation, DigitalGreen and Opportunity International on agricultural chatbots.33:21 The influence of HTML standards and Kubernetes on Gooey.AIs approach.You Might Also LikeNVIDIAs Louis Stewart on How AI Is Shaping Workforce DevelopmentLouis Stewart, head of strategic initiatives for NVIDIAs global developer ecosystem, discusses why workforce development is crucial for maximizing AI benefits. He emphasizes the importance of AI education, inclusivity and public-private partnerships in preparing the global workforce for the future. Engaging with AI tools and understanding their impact on the workforce landscape is vital for ensuring these changes benefit everyone.Living Optics CEO Robin Wang on Democratizing Hyperspectral ImagingStep into the realm of the unseen with Robin Wang, CEO of Living Optics. Living Optics hyperspectral imaging camera, which can capture visual data across 96 colors, reveals details invisible to the human eye. Potential applications are as diverse as monitoring plant health to detecting cracks in bridges. Living Optics aims to empower users across industries to gain new insights from richer, more informative datasets fueled by hyperspectral imaging technology.Yotta CEO Sunil Gupta on Supercharging Indias Fast-Growing AI MarketIndias AI market is expected to be massive. Yotta Data Services is setting its sights on supercharging it. Sunil Gupta, cofounder, managing director and CEO of Yotta Data Services, details the companys Shakti Cloud offering, which provides scalable GPU services for enterprises of all sizes. Yotta is the first Indian cloud services provider in the NVIDIA Partner Network, and its Shakti Cloud is Indias fastest AI supercomputing infrastructure, with 16 exaflops of compute capacity supported by over 16,000 NVIDIA H100 GPUs.Subscribe to the AI PodcastGet theAI PodcastthroughAmazon Music,Apple Podcasts,Google Podcasts,Google Play,Castbox, DoggCatcher,Overcast,PlayerFM, Pocket Casts,Podbay,PodBean, PodCruncher, PodKicker,SoundCloud,Spotify,StitcherandTuneIn.
    0 Comments ·0 Shares ·89 Views
  • Medieval Mayhem Arrives With Kingdom Come: Deliverance II on GeForce NOW
    blogs.nvidia.com
    GeForce NOW celebrates its fifth anniversary this February with a lineup of five major releases. The month kicks off with Kingdom Come: Deliverance II. Prepare for a journey back in time Warhorse Studios newest medieval role-playing game (RPG) comes to GeForce NOW on its launch day, bringing 15th-century Bohemia to devices everywhere.Experience the highly anticipated sequels stunning open world at GeForce RTX quality in the cloud, available to stream across devices at launch. It leads seven games joining the GeForce NOW library of over 2,000 titles, along with MARVEL vs. CAPCOM Fighting Collection: Arcade Classics.Chainmail Meets the CloudThe fate of a kingdom rests in your hands.Kingdom Come: Deliverance II continues the epic, open-world RPG saga set in the brutal and realistic medieval world of Bohemia. Continue the story of Henry, a blacksmiths son turned warrior, as he navigates political intrigue and warfare. Explore a world twice the size of the original, whether in the bustling streets of Kuttenberg or the picturesque Bohemian Paradise.The game builds on its predecessors realistic combat system by introducing crossbows, early firearms and a host of new weapons, while refining its already sophisticated melee combat mechanics. Navigate a complex narrative full of difficult decisions, forge alliances with powerful figures, engage in tactical large-scale battles and face moral dilemmas that impact both the journey and fate of the kingdom all while experiencing a historically rich environment faithful to the period.The game also features enhanced graphics powered by GeForce RTX, making it ideal to stream on GeForce NOW even without a game-ready rig. Experience all the medieval action at up to 4K and 120 frames per second with eight-hour sessions using an Ultimate membership, or 1440p and 120 fps with six-hour sessions using a Performance membership. Enjoy seamless gameplay, stunning visuals and smooth performance throughout the vast, immersive world of Bohemia.Sound the Alarm for New GamesMove out the way!Experience every aspect of a paramedics life in Ambulance Life: A Paramedic Simulator from Nacon Games. Quickly reach the accident site, take care of the injured and apply first aid. Each accident is different. Its up to players to adapt and make the right choices while being fast and efficient. Explore three different Districts containing a variety of environments. At each accident site, analyze the situation to precisely determine the right treatment for each patient. Build a reputation, unlock new tools and get assigned to new districts with thrilling new situations.Look for the following games available to stream in the cloud this week:Kingdom Come: Deliverance II (New release on Steam, Feb. 4)Sid Meiers Civilization VII (New release on Steam and Epic Games Store, Advanced access on Feb. 5)Ambulance Life: A Paramedic Simulator (New Release on Steam, Feb. 6)SWORN (New release on Steam, Feb. 6)Alan Wake (Xbox, available on the Microsoft Store)Ashes of the Singularity: Escalation (Xbox, available on the Microsoft Store)Far Cry: New Dawn (New release on PC Game Pass, Feb. 4)What are you planning to play this weekend? Let us know on X or in the comments below.Ultimate medieval game weapon go! NVIDIA GeForce NOW (@NVIDIAGFN) February 5, 2025
    0 Comments ·0 Shares ·87 Views
  • AI Pays Off: Survey Reveals Financial Industrys Latest Technological Trends
    blogs.nvidia.com
    The financial services industry is reaching an important milestone with AI, as organizations move beyond testing and experimentation to successful AI implementation, driving business results.NVIDIAs fifth annual State of AI in Financial Services report shows how financial institutions have consolidated their AI efforts to focus on core applications, signaling a significant increase in AI capability and proficiency.AI Helps Drive Revenue and Save CostsCompanies investing in AI are seeing tangible benefits, including increased revenue and cost savings.Nearly 70% of respondents report that AI has driven a revenue increase of 5% or more, with a dramatic rise in those seeing a 10-20% revenue boost. In addition, more than 60% of respondents say AI has helped reduce annual costs by 5% or more. Nearly a quarter of respondents are planning to use AI to create new business opportunities and revenue streams.The top generative AI use cases in terms of return on investment (ROI) are trading and portfolio optimization, which account for 25% of responses, followed by customer experience and engagement at 21%. These figures highlight the practical, measurable benefits of AI as it transforms key business areas and drives financial gains.Overcoming Barriers to AI SuccessHalf of management respondents said theyve deployed their first generative AI service or application, with an additional 28% planning to do so within the next six months. A 50% decline in the number of respondents reporting a lack of AI budget suggests increasing dedication to AI development and resource allocation.The challenges associated with early AI exploration are also diminishing. The survey revealed fewer companies reporting data issues and privacy concerns, as well as reduced concern over insufficient data for model training. These improvements reflect growing expertise and better data management practices within the industry.As financial services firms allocate budget and grow more savvy at data management, they can better position themselves to harness AI for enhanced operational efficiency, security and innovation across business functions.Generative AI Powers More Use CasesAfter data analytics, generative AI has emerged as the second-most-used AI workload in the financial services industry. The applications of the technology have expanded significantly, from enhancing customer experience to optimizing trading and portfolio management.Notably, the use of generative AI for customer experience, particularly via chatbots and virtual assistants, has more than doubled, rising from 25% to 60%. This surge is driven by the increasing availability, cost efficiency and scalability of generative AI technologies for powering more sophisticated and accurate digital assistants that can enhance customer interactions.More than half of the financial professionals surveyed are now using generative AI to enhance the speed and accuracy of critical tasks like document processing and report generation.Financial institutions are also poised to benefit from agentic AI systems that harness vast amounts of data from various sources and use sophisticated reasoning to autonomously solve complex, multistep problems. Banks and asset managers can use agentic AI systems to enhance risk management, automate compliance processes, optimize investment strategies and personalize customer services.Advanced AI Drives InnovationRecognizing the transformative potential of AI, companies are taking proactive steps to build AI factories specially built accelerated computing platforms equipped with full-stack AI software through cloud providers or on premises. This strategic focus on implementing high-value AI use cases is crucial to enhancing customer service, boosting revenue and reducing costs.By tapping into advanced infrastructure and software, companies can streamline the development and deployment of AI models and position themselves to harness the power of agentic AI.With industry leaders predicting at least 2x ROI on AI investments, financial institutions remain highly motivated to implement their highest-value AI use cases to drive efficiency and innovation.Download the full report to learn more about how financial services companies are using accelerated computing and AI to transform services and business operations.
    0 Comments ·0 Shares ·102 Views
  • How GeForce RTX 50 Series GPUs Are Built to Supercharge Generative AI on PCs
    blogs.nvidia.com
    NVIDIAs GeForce RTX 5090 and 5080 GPUs which are based on the groundbreaking NVIDIA Blackwell architecture offer up to 8x faster frame rates with NVIDIA DLSS 4 technology, lower latency with NVIDIA Reflex 2 and enhanced graphical fidelity with NVIDIA RTX neural shaders.These GPUs were built to accelerate the latest generative AI workloads, delivering up to 3,352 AI trillion operations per second (TOPS), enabling incredible experiences for AI enthusiasts, gamers, creators and developers.To help AI developers and enthusiasts harness these capabilities, NVIDIA at the CES trade show last month unveiled NVIDIA NIM and AI Blueprints for RTX. NVIDIA NIM microservices are prepackaged generative AI models that let developers and enthusiasts easily get started with generative AI, iterate quickly and harness the power of RTX for accelerating AI on Windows PCs. NVIDIA AI Blueprints are reference projects that show developers how to use NIM microservices to build the next generation of AI experiences.NIM and AI Blueprints are optimized for GeForce RTX 50 Series GPUs. These technologies work together seamlessly to help developers and enthusiasts build, iterate and deliver cutting-edge AI experiences on AI PCs.NVIDIA NIM Accelerates Generative AI on PCsWhile AI model development is rapidly advancing, bringing these innovations to PCs remains a challenge for many people. Models posted on platforms like Hugging Face must be curated, adapted and quantized to run on PC. They also need to be integrated into new AI application programming interfaces (APIs) to ensure compatibility with existing tools, and converted to optimized inference backends for peak performance.NVIDIA NIM microservices for RTX AI PCs and workstations can ease the complexity of this process by providing access to community-driven and NVIDIA-developed AI models. These microservices are easy to download and connect to via industry-standard APIs and span the key modalities essential for AI PCs. They are also compatible with a wide range of AI tools and offer flexible deployment options, whether on PCs, in data centers, or in the cloud.NIM microservices include everything needed to run optimized models on PCs with RTX GPUs, including prebuilt engines for specific GPUs, the NVIDIA TensorRT software development kit (SDK), the open-source NVIDIA TensorRT-LLM library for accelerated inference using Tensor Cores, and more.Microsoft and NVIDIA worked together to enable NIM microservices and AI Blueprints for RTX in Windows Subsystem for Linux (WSL2). With WSL2, the same AI containers that run on data center GPUs can now run efficiently on RTX PCs, making it easier for developers to build, test and deploy AI models across platforms.In addition, NIM and AI Blueprints harness key innovations of the Blackwell architecture that the GeForce RTX 50 series is built on, including fifth-generation Tensor Cores and support for FP4 precision.Tensor Cores Drive Next-Gen AI PerformanceAI calculations are incredibly demanding and require vast amounts of processing power. Whether generating images and videos or understanding language and making real-time decisions, AI models rely on hundreds of trillions of mathematical operations to be completed every second. To keep up, computers need specialized hardware built specifically for AI.NVIDIA GeForce RTX desktop GPUs deliver up to 3,352 AI TOPS for unmatched speed and efficiency in AI-powered workflows.In 2018, NVIDIA GeForce RTX GPUs changed the game by introducing Tensor Cores dedicated AI processors designed to handle these intensive workloads. Unlike traditional computing cores, Tensor Cores are built to accelerate AI by performing calculations faster and more efficiently. This breakthrough helped bring AI-powered gaming, creative tools and productivity applications into the mainstream.Blackwell architecture takes AI acceleration to the next level. The fifth-generation Tensor Cores in Blackwell GPUs deliver up to 3,352 AI TOPS to handle even more demanding AI tasks and simultaneously run multiple AI models. This means faster AI-driven experiences, from real-time rendering to intelligent assistants, that pave the way for greater innovation in gaming, content creation and beyond.FP4 Smaller Models, Bigger PerformanceAnother way to optimize AI performance is through quantization, a technique that reduces model sizes, enabling the models to run faster while reducing the memory requirements.Enter FP4 an advanced quantization format that allows AI models to run faster and leaner without compromising output quality. Compared with FP16, it reduces model size by up to 60% and more than doubles performance, with minimal degradation.For example, Black Forest Labs FLUX.1 [dev] model at FP16 requires over 23GB of VRAM, meaning it can only be supported by the GeForce RTX 4090 and professional GPUs. With FP4, FLUX.1 [dev] requires less than 10GB, so it can run locally on more GeForce RTX GPUs.On a GeForce RTX 4090 with FP16, the FLUX.1 [dev] model can generate images in 15 seconds with just 30 steps. With a GeForce RTX 5090 with FP4, images can be generated in just over five seconds.FP4 is natively supported by the Blackwell architecture, making it easier than ever to deploy high-performance AI on local PCs. Its also integrated into NIM microservices, effectively optimizing models that were previously difficult to quantize. By enabling more efficient AI processing, FP4 helps to bring faster, smarter AI experiences for content creation.AI Blueprints Power Advanced AI Workflows on RTX PCsNVIDIA AI Blueprints, built on NIM microservices, provide prepackaged, optimized reference implementations that make it easier to develop advanced AI-powered projects whether for digital humans, podcast generators or application assistants.At CES, NVIDIA demonstrated PDF to Podcast, a blueprint that allows users to convert a PDF into a fun podcast, and even create a Q&A with the AI podcast host afterwards.This workflow integrates seven different AI models, all working in sync to deliver a dynamic, interactive experience.The blueprint for PDF to podcast harnesses several AI models to seamlessly convert PDFs into engaging podcasts, complete with an interactive Q&A feature hosted by an AI-powered podcast host.With AI Blueprints, users can quickly go from experimenting with to developing AI on RTX PCs and workstations.NIM and AI Blueprints Coming Soon to RTX PCs and WorkstationsGenerative AI is pushing the boundaries of whats possible across gaming, content creation and more. With NIM microservices and AI Blueprints, the latest AI advancements are no longer limited to the cloud theyre now optimized for RTX PCs. With RTX GPUs, developers and enthusiasts can experiment, build and deploy AI locally, right from their PCs and workstations.NIM microservices and AI Blueprints are coming soon, with initial hardware support for GeForce RTX 50 Series, GeForce RTX 4090 and 4080, and NVIDIA RTX 6000 and 5000 professional GPUs. Additional GPUs will be supported in the future.
    0 Comments ·0 Shares ·98 Views
  • NVIDIA Blackwell Now Generally Available in the Cloud
    blogs.nvidia.com
    AI reasoning models and agents are set to transform industries, but delivering their full potential at scale requires massive compute and optimized software. The reasoning process involves multiple models, generating many additional tokens, and demands infrastructure with a combination of high-speed communication, memory and compute to ensure real-time, high-quality results.To meet this demand, CoreWeave has launched NVIDIA GB200 NVL72-based instances, becoming the first cloud service provider to make the NVIDIA Blackwell platform generally available.With rack-scale NVIDIA NVLink across 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs, scaling to up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networking, these instances provide the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.NVIDIA GB200 NVL72 on CoreWeaveNVIDIA GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain, which enables the six dozen GPUs to act as a single massive GPU.NVIDIA Blackwell features many technological breakthroughs that accelerate inference token generation, boosting performance while reducing service costs. For example, fifth-generation NVLink enables 130TB/s of GPU bandwidth in one 72-GPU NVLink domain, and the second-generation Transformer Engine enables FP4 for faster AI performance while maintaining high accuracy.CoreWeaves portfolio of managed cloud services is purpose-built for Blackwell. CoreWeave Kubernetes Service optimizes workload orchestration by exposing NVLink domain IDs, ensuring efficient scheduling within the same rack. Slurm on Kubernetes (SUNK) supports the topology block plug-in, enabling intelligent workload distribution across GB200 NVL72 racks. In addition, CoreWeaves Observability Platform provides real-time insights into NVLink performance, GPU utilization and temperatures.CoreWeaves GB200 NVL72 instances feature NVIDIA Quantum-2 InfiniBand networking that delivers 400Gb/s bandwidth per GPU for clusters up to 110,000 GPUs. NVIDIA BlueField-3 DPUs also provide accelerated multi-tenant cloud networking, high-performance data access and GPU compute elasticity for these instances.Full-Stack Accelerated Computing Platform for Enterprise AINVIDIAs full-stack AI platform pairs cutting-edge software with Blackwell-powered infrastructure to help enterprises build fast, accurate and scalable AI agents.NVIDIA Blueprints provides pre-defined, customizable, ready-to-deploy reference workflows to help developers create real-world applications. NVIDIA NIM is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI models for inference. NVIDIA NeMo includes tools for training, customization and continuous improvement of AI models for modern enterprise use cases. Enterprises can use NVIDIA Blueprints, NIM and NeMo to build and fine-tune models for their specialized AI agents.These software components, all part of the NVIDIA AI Enterprise software platform, are key enablers to delivering agentic AI at scale and can readily be deployed on CoreWeave.Bringing Next-Generation AI to the CloudThe general availability of NVIDIA GB200 NVL72-based instances on CoreWeave underscores the latest in the companies collaboration, focused on delivering the latest accelerated computing solutions to the cloud. With the launch of these instances, enterprises now have access to the scale and performance needed to power the next wave of AI reasoning models and agents.Customers can start provisioning GB200 NVL72-based instances through CoreWeave Kubernetes Service in the US-WEST-01 region using the gb200-4x instance ID. To get started, contact CoreWeave.
    0 Comments ·0 Shares ·103 Views
  • Accelerate DeepSeek Reasoning Models With NVIDIA GeForce RTX 50 Series AI PCs
    blogs.nvidia.com
    The recently released DeepSeek-R1 model family has brought a new wave of excitement to the AI community, allowing enthusiasts and developers to run state-of-the-art reasoning models with problem-solving, math and code capabilities, all from the privacy of local PCs.With up to 3,352 trillion operations per second of AI horsepower, NVIDIA GeForce RTX 50 Series GPUs can run the DeepSeek family of distilled models faster than anything on the PC market.A New Class of Models That ReasonReasoning models are a new class of large language models (LLMs) that spend more time on thinking and reflecting to work through complex problems, while describing the steps required to solve a task.The fundamental principle is that any problem can be solved with deep thought, reasoning and time, just like how humans tackle problems. By spending more time and thus compute on a problem, the LLM can yield better results. This phenomenon is known as test-time scaling, where a model dynamically allocates compute resources during inference to reason through problems.Reasoning models can enhance user experiences on PCs by deeply understanding a users needs, taking actions on their behalf and allowing them to provide feedback on the models thought process unlocking agentic workflows for solving complex, multi-step tasks such as analyzing market research, performing complicated math problems, debugging code and more.The DeepSeek DifferenceThe DeepSeek-R1 family of distilled models is based on a large 671-billion-parameter mixture-of-experts (MoE) model. MoE models consist of multiple smaller expert models for solving complex problems. DeepSeek models further divide the work and assign subtasks to smaller sets of experts.DeepSeek employed a technique called distillation to build a family of six smaller student models ranging from 1.5-70 billion parameters from the large DeepSeek 671-billion-parameter model. The reasoning capabilities of the larger DeepSeek-R1 671-billion-parameter model were taught to the smaller Llama and Qwen student models, resulting in powerful, smaller reasoning models that run locally on RTX AI PCs with fast performance.Peak Performance on RTXInference speed is critical for this new class of reasoning models. GeForce RTX 50 Series GPUs, built with dedicated fifth-generation Tensor Cores, are based on the same NVIDIA Blackwell GPU architecture that fuels world-leading AI innovation in the data center. RTX fully accelerates DeepSeek, offering maximum inference performance on PCs.Throughput performance of the Deepseek-R1 distilled family of models across GPUs on the PC.Experience DeepSeek on RTX in Popular ToolsNVIDIAs RTX AI platform offers the broadest selection of AI tools, software development kits and models, opening access to the capabilities of DeepSeek-R1 on over 100 million NVIDIA RTX AI PCs worldwide, including those powered by GeForce RTX 50 Series GPUs.High-performance RTX GPUs make AI capabilities always available even without an internet connection and offer low latency and increased privacy because users dont have to upload sensitive materials or expose their queries to an online service.Experience the power of DeepSeek-R1 and RTX AI PCs through a vast ecosystem of software, including Llama.cpp, Ollama, LM Studio, AnythingLLM, Jan.AI, GPT4All and OpenWebUI, for inference. Plus, use Unsloth to fine-tune the models with custom data.
    0 Comments ·0 Shares ·144 Views
  • DeepSeek-R1 Now Live With NVIDIA NIM
    blogs.nvidia.com
    DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, conducting chain-of-thought, consensus and search methods to generate the best answer.Performing this sequence of inference passes using reason to arrive at the best answer is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.As models are allowed to iteratively think through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments.R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.DeepSeek-R1 a Perfect Example of Test-Time ScalingDeepSeek-R1 is a large mixture-of-experts (MoE) model. It incorporates an impressive 671 billion parameters 10x more than many other popular open-source LLMs supporting a large input context length of 128,000 tokens. The model also uses an extreme number of experts per layer. Each layer of R1 has 256 experts, with each token routed to eight separate experts in parallel for evaluation.Delivering real-time answers for R1 requires many GPUs with high compute performance, connected with high-bandwidth and low-latency communication to route prompt tokens to all the experts for inference. Combined with the software optimizations available in the NVIDIA NIM microservice, a single server with eight H200 GPUs connected using NVLink and NVLink Switch can run the full, 671-billion-parameter DeepSeek-R1 model at up to 3,872 tokens per second. This throughput is made possible by using the NVIDIA Hopper architectures FP8 Transformer Engine at every layer and the 900 GB/s of NVLink bandwidth for MoE expert communication.Getting every floating point operation per second (FLOPS) of performance out of a GPU is critical for real-time inference. The next-generation NVIDIA Blackwell architecture will give test-time scaling on reasoning models like DeepSeek-R1 a giant boost with fifth-generation Tensor Cores that can deliver up to 20 petaflops of peak FP4 compute performance and a 72-GPU NVLink domain specifically optimized for inference.Get Started Now With the DeepSeek-R1 NIM MicroserviceDevelopers can experience the DeepSeek-R1 NIM microservice, now available on build.nvidia.com. Watch how it works:With NVIDIA NIM, enterprises can deploy DeepSeek-R1 with ease and ensure they get the high efficiency needed for agentic AI systems.See notice regarding software product information.
    0 Comments ·0 Shares ·135 Views
  • GeForce NOW Celebrates Five Years of Cloud Gaming With AAA Blockbusters
    blogs.nvidia.com
    GeForce NOW turns five this February. Five incredible years of high-performance gaming have been made possible thanks to the members whove joined the cloud gaming platform on its remarkable journey.Since exiting beta in 2020, GeForce NOW has changed how gamers access and enjoy their favorite titles. The cloud has come a long way, introducing groundbreaking new features and supporting over 2,000 games from celebrated publishers for members to play.Five years of cloud gaming excellence deserves a celebration. As part of an epic February lineup of 17 games coming this month, every week, GeForce NOW will deliver a major game release in the cloud. This includes the highly anticipated Kingdom Come: Deliverance II from Warhorse Studios, Avowed from Obsidian Entertainment and Sid Meiers Civilization VII from 2K Games. Make sure to stay tuned to GFN Thursdays to see what else is in store.This GFN Thursday, check out the nine titles available to stream this week, including standout title Pax Dei, a medieval massively multiplayer online (MMO) game from Mainframe Industries. Whether seeking mythical exploration or heart-pounding sci-fi combat thrills, GeForce NOW provides unforgettable experiences for every kind of gamer.Magical New GamesBuild a kingdom one medieval dream at a time.Pax Dei is a vast, social sandbox MMO where myths are real, ghosts wander and magic shapes a breathtaking medieval world. Choose a path and forge a legacy as a master builder, fearless explorer, skilled warrior or dedicated craftsman. Build thriving villages in the Heartlands, craft resources alongside Clans and venture into the dangerous Wilderness to battle dark forces, uncover ancient secrets and vie for power. The further one goes, the greater the challenges and the rewards. In Pax Dei, every action shapes the story in a dynamic, living world. The Steam version arrives this week in the cloud, with the Epic Games Store version coming soon.Look for the following games available to stream in the cloud this week:Space Engineers 2 (New release on Steam, Jan. 27)Eternal Strands (New release on Steam, Jan. 28)Orcs Must Die! Deathtrap (New release on Steam, Jan. 28)Sniper Elite: Resistance (New release on Steam and Xbox, available on PC Game Pass, Jan. 30)Heart of the Machine (New release on Steam, Jan. 31)Citizen Sleeper 2: Starward Vector (New release on Steam, Jan. 31)Dead Island 2 (Xbox, available on PC Game Pass)Pax Dei (Steam)Sifu (Steam)Heres what to expect for the rest of February:Kingdom Come Deliverance II (New release on Steam, Feb. 4)Ambulance Life: A Paramedic Simulator (New Release on Steam, Feb. 6)SWORN (New release on Steam, Feb. 6)Sid Meiers Civilization VII (New release on Steam and Epic Games Store, Feb. 11)Legacy: Steel & Sorcery (New release on Steam, Feb. 12)Tomb Raider IV-VI Remastered (New release on Steam, Feb. 14)Avowed (New release on Steam, Battle.net and Xbox, available on PC Game Pass, Feb. 18)Lost Records: Bloom & Rage (New release on Steam, Feb. 18)Abiotic Factor (Steam)Alan Wake (Xbox, available on the Microsoft Store)Ashes of the Singularity: Escalation (Xbox, available on the Microsoft Store)The Dark Crystal: Age of Resistance Tactics (Xbox, available on the Microsoft Store)HUMANITY (Steam)Murky Divers (Steam)Somerville (Xbox, available on the Microsoft Store)Songs of Silence (Steam)UNDER NIGHT IN-BIRTH II Sys:Celes (Steam)Joyful JanuaryIn addition to the 14 games announced last month, 15 more joined the GeForce NOW library:Road 96 (New release on Xbox, available on PC Game Pass, Jan. 7)Aloft (New release on Steam, Jan. 15)Assetto Corsa EVO (New release on Steam, Jan. 16)Among Us (Xbox, available on PC Game Pass)Amnesia: Collection (Xbox, available on the Microsoft Store)DREDGE (Epic Games Store)Generation Zero (Xbox, available on PC Game Pass)HOT WHEELS UNLEASHED 2 Turbocharged (Xbox, available on PC Game Pass)Kingdom Come: Deliverance (Xbox, available on the Microsoft Store)Lawn Mowing Simulator (Xbox, available on the Microsoft Store)Marvel Rivals (Steam)Sins of a Solar Empire: Rebellion (Xbox, available on the Microsoft Store)SMITE 2 (Steam)STORY OF SEASONS: Friends of Mineral Town (Xbox, available on the Microsoft Store)Townscaper (Xbox, available on the Microsoft Store)What are you planning to play this weekend? Let us know on X or in the comments below.morning! predict your most played game of 2025 below NVIDIA GeForce NOW (@NVIDIAGFN) January 27, 2025
    0 Comments ·0 Shares ·123 Views
  • Lights, Camera, Action: New NVIDIA Broadcast AI Features Now Streaming With GeForce RTX 50 Series GPUs
    blogs.nvidia.com
    New GeForce RTX 5090 and RTX 5080 GPUs built on the NVIDIA Blackwell architecture are now available to power generative AI content creation and accelerate creative performance.GeForce RTX 5090 and RTX 5080 GPUs feature fifth-generation Tensor Cores with support for FP4, reducing the VRAM requirements to run generative AI models while doubling performance. For example, Black Forest Labs FLUX models available on Hugging Face this week at FP4 precision require less than 10GB of VRAM, compared with over 23GB at FP16. With a GeForce RTX 5090 GPU, the FLUX.1 [dev] model can generate images in just over five seconds, compared with 15 seconds on FP16 or 10 seconds on FP8 on a GeForce RTX 4090 GPU.GeForce RTX 50 Series GPUs also come equipped with ninth-generation encoders and sixth-generation decoders that add support for 4:2:2 and increase encoding quality for HEVC and AV1. Fourth-generation RT Cores paired with DLSS 4 provide creators with super-smooth 3D rendering viewports.The GeForce RTX 5090 is a content creation powerhouse. PC WorldThe GeForce RTX 5090 GPU includes 32GB of ultra-fast GDDR7 memory and 1,792 GB/sec of total memory bandwidth a 77% bandwidth increase over the GeForce RTX 4090 GPU. It also includes three encoders and two decoders, reducing export times by a third compared with the prior generation.The GeForce RTX 5080 GPU features 16GB of GDDR7 memory, providing up to 960 GB/sec of total memory bandwidth a 34% increase over the GeForce RTX 4080 GPU. And it includes two encoders and two decoders to boost video editing workloads.The NVIDIA GeForce RTX 5080 FE is notable on its own as a viable powerhouse option for any creative pro Creative BloqThe latest version of the NVIDIA Broadcast app is now available, adding two new beta AI effects Studio Voice and Virtual Key Light and improvements to existing ones, along with an updated user interface for better usability.In addition, the January NVIDIA Studio Driver with support for the GeForce RTX 5090 and 5080 GPUs is ready for installation today. For automatic Studio Driver notifications, download the NVIDIA app, including an update for RTX Video Super Resolution expanding the lineup of GeForce RTX GPUs that can run RTX Video Super Resolution for higher-quality video.Use the GeForce RTX graphics card product finder to pick up GeForce RTX 5090 and RTX 5080 GPUs or a prebuilt system today.Lights, Camera, BroadcastThe latest NVIDIA Broadcast app release features two new AI effects Studio Voice and Virtual Key Light both currently in beta.Studio Voice enhances a users microphone to match that of a high-quality microphone. Virtual Key Light relights subjects to deliver even lighting, as if a physical key light was defining the form and dimension of an individual. The new effects require a GeForce RTX 4080 or 5080 GPU or higher, and are designed for chatting streams and podcasts these are not recommended for gaming.The app update also improves voice quality with the Background Noise Removal feature, adds gaze stability and subtle random eye movements for a more natural appearance with Eye Contact, and improves foreground and background separation with Virtual Background.The updated NVIDIA Broadcast app interface.Theres also an updated user interface that allows users to apply more effects simultaneously and includes a side-by-side camera preview option, a GPU utilization meter and more.Developers can integrate these effects directly into applications with NVIDIA Maxine Windows software development kits (SDKs) or by accessing them as an NVIDIA NIM microservice.The updated NVIDIA Broadcast app is available for download today.Accelerating Creative WorkflowsFor video editors, all GeForce RTX 50 Series GPUs include 4:2:2 hardware support and can decode a single video source at up to 8K at 75 frames per second (fps) or nine video sources at 4K at 30 fps per decoder, enabling smooth multi-camera video editing.The GeForce RTX 5090 is currently unmatched in the consumer GPU market nothing can touch it in terms of performance, with virtually any workload AI, content creation, gaming, you name it. Hot HardwareThe GeForce RTX 5090 is equipped with three encoders and two decoders. These multi-encoder and -decoder setups enable the GeForce RTX 5090 GPU to export video 40% faster than the GeForce RTX 4090 GPU and at 4x speed compared with the GeForce RTX 3090 GPU.GeForce RTX 50 Series GPUs also feature the ninth-generation NVIDIA Encoder (NVENC) with a 5% improvement in video quality on HEVC and AV1 encoding. The new AV1 Ultra Quality mode achieves 5% more compression at the same quality versus the previous generation, and the sixth-generation NVIDIA decoder achieves 2x decode speeds for H.264 over the prior version. The AV1 Ultra Quality mode will also be available to GeForce RTX 40 Series users.Video editing applications Blackmagic Designs DaVinci Resolve and Wondershare Filmora have integrated these technologies.Livestreamers also benefit from the ninth-generation NVENC with a 5% video quality improvement for HEVC and AV1 meaning that video quality looks like it used 5% more bitrate in Twitch with the Twitch Enhanced Broadcasting beta, YouTube or Discord. This improvement is measured using BD-BR PSNR, the standard for measuring video quality by comparing what bitrate matches the same video quality between two encoders.3D artists benefit from the 32GB of memory in GeForce RTX 5090 GPUs, allowing them to work on massive 3D projects and across multiple platforms simultaneously with smooth viewport movement. GeForce RTX 50 Series GPUs with fourth-generation RT Cores run 3D applications 40% faster.DLSS 4 is now available in D5 Render and is coming in February to Chaos Vantage, two popular professional-grade 3D apps for architects, animators and designers. D5 Render will support DLSS 4s new Multi Frame Generation feature to boost frame rates by using AI to generate up to three frames per rendered frame. This enables animators to smoothly navigate a scene with 4x as many frames, or render 3D content at 60 fps or more.Developers can learn more about integrating these new tools into their apps via SDKs.Stay tuned for more updates on the GeForce RTX 50 Series, app performance and compatibility, and emerging AI technologies.Every month brings new creative app updates and optimizations powered by the NVIDIA Studio. Follow NVIDIA Studio on Instagram, X and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.See notice regarding software product information.
    0 Comments ·0 Shares ·120 Views
  • Leveling Up User Experiences With Agentic AI, From Bots to Autonomous Agents
    blogs.nvidia.com
    AI agents with advanced perception and cognition capabilities are making digital experiences more dynamic and personalized across retail, finance, entertainment and other industries.In this episode of the NVIDIA AI Podcast, Chris Covert, director of product experiences at Inworld AI, highlights how intelligent digital humans and characters are reshaping interactive experiences, from gaming to healthcare.With expertise on the intersection of autonomous systems and human-centered design, Covert explains the different stages of AI agents from basic conversational interfaces to fully autonomous systems. He emphasizes that the key to developing meaningful AI experiences is focusing on user value rather than technology alone.The AI Podcast AI Agents Take Digital Experiences to the Next Level in Gaming and Beyond, Featuring Chris Covert from Inworld AI Episode 243In addition, Covert discusses how livestreaming and recording software company Streamlabs announced a collaboration with Inworld and NVIDIA at this years CES trade show, unveiling an AI-powered streaming assistant that can provide real-time commentary, clip gameplay moments and interact dynamically with streamers thanks to NVIDIA ACE integrations.Learn more about the latest advancements in agentic AI and other technologies by registering for NVIDIA GTC, the conference for the era of AI, taking place March 17-21 at the San Jose Convention Center.Time Stamps5:34 The definition of digital humans and their current state in industries.10:30 The evolution of AI agents.18:10 The design philosophy behind building digital humans and why teams should start with a moonshot approach.You Might Also LikeHow World Foundation Models Will Advance Physical AIWorld foundation models are powerful neural networks that can simulate and predict outcomes in physical environments, enabling teams to enhance AI workflows and development. Ming-Yu Liu, vice president of research at NVIDIA and an IEEE Fellow, joined the NVIDIA AI Podcast to discuss how world foundation models will impact various industries.How Roblox Uses Generative AI to Enhance User ExperiencesRoblox is a colorful online platform that aims to reimagine the way that people come together. Now, generative AI is augmenting that vision. Anupam Singh, vice president of AI and growth engineering at Roblox, explains how the company uses the technology to enhance virtual experiences, power coding assistants to help creators, and increase inclusivity and user safety.Exploring AI-Powered Filmmaking With Cuebrics Pinar Seyhan DemirdagCuebric is on a mission to offer new solutions in filmmaking and content creation through immersive, two-and-a-half-dimensional cinematic environments. The companys AI-powered application aims to help creators quickly bring their ideas to life, making high-quality production more accessible. Pinar Seyhan Demirdag, cofounder and CEO of Cuebric, talks about the current landscape of content creation and the role of AI in simplifying the creative process.
    0 Comments ·0 Shares ·134 Views
  • Amphitrite Rides AI Wave to Boost Maritime Shipping, Ocean Cleanup With Real-Time Weather Prediction and Simulation
    blogs.nvidia.com
    Named after Greek mythologys goddess of the sea, France-based startup Amphitrite is fusing satellite data and AI to simulate and predict oceanic currents and weather.Its work thats making waves in maritime-shipping and oceanic litter-collection operations.Amphitrites AI models powered by the NVIDIA AI and Earth-2 platforms provide insights on positioning vessels to best harness the power of ocean currents, helping ships know when best to travel, as well as the optimal course. This helps users reduce travel times, fuel consumption and, ultimately, carbon emissions.Were at a turning point on the modernization of oceanic atmospheric forecasting, said Alexandre Stegner, cofounder and CEO of Amphitrite. Theres a wide portfolio of applications that can use these domain-specific oceanographic AI models first and foremost, were using them to help foster the energy transition and alleviate environmental issues.https://blogs.nvidia.com/wp-content/uploads/2025/01/amphitrite-suez-canal.mp4Optimizing Routes Based on Currents and WeatherFounded by expert oceanographers, Amphitrite a member of the NVIDIA Inception program for cutting-edge startups distinguishes itself from other weather modeling companies with its domain-specific expertise.Amphitrites fine-tuned, three-kilometer-scale AI models focus on analyzing one parameter at a time, making them more accurate than global numerical modeling methods for the variable of interest. Read more in this paper showcasing the AI method, dubbed ORCAst, trained on NVIDIA GPUs.Depending on the users needs, such variables include the current of the ocean within the first 10 meters of the surface critical in helping ships optimize their travel and minimize fuel consumption as well as the impacts of extreme waves and wind.Its only with NVIDIA accelerated computing that we can achieve optimal performance and parallelization when analyzing data on the whole ocean, said Evangelos Moschos, cofounder and chief technology officer of Amphitrite.Using the latest NVIDIA AI technologies to predict ocean currents and weather in detail, ships can ride or avoid waves, optimize routes and enhance safety while saving energy and fuel.The amount of public satellite data thats available is still much larger than the number of ways people are using this information, Moschos said. Fusing AI and satellite imagery, Amphitrite can improve the accuracy of global ocean current analyses by up to 2x compared with traditional methods.Fine-Tuned to Handle Oceans of DataThe startups AI models, tuned to handle seas of data on the ocean, are based on public data from NASA and the European Space Agency including its Sentinel-3 satellite.Plus, Amphitrite offers the worlds first forecast model incorporating data from the Surface Water and Ocean Topography (SWOT) mission a satellite jointly developed and operated by NASA and French space agency CNES, in collaboration with the Canadian Space Agency and UK Space Agency.SWOT provides an unprecedented resolution of the ocean surface, Moschos said.While weather forecasting technologies have traditionally relied on numerical modeling and computational fluid dynamics, these approaches are harder to apply to the ocean, Moschos explained. This is because oceanic currents often deal with nonlinear physics. Theres also simply less observational data available on the ocean than on atmospheric weather.Computer vision and AI, working with real-time satellite data, offer higher reliability for oceanic current and weather modeling than traditional methods.Amphitrite trains and runs its AI models using NVIDIA H100 GPUs on premises and in the cloud and is building on the FourCastNet model, part of Earth-2, to develop its computer vision models for wave prediction.According to a case study along the Mediterranean Sea, the NVIDIA-powered Amphitrite fine-scale routing solution helped reduce one shipping lines carbon emissions by 10%.Through NVIDIA Inception, Amphitrite gained technical support when building its on-premises infrastructure, free cloud credits for NVIDIA GPU instances on Amazon Web Services, as well as opportunities to collaborate with NVIDIA experts on using the latest simulation technologies, like Earth-2 and FourCastNet.Customers Set Sail With Amphitrites ModelsEnterprises and organizations across the globe are using Amphitrites AI models to optimize their operations and make them more sustainable.CMA-CGM, Genavir, Louis Dreyfus Armateurs and Orange Marine are among the shipping and oceanographic companies analyzing currents using the startups solutions.In addition, Amphitrite is working with a nongovernmental organization to help track and remove pollution in the Pacific Ocean. The initiative uses Amphitrites models to analyze currents and follow plastics that drift from a garbage patch off the coast of California.Moschos noted that another way the startup sets itself apart is by having an AI team led by computer vision scientist Hannah Bull that comprises majority women, some of whom are featured in the image above.This is still rare in the industry, but its something were really proud of on the technical front, especially since we founded the company in honor of Amphitrite, a powerful but often overlooked female figure in history, Moschos said.Learn more about NVIDIA Earth-2.
    0 Comments ·0 Shares ·118 Views
  • AI Maps Titans Methane Clouds in Record Time
    blogs.nvidia.com
    Methane clouds on Titan, Saturns largest moon, are more than just a celestial oddity theyre a window into one of the solar systems most complex climates.Until now, mapping them has been slow and grueling work. Enter AI: a team from NASA, UC Berkeley and Frances Observatoire des Sciences de lUnivers just changed the game.Using NVIDIA GPUs, the researchers trained a deep learning model to analyze years of Cassini data in seconds. Their approach could reshape planetary science, turning what took days into moments.We were able to use AI to greatly speed up the work of scientists, increasing productivity and enabling questions to be answered that would otherwise be impractical, said Zach Yahn, Georgia Tech PhD student and lead author of the study.Read the full paper, Rapid Automated Mapping of Clouds on Titan With Instance Segmentation.How It WorksAt the projects core is Mask R-CNN a deep learning model that doesnt just detect objects. It outlines them pixel by pixel. Trained on hand-labeled images of Titan, it mapped the moons elusive clouds: patchy, streaky and barely visible through a smoggy atmosphere.The team used transfer learning, starting with a model trained on COCO (a dataset of everyday images), and fine-tuned it for Titans unique challenges. This saved time and demonstrated how planetary scientists, who may not always have access to the vast computing resources necessary to train large models from scratch, can still use technologies like transfer learning to apply AI to their data and projects, Yahn explained.The models potential goes far beyond Titan. Many other Solar System worlds have cloud formations of interest to planetary science researchers, including Mars and Venus. Similar technology might also be applied to volcanic flows on Io, plumes on Enceladus, linea on Europa and craters on solid planets and moons, he added.Fast Science, Powered by NVIDIANVIDIA GPUs made this speed possible, processing high-resolution images and generating cloud masks with minimal latency work that traditional hardware would struggle to handle.NVIDIA GPUs have become a mainstay for space scientists. Theyve helped analyze Webb Telescope data, model Mars landings and scan for extraterrestrial signals. Now, theyre helping researchers decode Titan.Whats NextThis AI leap is just the start. Missions like NASAs Europa Clipper and Dragonfly will flood researchers with data. AI can help handle it, processing it onboard, mid-mission, and even prioritizing findings in real time. Challenges remain, like creating hardware fit for spaces harsh conditions, but the potential is undeniable.Methane clouds on Titan hold mysteries. Researchers are now unraveling them faster than ever with help from new AI tools accelerated by NVIDIA GPUs.Read the full paper, Rapid Automated Mapping of Clouds on Titan With Instance Segmentation.Image Credit: NASA Jet Propulsion Laboratory
    0 Comments ·0 Shares ·114 Views
  • Fast, Low-Cost Inference Offers Key to Profitable AI
    blogs.nvidia.com
    Businesses across every industry are rolling out AI services this year. For Microsoft, Oracle, Perplexity, Snap and hundreds of other leading companies, using the NVIDIA AI inference platform a full stack comprising world-class silicon, systems and software is the key to delivering high-throughput and low-latency inference and enabling great user experiences while lowering cost.NVIDIAs advancements in inference software optimization and the NVIDIA Hopper platform are helping industries serve the latest generative AI models, delivering excellent user experiences while optimizing total cost of ownership. The Hopper platform also helps deliver up to 15x more energy efficiency for inference workloads compared to previous generations.AI inference is notoriously difficult, as it requires many steps to strike the right balance between throughput and user experience.But the underlying goal is simple: generate more tokens at a lower cost. Tokens represent words in a large language model (LLM) system and with AI inference services typically charging for every million tokens generated, this goal offers the most visible return on AI investments and energy used per task.Full-stack software optimization offers the key to improving AI inference performance and achieving this goal.Cost-Effective User ThroughputBusinesses are often challenged with balancing the performance and costs of inference workloads. While some customers or use cases may work with an out-of-the-box or hosted model, others may require customization. NVIDIA technologies simplify model deployment while optimizing cost and performance for AI inference workloads. In addition, customers can experience flexibility and customizability with the models they choose to deploy.NVIDIA NIM microservices, NVIDIA Triton Inference Server and the NVIDIA TensorRT library are among the inference solutions NVIDIA offers to suit users needs:NVIDIA NIM inference microservices are prepackaged and performance-optimized for rapidly deploying AI foundation models on any infrastructure cloud, data centers, edge or workstations.NVIDIA Triton Inference Server, one of the companys most popular open-source projects, allows users to package and serve any model regardless of the AI framework it was trained on.NVIDIA TensorRT is a high-performance deep learning inference library that includes runtime and model optimizations to deliver low-latency and high-throughput inference for production applications.Available in all major cloud marketplaces, the NVIDIA AI Enterprise software platform includes all these solutions and provides enterprise-grade support, stability, manageability and security.With the framework-agnostic NVIDIA AI inference platform, companies save on productivity, development, and infrastructure and setup costs. Using NVIDIA technologies can also boost business revenue by helping companies avoid downtime and fraudulent transactions, increase e-commerce shopping conversion rates and generate new, AI-powered revenue streams.Cloud-Based LLM InferenceTo ease LLM deployment, NVIDIA has collaborated closely with every major cloud service provider to ensure that the NVIDIA inference platform can be seamlessly deployed in the cloud with minimal or no code required. NVIDIA NIM is integrated with cloud-native services such as:Amazon SageMaker AI, Amazon Bedrock Marketplace, Amazon Elastic Kubernetes ServiceGoogle Clouds Vertex AI, Google Kubernetes EngineMicrosoft Azure AI Foundry coming soon, Azure Kubernetes ServiceOracle Cloud Infrastructures data science tools, Oracle Cloud Infrastructure Kubernetes EnginePlus, for customized inference deployments, NVIDIA Triton Inference Server is deeply integrated into all major cloud service providers.For example, using the OCI Data Science platform, deploying NVIDIA Triton is as simple as turning on a switch in the command line arguments during model deployment, which instantly launches an NVIDIA Triton inference endpoint.Similarly, with Azure Machine Learning, users can deploy NVIDIA Triton either with no-code deployment through the Azure Machine Learning Studio or full-code deployment with Azure Machine Learning CLI. AWS provides one-click deployment for NVIDIA NIM from SageMaker Marketplace and Google Cloud provides a one-click deployment option on Google Kubernetes Engine (GKE). Google Cloud provides a one-click deployment option on Google Kubernetes Engine, while AWS offers NVIDIA Triton on its AWS Deep Learning containers.The NVIDIA AI inference platform also uses popular communication methods for delivering AI predictions, automatically adjusting to accommodate the growing and changing needs of users within a cloud-based infrastructure.From accelerating LLMs to enhancing creative workflows and transforming agreement management, NVIDIAs AI inference platform is driving real-world impact across industries. Learn how collaboration and innovation are enabling the organizations below to achieve new levels of efficiency and scalability.Serving 400 Million Search Queries Monthly With Perplexity AIPerplexity AI, an AI-powered search engine, handles over 435 million monthly queries. Each query represents multiple AI inference requests. To meet this demand, the Perplexity AI team turned to NVIDIA H100 GPUs, Triton Inference Server and TensorRT-LLM.Supporting over 20 AI models, including Llama 3 variations like 8B and 70B, Perplexity processes diverse tasks such as search, summarization and question-answering. By using smaller classifier models to route tasks to GPU pods, managed by NVIDIA Triton, the company delivers cost-efficient, responsive service under strict service level agreements.Through model parallelism, which splits LLMs across GPUs, Perplexity achieved a threefold cost reduction while maintaining low latency and high accuracy. This best-practice framework demonstrates how IT teams can meet growing AI demands, optimize total cost of ownership and scale seamlessly with NVIDIA accelerated computing.Reducing Response Times With Recurrent Drafter (ReDrafter)Open-source research advancements are helping to democratize AI inference. Recently, NVIDIA incorporated Redrafter, an open-source approach to speculative decoding published by Apple, into NVIDIA TensorRT-LLM.ReDrafter uses smaller draft modules to predict tokens in parallel, which are then validated by the main model. This technique significantly reduces response times for LLMs, particularly during periods of low traffic.Transforming Agreement Management With DocusignDocusign, a leader in digital agreement management, turned to NVIDIA to supercharge its Intelligent Agreement Management platform. With over 1.5 million customers globally, Docusign needed to optimize throughput and manage infrastructure expenses while delivering AI-driven insights.NVIDIA Triton provided a unified inference platform for all frameworks, accelerating time to market and boosting productivity by transforming agreement data into actionable insights. Docusigns adoption of the NVIDIA inference platform underscores the positive impact of scalable AI infrastructure on customer experiences and operational efficiency.NVIDIA Triton makes our lives easier, said Alex Zakhvatov, senior product manager at Docusign. We no longer need to deploy bespoke, framework-specific inference servers for our AI models. We leverage Triton as a unified inference server for all AI frameworks and also use it to identify the right production scenario to optimize cost- and performance-saving engineering efforts.Enhancing Customer Care in Telco With AmdocsAmdocs, a leading provider of software and services for communications and media providers, built amAIz, a domain-specific generative AI platform for telcos as an open, secure, cost-effective and LLM-agnostic framework. Amdocs is using NVIDIA DGX Cloud and NVIDIA AI Enterprise software to provide solutions based on commercially available LLMs as well as domain-adapted models, enabling service providers to build and deploy enterprise-grade generative AI applications.Using NVIDIA NIM, Amdocs reduced the number of tokens consumed for deployed use cases by up to 60% in data preprocessing and 40% in inferencing, offering the same level of accuracy with a significantly lower cost per token, depending on various factors and volumes used. The collaboration also reduced query latency by approximately 80%, ensuring that end users experience near real-time responses. This acceleration enhances user experiences across commerce, customer service, operations and beyond.Revolutionizing Retail With AI on SnapShopping for the perfect outfit has never been easier, thanks to Snaps Screenshop feature. Integrated into Snapchat, this AI-powered tool helps users find fashion items seen in photos. NVIDIA Triton played a pivotal role in enabling Screenshops pipeline, which processes images using multiple frameworks, including TensorFlow and PyTorch.Snaps Screenshop AI workflow.By consolidating its pipeline onto a single inference serving platform, Snap significantly reduced development time and costs while ensuring seamless deployment of updated models. The result is a frictionless user experience powered by AI.We didnt want to deploy bespoke inference serving platforms for our Screenshop pipeline, a TF-serving platform for TensorFlow and a TorchServe platform for PyTorch, explained Ke Ma, a machine learning engineer at Snap. Tritons framework-agnostic design and support for multiple backends like TensorFlow, PyTorch and ONNX was very compelling. It allowed us to serve our end-to-end pipeline using a single inference serving platform, which reduces our inference serving costs and the number of developer days needed to update our models in production.Following the successful launch of the Screenshop service on NVIDIA Triton, Ma and his team turned to NVIDIA TensorRT to further enhance their systems performance. By applying the default NVIDIA TensorRT settings during the compilation process, the Screenshop team immediately saw a 3x surge in throughput, estimated to deliver a staggering 66% cost reduction.Financial Freedom Powered by AI With WealthsimpleWealthsimple, a Canadian investment platform managing over C$30 billion in assets, redefined its approach to machine learning with NVIDIAs AI inference platform. By standardizing its infrastructure, Wealthsimple slashed model delivery time from months to under 15 minutes, eliminating downtime and empowering teams to deliver machine learning as a service.By adopting NVIDIA Triton and running its models through AWS, Wealthsimple achieved 99.999% uptime, ensuring seamless predictions for over 145 million transactions annually. This transformation highlights how robust AI infrastructure can revolutionize financial services.NVIDIAs AI inference platform has been the linchpin in our organizations ML success story, revolutionizing our model deployment, reducing downtime and enabling us to deliver unparalleled service to our clients, said Mandy Gu, senior software development manager at Wealthsimple.Elevating Creative Workflows With Lets EnhanceAI-powered image generation has transformed creative workflows and can be applied to enterprise use cases such as creating personalized content and imaginative backgrounds for marketing visuals. While diffusion models are powerful tools for enhancing creative workflows, the models can be computationally expensive.To optimize its workflows using the Stable Diffusion XL model in production, Lets Enhance, a pioneering AI startup, chose the NVIDIA AI inference platform.Product images with backgrounds created using Lets Enhance platform powered by SDXL.Lets Enhances latest product, AI Photoshoot, uses the SDXL model to transform plain product photos into beautiful visual assets for e-commerce websites and marketing campaigns.With NVIDIA Tritons robust support for various frameworks and backends, coupled with its dynamic batching feature set, Lets Enhance was able to seamlessly integrate the SDXL model into existing AI pipelines with minimal involvement from engineering teams, freeing up their time for research and development efforts.Accelerating Cloud-Based Vision AI With OCIOracle Cloud Infrastructure (OCI) integrated NVIDIA Triton to power its Vision AI service, enhancing prediction throughput by up to 76% and reducing latency by 51%. These optimizations improved customer experiences with applications including automating toll billing for transit agencies and streamlining invoice recognition for global businesses.With Tritons hardware-agnostic capabilities, OCI has expanded its AI services portfolio, offering robust and efficient solutions across its global data centers.Our AI platform is Triton-aware for the benefit of our customers, said Tzvi Keisar, a director of product management for OCIs data science service, which handles machine learning for Oracles internal and external users.Real-Time Contextualized Intelligence and Search Efficiency With MicrosoftAzure offers one of the widest and broadest selections of virtual machines powered and optimized by NVIDIA AI. These virtual machines encompass multiple generations of NVIDIA GPUs, including NVIDIA Blackwell and NVIDIA Hopper systems.Building on this rich history of engineering collaboration, NVIDIA GPUs and NVIDIA Triton now help accelerate AI inference in Copilot for Microsoft 365. Available as a dedicated physical keyboard key on Windows PCs, Microsoft 365 Copilot combines the power of LLMs with proprietary enterprise data to deliver real-time contextualized intelligence, enabling users to enhance their creativity, productivity and skills.Microsoft Bing also used NVIDIA inference solutions to address challenges including latency, cost and speed. By integrating NVIDIA TensorRT-LLM techniques, Microsoft significantly improved inference performance for its Deep Search feature, which powers optimized web results.Deep search walkthrough courtesy of MicrosoftMicrosoft Bing Visual Search enables people around the world to find content using photographs as queries. The heart of this capability is Microsofts TuringMM visual embedding model that maps images and text into a shared high-dimensional space. Because it operates on billions of images across the web, performance is critical.Microsoft Bing optimized the TuringMM pipeline using NVIDIA TensorRT and NVIDIA acceleration libraries including CV-CUDA and nvImageCodec. These efforts resulted in a 5.13x speedup and significant TCO reduction.Unlocking the Full Potential of AI Inference With Hardware InnovationImproving the efficiency of AI inference workloads is a multifaceted challenge that demands innovative technologies across hardware and software.NVIDIA GPUs are at the forefront of AI enablement, offering high efficiency and performance for AI models. Theyre also the most energy efficient: NVIDIA accelerated computing on the NVIDIA Blackwell architecture has cut the energy used per token generation by 100,000x in the past decade for inference of trillion-parameter AI models.The NVIDIA Grace Hopper Superchip, which combines NVIDIA Grace CPU and Hopper GPU architectures using NVIDIA NVLink-C2C, delivers substantial inference performance improvements across industries.Meta Andromeda is using the superchip for efficient and high-performing personalized ads retrieval. By creating deep neural networks with increased compute complexity and parallelism, on Facebook and Instagram it has achieved an 8% ad quality improvement on select segments and a 6% recall improvement.With optimized retrieval models and low-latency, high-throughput and memory-IO aware GPU operators, Andromeda offers a 100x improvement in feature extraction speed compared to previous CPU-based components. This integration of AI at the retrieval stage has allowed Meta to lead the industry in ads retrieval, addressing challenges like scalability and latency for a better user experience and higher return on ad spend.As cutting-edge AI models continue to grow in size, the amount of compute required to generate each token also grows. To run state-of-the-art LLMs in real time, enterprises need multiple GPUs working in concert. Tools like the NVIDIA Collective Communication Library, or NCCL, enable multi-GPU systems to quickly exchange large amounts of data between GPUs with minimal communication time.Future AI Inference InnovationsThe future of AI inference promises significant advances in both performance and cost.The combination of NVIDIA software, novel techniques and advanced hardware will enable data centers to handle increasingly complex and diverse workloads. AI inference will continue to drive advancements in industries such as healthcare and finance by enabling more accurate predictions, faster decision-making and better user experiences.Learn more about how NVIDIA is delivering breakthrough inference performance results and stay up to date with the latest AI inference performance updates.
    0 Comments ·0 Shares ·134 Views
  • Baldurs Gate 3 Mod Support Launches in the Cloud
    blogs.nvidia.com
    GeForce NOW is expanding mod support for hit game Baldurs Gate 3 in collaboration with Larian Studios and mod.io for Ultimate and Performance members.This expanded mod support arrives alongside seven new games joining the cloud this week.Level Up GamingTime to roll for initiative adventurers in the Forgotten Realms can now enjoy a range of curated mods uploaded to mod.io for Baldurs Gate 3. Ultimate and Performance members can enhance their Baldurs Gate 3 journeys across realms and devices with a wide array of customization options. Stay tuned to GFN Thursday for more information on expanding mod support for more of the games PC mods at a later time.Downloading mods is easy choose the desired mods from the Baldurs Gate 3 in-game mod menu, and theyll stay enabled across sessions. Or subscribe to the mods via mod.io to load them automatically when launching the game from GeForce NOW. Read more details in the knowledge article.Learn more about how curated mods are made available for Baldurs Gate 3 players and read the curation guidelines.GeForce NOW members can bring their unique adventures across devices, including NVIDIA SHIELD TVs, underpowered laptops, Macs, Chromebooks and handheld devices like the Steam Deck. Whether battling mind flayers in the living room or crafting spells on the go, GeForce NOW delivers experiences that are seamless, immersive and portable as a Bag of Holding.NOW PlayingRing around the rosie.Jtunnslayer: Hordes of Hel is a gripping roguelike horde-survivor game set in the dark realms of Norse Mythology. Fight waves of enemies to earn divine blessings of ancient Viking Deities, explore hostile worlds and face powerful bosses. Become a god-like warrior in this ultimate showdown.Jtunnslayer: Hordes of Hel (New release on Jan 21, Steam)Among Us (Xbox, available on PC Game Pass)Amnesia: Collection (Xbox, available on the Microsoft Store)Lawn Mowing Simulator (Xbox, available on the Microsoft Store)Sins of a Solar Empire: Rebellion (Xbox, available on the Microsoft Store)STORY OF SEASONS: Friends of Mineral Town (Xbox, available on the Microsoft Store)Townscaper (Xbox, available on the Microsoft Store)What are you planning to play this weekend? Let us know on X or in the comments below.what's the last game you ever beat? NVIDIA GeForce NOW (@NVIDIAGFN) January 22, 2025
    0 Comments ·0 Shares ·162 Views
  • How AI Helps Fight Fraud in Financial Services, Healthcare, Government and More
    blogs.nvidia.com
    Companies and organizations are increasingly using AI to protect their customers and thwart the efforts of fraudsters around the world.Voice security company Hiya found that 550 million scam calls were placed per week in 2023, with INTERPOL estimating that scammers stole $1 trillion from victims that same year. In the U.S., one of four noncontact-list calls were flagged as suspected spam, with fraudsters often luring people into Venmo-related or extended warranty scams.Traditional methods of fraud detection include rules-based systems, statistical modeling and manual reviews. These methods have struggled to scale to the growing volume of fraud in the digital era without sacrificing speed and accuracy. For instance, rules-based systems often have high false-positive rates, statistical modeling can be time-consuming and resource-intensive, and manual reviews cant scale rapidly enough.In addition, traditional data science workflows lack the infrastructure required to analyze the volumes of data involved in fraud detection, leading to slower processing times and limiting real-time analysis and detection.Plus, fraudsters themselves can use large language models (LLMs) and other AI tools to trick victims into investing in scams, giving up their bank credentials or buying cryptocurrency.But AI coupled with accelerated computing systems can be used to check AI and help mitigate all of these issues.Businesses that integrate robust AI fraud detection tools have seen up to a 40% improvement in fraud detection accuracy helping reduce financial and reputational damage to institutions.These technologies offer robust infrastructure and solutions for analyzing vast amounts of transactional data and can quickly and efficiently recognize fraud patterns and identify abnormal behaviors.AI-powered fraud detection solutions provide higher detection accuracy by looking at the whole picture instead of individual transactions, catching fraud patterns that traditional methods might overlook. AI can also help reduce false positives, tapping into quality data to provide context about what constitutes a legitimate transaction. And, importantly, AI and accelerated computing provide better scalability, capable of handling massive data networks to detect fraud in real time.How Financial Institutions Use AI to Detect FraudFinancial services and banking are the front lines of the battle against fraud such as identity theft, account takeover, false or illegal transactions, and check scams. Financial losses worldwide from credit card transaction fraud are expected to reach $43 billion by 2026.AI is helping enhance security and address the challenge of escalating fraud incidents.Banks and other financial service institutions can tap into NVIDIA technologies to combat fraud. For example, the NVIDIA RAPIDS Accelerator for Apache Spark enables faster data processing to handle massive volumes of transaction data. Banks and financial service institutions can also use the new NVIDIA AI workflow for fraud detection harnessing AI tools like XGBoost and graph neural networks (GNNs) with NVIDIA RAPIDS, NVIDIA Triton and NVIDIA Morpheus to detect fraud and reduce false positives.BNY Mellon improved fraud detection accuracy by 20% using NVIDIA DGX systems. PayPal improved real-time fraud detection by 10% running on NVIDIA GPU-powered inference, while lowering server capacity by nearly 8x. And Swedbank trained generative adversarial networks on NVIDIA GPUs to detect suspicious activities.US Federal Agencies Fight Fraud With AIThe United States Government Accountability Office estimates that the government loses up to $521 billion annually due to fraud, based on an analysis of fiscal years 2018 to 2022. Tax fraud, check fraud and improper payments to contractors, in addition to improper payments under the Social Security and Medicare programs have become a massive drag on the governments finances.While some of this fraud was inflated by the recent pandemic, finding new ways to combat fraud has become a strategic imperative. As such, federal agencies have turned to AI and accelerated computing to improve fraud detection and prevent improper payments.For example, the U.S. Treasury Department began using machine learning in late 2022 to analyze its trove of data and mitigate check fraud. The department estimated that AI helped officials prevent or recover more than $4 billion in fraud in fiscal year 2024.Along with the Treasury Department, agencies such as the Internal Revenue Service have looked to AI and machine learning to close the tax gap including tax fraud which was estimated at $606 billion in tax year 2022. The IRS has explored the use of NVIDIAs accelerated data science frameworks such as RAPIDS and Morpheus to identify anomalous patterns in taxpayer records, data access and common vulnerability and exposures. LLMs combined with retrieval-augmented generation and RAPIDS have also been used to highlight records that may not be in alignment with policies.How AI Can Help Healthcare Stem Potential FraudAccording to the U.S. Department of Justice, healthcare fraud, waste and abuse may account for as much as 10% of all healthcare expenditures. Other estimates have deemed that percentage closer to 3%. Medicare and Medicaid fraud could be near $100 billion. Regardless, healthcare fraud is a problem worth hundreds of billions of dollars.The additional challenge with healthcare fraud is that it can come from all directions. Unlike the IRS or the financial services industry, the healthcare industry is a fragmented ecosystem of hospital systems, insurance companies, pharmaceutical companies, independent medical or dental practices, and more. Fraud can occur at both provider and patient levels, putting pressure on the entire system.Common types of potential healthcare fraud include:Billing for services not renderedUpcoding: billing for a more expensive service than the one renderedUnbundling: multiple bills for the same serviceFalsifying recordsUsing someone elses insuranceForged prescriptionsThe same AI technologies that help combat fraud in financial services and the public sector can also be applied to healthcare. Insurance companies can use pattern and anomaly detection to look for claims that seem atypical, either from the provider or the patient, and scrutinize billing data for potentially fraudulent activity. Real-time monitoring can detect suspicious activity at the source, as its happening. And automated claims processing can help reduce human error and detect inconsistencies while improving operational efficiency.Data processing through NVIDIA RAPIDS can be combined with machine learning and GNNs or other types of AI to help better detect fraud at every layer of the healthcare system, assisting patients and practitioners everywhere dealing with high costs of care.AI for Fraud Detection Could Save Billions of DollarsFinancial services, the public sector and the healthcare industry are all using AI for fraud detection to provide a continuous defense against one of the worlds biggest drains on economic activity.The NVIDIA AI platform supports the entire fraud detection and identity verification pipeline from data preparation to model training to deployment with tools like NVIDIA RAPIDS, NVIDIA Triton Inference Server and NVIDIA Morpheus on the NVIDIA AI Enterprise software platform.Learn more about NVIDIA solutions for fraud detection with AI and accelerated computing.
    0 Comments ·0 Shares ·149 Views
  • The Future of Marketing: How AI Agents Can Enhance Customer Journeys in Retail
    blogs.nvidia.com
    AI agents which can understand, adapt to and support each users unique journey are making online shopping and digital marketing more efficient and personalized. Plus, these intelligent systems are poised to turn marketing interactions into valuable customer research data.In this episode of the NVIDIA AI Podcast, Jon Heller, co-CEO and founder of Firsthand, discusses how the companys Brand Agents are revolutionizing the relationship between consumers, marketers and publishers. By using a companys own knowledge, Firsthand Brand Agents act as AI-powered guides that engage customers on a brands website and beyond assisting at every step of the customers journey, from finding solutions to making purchases.Drawing on decades of industry experience including leadership roles at advertising companies DoubleClick and FreeWheel, Heller explains Firsthands vision of AI as a new medium rather than just a technology.The AI Podcast Firsthands Jon Heller Shares How AI Agents Enhance Consumer Journeys in Retail Episode 242AI agents are enabling companies in the retail and consumer-packaged goods (CPG) industries to increase internal efficiency and productivity while improving customer service.Two of the top use cases for generative AI in retail are: personalized marketing and advertising, and digital shopping assistants or copilots. Learn more about AIs rapid integration across businesses in NVIDIAs second annual State of AI in Retail and CPG survey report.Time Stamps2:10 How large language models revealed a new approach to digital marketing.12:46 How Firsthand Brand Agents can use various AI capabilities beyond traditional chat.16:33 How Firsthand Brand Agents create a connected customer journey by maintaining context across touchpoints.23:57 The technical challenges in building agents while maintaining brand safety.30:10 How AI can generate unprecedented insights into consumer needs and preferences.You Might Also LikeImbue CEO Kanjun Qiu on Transforming AI Agents Into Personal CollaboratorsImbue is building software infused with intelligence through collaborative AI systems that work alongside users. CEO Kanjun Qiu discusses her companys approach to AI agents and compares the personal computer revolution of the late 1970s and 80s to todays AI agent transformation.Media.Monks Lewis Smithingham on Enhancing Media and Marketing With AIMedia.Monks platform Wormhole streamlines marketing and content creation with AI-powered insights. Hear from Lewis Smithingham, senior vice president of innovation and special operations at Media.Monks, as he addresses AIs potential in entertainment and advertising.Snowflakes Baris Gultekin on Unlocking the Value of Data With Large Language ModelsSnowflake is using AI to help enterprises transform data into insights and applications. Baris Gultekin, head of AI at Snowflake, explains how the companys AI Data Cloud platform separates the storage of data from compute, enabling organizations across the world to connect via cloud technology and work on a unified platform.Subscribe to the AI PodcastGet theAI PodcastthroughAmazon Music,Apple Podcasts,Google Podcasts,Google Play,Castbox, DoggCatcher,Overcast,PlayerFM, Pocket Casts,Podbay,PodBean, PodCruncher, PodKicker,SoundCloud,Spotify,StitcherandTuneIn.
    0 Comments ·0 Shares ·141 Views
  • Into the Omniverse: OpenUSD Workflows Advance Physical AI for Robotics, Autonomous Vehicles
    blogs.nvidia.com
    Editors note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners and enterprises can transform their workflows using the latest advances in Universal Scene Description (OpenUSD) and NVIDIA Omniverse.The next frontier of AI is physical AI. Physical AI models can understand instructions and perceive, interact and perform complex actions in the real world to power autonomous machines like robots and self-driving cars.Similar to how large language models can process and generate text, physical AI models can understand the world and generate actions. To do this, these models must be trained in simulation environments to comprehend physical dynamics, like gravity, friction or inertia and understand geometric and spatial relationships, as well as the principles of cause and effect.Global leaders in software development and professional services are using NVIDIA Omniverse, powered by OpenUSD, to build new products and services that will accelerate the development of AI and controllable simulations to enable the creation of true-to-reality virtual worlds, known as digital twins, that can be used to train physical AI with unprecedented accuracy and detail.Generate Exponentially More Synthetic Data With Omniverse and NVIDIA CosmosAt CES, NVIDIA announced generative AI models and blueprints that expand Omniverse integration further into physical AI applications such as robotics, autonomous vehicles and vision AI.Among these announcements was NVIDIA Cosmos, a platform of state-of-the-art generative world foundation models, advanced tokenizers, guardrails and an accelerated video processing pipeline all designed to accelerate physical AI development.Developing physical AI models is a costly, resource- and time-intensive process that requires vast amounts of real-world data and testing. Cosmos world foundation models (WFM), which predict future world states as videos based on multimodal inputs, provide an easy way for developers to generate massive amounts of photoreal, physics-based synthetic data to train and evaluate AI for robotics, autonomous vehicles and machines. Developers can also fine-tune Cosmos WFMs to build downstream world models or improve quality and efficiency for specific physical AI use cases.When paired with Omniverse, Cosmos creates a powerful synthetic data multiplication engine. Developers can use Omniverse to create 3D scenarios, then feed the outputs into Cosmos to generate controlled videos and variations. This can drastically accelerate the development of physical AI systems such as autonomous vehicles and robots by rapidly generating exponentially more training data covering a variety of environments and interactions.OpenUSD ensures the data in these scenarios is seamlessly integrated and consistently represented, enhancing the realism and effectiveness of the simulations.Leading robotics and automotive companies, including 1X, Agile Robots, Agility Robotics, Figure AI, Foretellix, Fourier, Galbot, Hillbot, IntBot, Neura Robotics, Skild AI, Virtual Incision, Waabi and XPENG, along with ridesharing giant Uber, are among the first to adopt Cosmos.Learn more about how world foundation models will advance physical AI by listening to the NVIDIA AI Podcast episode with Ming-Yu Liu, vice president of research at NVIDIA.See Cosmos in Action for Physical AI Use CasesCosmos WFMs are revolutionizing industries by providing a unified framework for developing, training and deploying large-scale AI models across various applications. Enterprises in the automotive, industrial and robotics sectors can harness the power of generative physical AI and simulation to accelerate innovation and operational efficiency.Humanoid robots: The NVIDIA Isaac GR00T Blueprint for synthetic motion generation helps developers generate massive synthetic motion datasets to train humanoid robots using imitation learning. With GR00T workflows, users can capture human actions and use Cosmos to exponentially increase the size and variety of the dataset, making it more robust for training physical AI systems.Autonomous vehicles: Autonomous vehicle (AV) simulation powered by Omniverse Sensor RTX application programming interfaces lets AV developers replay driving data, generate new ground-truth data and perform closed-loop testing to accelerate their pipelines. With Cosmos, developers can generate synthetic driving scenarios to amplify training data by orders of magnitude, accelerating physical AI model development for autonomous vehicles. Global ridesharing giant Uber is partnering with NVIDIA to accelerate autonomous mobility. Rich driving datasets from Uber, combined with Cosmos and NVIDIA DGX Cloud, can help AV partners build stronger AI models more efficiently.Industrial settings: Mega is an Omniverse Blueprint for developing, testing and optimizing physical AI and robot fleets at scale in a USD-based digital twin before deployment in factories and warehouses. The blueprint uses Omniverse Cloud Sensor RTX APIs to simultaneously render multisensor data from any type of intelligent machine, enabling high-fidelity sensor simulation at scale. Cosmos can enhance Mega by generating synthetic edge case scenarios to amplify training data, significantly improving the robustness and efficiency of training robots in simulation. KION Group, a supply chain solutions company, is among the first to adopt Mega to drive warehouse automation in retail, consumer packaged goods, parcel services and more.Get Plugged Into the World of OpenUSDFor more on Cosmos, watch the replay of NVIDIA CEO Jensen Huangs CES keynote, and get started with Cosmos WFMs available now under an open model license on Hugging Face and the NVIDIA NGC catalog. Join the upcoming livestream on Wednesday, February 5 for a deep dive into Cosmos WFMs and physical AI workflows.Continue to optimize OpenUSD workflows with the new self-paced Learn OpenUSD curriculum for 3D developers and practitioners, available at no cost through the NVIDIA Deep Learning Institute. For more resources on OpenUSD, explore the Alliance for OpenUSD forum and the AOUSD website.Meet Cosmos, OpenUSD and physical AI experts at NVIDIA GTC, the conference for the era of AI, taking place March 17-21 at the San Jose Convention Center.Stay up to date by subscribing to NVIDIA news, joining the community, and following NVIDIA Omniverse on Instagram, LinkedIn, Medium and X.
    0 Comments ·0 Shares ·135 Views
  • NoTraffic Reduces Road Delays, Carbon Emissions With NVIDIA AI and Accelerated Computing
    blogs.nvidia.com
    More than 90 million new vehicles are introduced to roads across the globe every year, leading to an annual 12% increase in traffic congestion according to NoTraffic, a member of the NVIDIA Inception program for cutting-edge startups and the NVIDIA Metropolis vision AI ecosystem.Still, 99% of the worlds traffic signals run on fixed timing plans, leading to unnecessary congestion and delays.To reduce such inefficiencies, mitigate car accidents and reduce carbon emissions from vehicles, NoTraffics AI Mobility platform predicts road scenarios, helps ensure continuous traffic flow, minimizes stops and optimizes safety at intersections across the U.S., Canada and elsewhere.The platform which enables road infrastructure management at both local-intersection and city-grid scale integrates NVIDIA-powered software and hardware at the edge, under a cloud-based operating system.Its built using the NVIDIA Jetson edge AI platform, NVIDIA accelerated computing and the NVIDIA Metropolis vision AI developer stack.With NVIDIA accelerated computing, we achieved a 3x speedup in AI training and doubled AI Mobilitys energy efficiency, said Uriel Katz, cofounder and chief technology officer of NoTraffic. These optimizations in time, money and energy efficiency are all bolstered by NVIDIA Jetson, which sped our image preprocessing tasks by 40x compared with a CPU-only workflow. Plus, GPU-accelerated NVIDIA CUDA libraries increased our model throughput by 30x.These libraries include the NVIDIA TensorRT ecosystem of application programming interfaces for high-performance deep learning inference and the NVIDIA cuDNN library of primitives for deep neural networks.Taming Traffic in Tuscon, Vancouver and BeyondIn Tuscon, Arizona, more than 80 intersections are tapping into the NoTraffic AI Mobility platform, which has enabled up to a 46% reduction in road delays during rush hours and a half-mile reduction in peak queue length.The work is an expansion of NoTraffics initial deployment on Tuscons West Ajo Way. That effort led to an average delay reduction of 23% for drivers.Since installation, NoTraffic technology has helped free Tucson drivers from over 1.25 million hours stuck in traffic, the company estimates, representing an economic benefit of over $24.3 million. The company has also tracked a nearly 80% reduction in red-light runners since its platform was deployed, helping improve safety at Tucson intersections.By reducing travel times, drivers have also saved over $1.6 million in gas, cutting emissions and improving air quality to make the equivalent impact of planting 650,000 trees.In Vancouver, Canada, the University of British Columbia (UBC) is using the NoTraffic platform and Rogers Communications 5G-connected, AI-enabled smart-traffic platform to reduce both pedestrian delays and greenhouse gas emissions.Rogers Communications 5G networks provide robust and stable connectivity to the sensors embedded on the traffic poles.This advanced network infrastructure enhances the NoTraffic platforms efficacy and scalability, as the improved speed and reduced latency of 5G networks means traffic data can be processed in real time. This is critical for predicting numerous potential traffic scenarios, adjusting signal timings and prioritizing road users accordingly.With AI Mobility deployed at seven intersections across the campus, the university experienced an up to 40% reduction in pedestrian delays and significant decreases in vehicle wait time.In addition, UBC reduces 74 tons of carbon dioxide emissions each year thanks to the NoTraffic and Rogers solution, which is powered by NVIDIA edge AI and accelerated computing.The platform is also in action on the roads of Phoenix, Arizona; Baltimore, Maryland; and in 35 states through 200+ agencies across the U.S. and Canada.Honk If You Love Reducing Congestion, Carbon EmissionsThe NoTraffic AI Mobility platform offers local AI-based predictions that, based on sensor inputs at multiple intersections, analyze numerous traffic scenarios up to two minutes in advance.It can adapt to real-time changes in traffic patterns and volumes, send messages between intersections and run optimization algorithms that control traffic signals to improve overall transportation efficiency and safety through cloud connectivity.Speedups in the AI Mobility platform mean quicker optimizations of traffic signals and reduced congestion on the roads means reduced carbon emissions from vehicles.NoTraffic estimates that for every city optimized with this platform, eight hours of traffic time could be saved per driver. Plus, with over 300,000 signalized intersections in the U.S., the company says this could result in a total of $14 billion in economic savings per year.Learn more about the NVIDIA Metropolis platform and how its used in smart cities and spaces.
    0 Comments ·0 Shares ·162 Views
  • Fantastic Four-ce Awakens: Season One of Marvel Rivals Joins GeForce NOW
    blogs.nvidia.com
    Time to suit up, members. The multiverse is about to get a whole lot cloudier as GeForce NOW opens a portal to the first season of hit game Marvel Rivals from NetEase Games.Members can now game in a new dimension with expanded support for virtual- and mixed-reality devices. This weeks GeForce NOW app update 2.0.70 begins rolling out compatibility for Apple Vision Pro spatial computers, Meta Quest 3 and 3S, and Pico 4 and 4 Ultra devices.Plus, no GFN Thursday is complete without new games. Get ready for seven new titles joining the cloud this week, including multiplayer online battle arena game SMITE 2.Invisible No MoreSink your teeth into the Fantastic Four.Eternal night falls for Marvel Rivals, the superhero, team-based player vs. player shooter that lets players assemble an ever-evolving all-star squad of Super Heroes and Super Villains battling with unique powers across a dynamic lineup of destructible maps from the Marvel Multiverse.The Fantastic Four will be playable in season one of the game. For Eternal Night Falls, Invisible Woman and Mister Fantastic will be released in the first half of the season, followed by Human Torch and The Thing in the second. Season one will also feature three new maps, special events and an all-new Doom Match game mode.Stream it all with a GeForce NOW membership across devices, from an underpowered laptop, Mac devices, a Steam Deck or the supported platform of virtual- and mixed-reality devices.Head in the CloudsHeadset on, latency gone.The latest GeForce NOW app update is expanding cloud streaming capabilities to Apple Vision Pro spatial computers, Meta Quest 3 and 3S, and Pico 4 and 4 Ultra virtual- and mixed-reality headsets starting this week.These newly supported devices will give members access to an extensive library of games to stream through GeForce NOW. Members can gain access by visiting play.geforcenow.com or via the Android-native client on the PICO store. The rollout will be complete on Tuesday, Jan. 21.Members will be able to transform their space into a personal gaming theater by playing, on massive virtual screens, their favorite PC games, such as the latest season of Marvel Rivals, Dragon Age and more. With access to NVIDIA technologies, including ray tracing and NVIDIA DLSS on supported games, these devices now provide an enhanced visual experience with the highest frame rates and lowest latency.Here Comes The NewBecome a god and wage war.SMITE 2 is now free to play and has brought a huge update to mark the start of open beta. New god Aladdin joins, along with SMITE 1 fan favourites Geb, Agni, Mulan and Ullr bringing the total god roster to 45. Twenty of the gods now feature Aspects an optional spin on each gods ability kit that opens up even more strategic options. The 3v3 mode Joust has also arrived, featuring a brand-new, Arthurian-themed map. Assault and Duel game modes are also available. Finally, the Conquest mode brings a wealth of updates to the map, features and balance.Hyper Light Breaker (New release on Steam, Jan. 14)Aloft (New release on Steam, Jan. 15)Assetto Corsa EVO (New release on Steam, Jan. 16)Generation Zero (Xbox, available on PC Game Pass)HOT WHEELS UNLEASHED 2 Turbocharged (Xbox, available on PC Game Pass)SMITE 2 (Steam)Voidwrought (Steam)What are you planning to play this weekend? Let us know on X or in the comments below.Which role is your main? Tank Damage Support NVIDIA GeForce NOW (@NVIDIAGFN) January 15, 2025
    0 Comments ·0 Shares ·162 Views
  • NVIDIA Releases NIM Microservices to Safeguard Applications for Agentic AI
    blogs.nvidia.com
    AI agents are poised to transform productivity for the worlds billion knowledge workers with knowledge robots that can accomplish a variety of tasks. To develop AI agents, enterprises need to address critical concerns like trust, safety, security and compliance.New NVIDIA NIM microservices for AI guardrails part of the NVIDIA NeMo Guardrails collection of software tools are portable, optimized inference microservices that help companies improve the safety, precision and scalability of their generative AI applications.Central to the orchestration of the microservices is NeMo Guardrails, part of the NVIDIA NeMo platform for curating, customizing and guardrailing AI. NeMo Guardrails helps developers integrate and manage AI guardrails in large language model (LLM) applications. Industry leaders Amdocs, Cerence AI and Lowes are among those using NeMo Guardrails to safeguard AI applications.Developers can use the NIM microservices to build more secure, trustworthy AI agents that provide safe, appropriate responses within context-specific guidelines and are bolstered against jailbreak attempts. Deployed in customer service across industries like automotive, finance, healthcare, manufacturing and retail, the agents can boost customer satisfaction and trust.One of the new microservices, built for moderating content safety, was trained using the Aegis Content Safety Dataset one of the highest-quality, human-annotated data sources in its category. Curated and owned by NVIDIA, the dataset is publicly available on Hugging Face and includes over 35,000 human-annotated data samples flagged for AI safety and jailbreak attempts to bypass system restrictions.NVIDIA NeMo Guardrails Keeps AI Agents on TrackAI is rapidly boosting productivity for a broad range of business processes. In customer service, its helping resolve customer issues up to 40% faster. However, scaling AI for customer service and other AI agents requires secure models that prevent harmful or inappropriate outputs and ensure the AI application behaves within defined parameters.NVIDIA has introduced three new NIM microservices for NeMo Guardrails that help AI agents operate at scale while maintaining controlled behavior:Content safety NIM microservice that safeguards AI against generating biased or harmful outputs, ensuring responses align with ethical standards.Topic control NIM microservice that keeps conversations focused on approved topics, avoiding digression or inappropriate content.Jailbreak detection NIM microservice that adds protection against jailbreak attempts, helping maintain AI integrity in adversarial scenarios.By applying multiple lightweight, specialized models as guardrails, developers can cover gaps that may occur when only more general global policies and protections exist as a one-size-fits-all approach doesnt properly secure and control complex agentic AI workflows.Small language models, like those in the NeMo Guardrails collection, offer lower latency and are designed to run efficiently, even in resource-constrained or distributed environments. This makes them ideal for scaling AI applications in industries such as healthcare, automotive and manufacturing, in locations like hospitals or warehouses.Industry Leaders and Partners Safeguard AI With NeMo GuardrailsNeMo Guardrails, available to the open-source community, helps developers orchestrate multiple AI software policies called rails to enhance LLM application security and control. It works with NVIDIA NIM microservices to offer a robust framework for building AI systems that can be deployed at scale without compromising on safety or performance.Amdocs, a leading global provider of software and services to communications and media companies, is harnessing NeMo Guardrails to enhance AI-driven customer interactions by delivering safer, more accurate and contextually appropriate responses.Technologies like NeMo Guardrails are essential for safeguarding generative AI applications, helping make sure they operate securely and ethically, said Anthony Goonetilleke, group president of technology and head of strategy at Amdocs. By integrating NVIDIA NeMo Guardrails into our amAIz platform, we are enhancing the platforms Trusted AI capabilities to deliver agentic experiences that are safe, reliable and scalable. This empowers service providers to deploy AI solutions safely and with confidence, setting new standards for AI innovation and operational excellence.Cerence AI, a company specializing in AI solutions for the automotive industry, is using NVIDIA NeMo Guardrails to help ensure its in-car assistants deliver contextually appropriate, safe interactions powered by its CaLLM family of large and small language models.Cerence AI relies on high-performing, secure solutions from NVIDIA to power our in-car assistant technologies, said Nils Schanz, executive vice president of product and technology at Cerence AI. Using NeMo Guardrails helps us deliver trusted, context-aware solutions to our automaker customers and provide sensible, mindful and hallucination-free responses. In addition, NeMo Guardrails is customizable for our automaker customers and helps us filter harmful or unpleasant requests, securing our CaLLM family of language models from unintended or inappropriate content delivery to end users.Lowes, a leading home improvement retailer, is leveraging generative AI to build on the deep expertise of its store associates. By providing enhanced access to comprehensive product knowledge, these tools empower associates to answer customer questions, helping them find the right products to complete their projects and setting a new standard for retail innovation and customer satisfaction.Were always looking for ways to help associates to above and beyond for our customers, said Chandhu Nair, senior vice president of data, AI and innovation at Lowes. With our recent deployments of NVIDIA NeMo Guardrails, we ensure AI-generated responses are safe, secure and reliable, enforcing conversational boundaries to deliver only relevant and appropriate content.To further accelerate AI safeguards adoption in AI application development and deployment in retail, NVIDIA recently announced at the NRF show that its NVIDIA AI Blueprint for retail shopping assistants incorporates NeMo Guardrails microservices for creating more reliable and controlled customer interactions during digital shopping experiences.Consulting leaders Taskus, Tech Mahindra and Wipro are also integrating NeMo Guardrails into their solutions to provide their enterprise clients safer, more reliable and controlled generative AI applications.NeMo Guardrails is open and extensible, offering integration with a robust ecosystem of leading AI safety model and guardrail providers, as well as AI observability and development tools. It supports integration with ActiveFences ActiveScore, which filters harmful or inappropriate content in conversational AI applications, and provides visibility, analytics and monitoring.Hive, which provides its AI-generated content detection models for images, video and audio content as NIM microservices, can be easily integrated and orchestrated in AI applications using NeMo Guardrails.The Fiddler AI Observability platform easily integrates with NeMo Guardrails to enhance AI guardrail monitoring capabilities. And Weights & Biases, an end-to-end AI developer platform, is expanding the capabilities of W&B Weave by adding integrations with NeMo Guardrails microservices. This enhancement builds on Weights & Biases existing portfolio of NIM integrations for optimized AI inferencing in production.NeMo Guardrails Offers Open-Source Tools for AI Safety TestingDevelopers ready to test the effectiveness of applying safeguard models and other rails can use NVIDIA Garak an open-source toolkit for LLM and application vulnerability scanning developed by the NVIDIA Research team.With Garak, developers can identify vulnerabilities in systems using LLMs by assessing them for issues such as data leaks, prompt injections, code hallucination and jailbreak scenarios. By generating test cases involving inappropriate or incorrect outputs, Garak helps developers detect and address potential weaknesses in AI models to enhance their robustness and safety.AvailabilityNVIDIA NeMo Guardrails microservices, as well as NeMo Guardrails for rail orchestration and the NVIDIA Garak toolkit, are now available for developers and enterprises. Developers can get started building AI safeguards into AI agents for customer service using NeMo Guardrails with this tutorial.See notice regarding software product information.
    0 Comments ·0 Shares ·141 Views
  • How AI Is Enhancing Surgical Safety and Education
    blogs.nvidia.com
    Troves of unwatched surgical video footage are finding new life, fueling AI tools that help make surgery safer and enhance surgical education. The Surgical Data Science Collective (SDSC) is transforming global surgery through AI-driven video analysis, helping to close the gaps in surgical training and practice.In this episode of the NVIDIA AI Podcast, Margaux Masson-Forsythe, director of machine learning at SDSC, discusses the unique challenges of doing AI research as a nonprofit, how the collective distills insights from massive amounts of video data and ways AI can help address the stark reality that five billion people still lack access to safe surgery.The AI Podcast How SDSC Uses AI to Transform Surgical Training and Practice Episode 241Learn more about SDSC, and hear more about the future of AI in healthcare by listening to the J.P. Morgan Healthcare Conference talk by Kimberly Powell, vice president of healthcare at NVIDIA.Time Stamps8:01 What are the opportunities and challenges of analyzing surgical videos?12:50 Masson-Forsythe on trying new models and approaches to stay on top of the field.18:14 How does a nonprofit approach conducting AI research?24:05 How the community can get involved with SDSC.You Might Also LikeCofounder of Annalise.ai Aengus Tran on Using AI as a Spell Check for Health ChecksHarrison.ai has developed annalise.ai, an AI system that automates radiology image analysis to improve diagnosis speed and accuracy, and is now working on Franklin.ai to enhance histopathology diagnosis. CEO Aengus Tran emphasizes the importance of using AI in healthcare to reduce misdiagnoses and improve patient outcomes.Matice Founder Jessica Whited on Harnessing Regenerative Species for Medical BreakthroughsScientists at Matice Biosciences, cofounded by regenerative biologist Jessica Whited, are using AI to study the tissue regeneration capabilities of animals like salamanders and planarians, with the goal of developing treatments to help humans heal from injuries without scarring.Cardiac Clarity: Dr. Keith Channon Talks Revolutionizing Heart Health With AI Caristo Diagnostics has developed an AI-powered solution called Caristo that detects coronary inflammation in cardiac CT scans by analyzing radiometric features in the surrounding fat tissue, helping physicians improve treatment plans and risk predictions.Subscribe to the AI PodcastGet the AI Podcast through Amazon Music, Apple Podcasts, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, SoundCloud, Spotify, Stitcher and TuneIn.
    0 Comments ·0 Shares ·146 Views
  • x.com
    RTNVIDIA GeForceNVIDIA's Bryan Catanzaro and Edward Liu walkthrough new capabilities & improved technologies in DLSS 4: New Multi Frame Generation for RTX 50 Series Improved Frame Generation for RTX 40 & 50 Series Enhanced Ray Reconstruction, Super Resolution, and DLAA for all RTX GPUs
    0 Comments ·0 Shares ·111 Views
  • x.com
    WIN THE NEXT GENERATION OF RTX The #GeForceRTX50 sweepstakes is here & we're giving you multiple chances to WIN a GeForce RTX 5090!Here's your first chance to enter: Like this post Comment #GeForceRTX50
    0 Comments ·0 Shares ·113 Views
More Stories