• VENTUREBEAT.COM
    A new, open source text-to-speech model called Dia has arrived to challenge ElevenLabs, OpenAI and more
A two-person startup by the name of Nari Labs has introduced Dia, a 1.6 billion parameter text-to-speech (TTS) model designed to produce naturalistic dialogue directly from text prompts, and one of its creators claims it surpasses the performance of competing proprietary offerings from the likes of ElevenLabs and Google’s hit NotebookLM AI podcast generation feature. It could also threaten uptake of OpenAI’s recent gpt-4o-mini-tts.

“Dia rivals NotebookLM’s podcast feature while surpassing ElevenLabs Studio and Sesame’s open model in quality,” said Toby Kim, one of the co-creators of Nari and Dia, in a post on the social network X. In a separate post, Kim noted that the model was built with “zero funding,” adding across a thread: “…we were not AI experts from the beginning. It all started when we fell in love with NotebookLM’s podcast feature when it was released last year. We wanted more—more control over the voices, more freedom in the script. We tried every TTS API on the market. None of them sounded like real human conversation.” Kim further credited Google for giving him and his collaborator access to the company’s Tensor Processing Unit (TPU) chips for training Dia through Google’s Research Cloud.

Dia’s code and weights (the model’s internal parameters) are now available for download and local deployment by anyone from Hugging Face or GitHub. Individual users can try generating speech with it on a Hugging Face Space.

Advanced controls and more customizable features

Dia supports nuanced features like emotional tone, speaker tagging, and nonverbal audio cues, all from plain text. Users can mark speaker turns with tags like [S1] and [S2], and include cues like (laughs), (coughs), or (clears throat) to enrich the resulting dialogue with nonverbal behaviors.
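The speaker-tag and nonverbal-cue prompt format described above can be sketched in a few lines of Python. `build_dia_script` is a hypothetical helper written for illustration, not part of Nari Labs’ library; only the [S1]/[S2] speaker tags and the parenthesized cues come from Dia’s documented prompt format.

```python
def build_dia_script(turns):
    """Assemble a Dia-style dialogue script from (speaker, text, cue) turns.

    Hypothetical helper for illustration; only the [S1]/[S2] tags and
    parenthesized nonverbal cues like (laughs) follow Dia's prompt format.
    """
    lines = []
    for speaker, text, cue in turns:
        tag = f"[S{speaker}]"              # speaker turn marker, e.g. [S1]
        cue_part = f" ({cue})" if cue else ""  # optional nonverbal cue
        lines.append(f"{tag} {text}{cue_part}")
    return " ".join(lines)

script = build_dia_script([
    (1, "Did you hear about the new open source TTS model?", None),
    (2, "I did, and the demo actually laughed at my joke.", "laughs"),
])
print(script)
# [S1] Did you hear about the new open source TTS model? [S2] I did, and the demo actually laughed at my joke. (laughs)
```

The resulting string is what a user would paste into Dia’s generation prompt.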
These tags are correctly interpreted by Dia during generation, something not reliably supported by other available models, according to the company’s examples page. The model is currently English-only and not tied to any single speaker’s voice; it produces different voices per run unless users fix the generation seed or provide an audio prompt. Audio conditioning, or voice cloning, lets users guide speech tone and voice likeness by uploading a sample clip. Nari Labs offers example code to facilitate this process and a Gradio-based demo so users can try it without setup.

Comparison with ElevenLabs and Sesame

Nari offers a host of example audio files generated by Dia on its Notion site, comparing it to other leading text-to-speech rivals, specifically ElevenLabs Studio and Sesame CSM-1B, the latter a new text-to-speech model from Oculus VR headset co-creator Brendan Iribe that went somewhat viral on X earlier this year. Side-by-side examples shared by Nari Labs show Dia outperforming the competition in several areas:

In standard dialogue scenarios, Dia handles both natural timing and nonverbal expressions better. For example, in a script ending with (laughs), Dia interprets and delivers actual laughter, whereas ElevenLabs and Sesame output textual substitutions like “haha.”

In multi-turn conversations with emotional range, Dia demonstrates smoother transitions and tone shifts. One test included a dramatic, emotionally charged emergency scene. Dia rendered the urgency and speaker stress effectively, while competing models often flattened delivery or lost pacing.

Dia uniquely handles nonverbal-only scripts, such as a humorous exchange involving coughs, sniffs, and laughs. Competing models failed to recognize these tags or skipped them entirely.

Even with rhythmically complex content like rap lyrics, Dia generates fluid, performance-style speech that maintains tempo.
This contrasts with more monotone or disjointed outputs from ElevenLabs and Sesame’s 1B model.

Using audio prompts, Dia can extend or continue a speaker’s voice style into new lines. An example using a conversational clip as a seed showed how Dia carried vocal traits from the sample through the rest of the scripted dialogue. This feature isn’t robustly supported in other models.

In one set of tests, Nari Labs noted that Sesame’s best website demo likely used an internal 8B version of the model rather than the public 1B checkpoint, resulting in a gap between advertised and actual performance.

Model access and tech specs

Developers can access Dia from Nari Labs’ GitHub repository and its Hugging Face model page. The model runs on PyTorch 2.0+ with CUDA 12.6 and requires about 10GB of VRAM. Inference on enterprise-grade GPUs like the NVIDIA A4000 delivers roughly 40 tokens per second. While the current version only runs on GPU, Nari plans to offer CPU support and a quantized release to improve accessibility. The startup offers both a Python library and a CLI tool to further streamline deployment.

Dia’s flexibility opens use cases from content creation to assistive technologies and synthetic voiceovers. Nari Labs is also developing a consumer version of Dia aimed at casual users looking to remix or share generated conversations. Interested users can sign up via email to a waitlist for early access.

Fully open source

The model is distributed under a fully open source Apache 2.0 license, which means it can be used for commercial purposes, something that will obviously appeal to enterprises and indie app developers. Nari Labs explicitly prohibits usage that includes impersonating individuals, spreading misinformation, or engaging in illegal activities. The team encourages responsible experimentation and has taken a stance against unethical deployment.
Dia’s development credits support from the Google TPU Research Cloud, Hugging Face’s ZeroGPU grant program, and prior work on SoundStorm, Parakeet, and the Descript Audio Codec. Nari Labs itself comprises just two engineers, one full-time and one part-time, but the team actively invites community contributions through its Discord server and GitHub. With a clear focus on expressive quality, reproducibility, and open access, Dia adds a distinctive new voice to the landscape of generative speech models.
  • VENTUREBEAT.COM
    Windsurf: OpenAI’s potential $3B bet to drive the ‘vibe coding’ movement
    A Windsurf deal would allow OpenAI to own more of the full-stack coding experience (and it would be its most expensive acquisition to date).
  • VENTUREBEAT.COM
    Google’s Gemini 2.5 Flash introduces ‘thinking budgets’ that cut AI costs by 600% when turned down
    Google's new Gemini 2.5 Flash AI model introduces adjustable "thinking budgets" that let businesses pay only for the reasoning power they need, balancing advanced capabilities with cost efficiency.
  • WWW.THEVERGE.COM
    OpenAI tells judge it would buy Chrome from Google
    If Google is forced to sell off Chrome, ChatGPT’s head of product told a judge today that OpenAI would be interested in buying the browser, Reuters reports. Google breaking off Chrome is a proposed remedy by the US Department of Justice in US v. Google, in which Judge Amit Mehta ruled last year that the company is a monopolist in online search. The remedies phase of the trial began on Monday. Google plans to appeal the ruling. The OpenAI exec, Nick Turley, also testified that OpenAI had contacted Google last year about a potential partnership that would allow ChatGPT to use Google’s search technology. ChatGPT can pull from Bing’s search information, and while Turley apparently did not specifically discuss Microsoft, he noted that OpenAI has had “significant quality issues” with a company referred to as “Provider No. 1,” according to Bloomberg. “We believe having multiple partners, and in particular Google’s API, would enable us to provide a better product to users,” OpenAI said in an email shown at the trial, Reuters says. Google chose not to partner with OpenAI, and Turley said that “we have no partnership with Google today.” OpenAI has also been working on its own search index, and while OpenAI originally wanted to have ChatGPT use it for 80 percent of searches by the end of 2025, the company now believes reaching that milestone will take years, Turley testified, according to Bloomberg.
  • WWW.THEVERGE.COM
    The Oscars officially don’t care if films use AI
The Academy of Motion Picture Arts and Sciences acknowledged the existence of generative AI yesterday in new rule changes for its annual Oscars awards ceremony. Rather than dictate its use or require disclosures, the Academy simply says using AI doesn’t, on its own, hurt a movie’s chances — but that how it’s used could. Here’s what the Academy says in a passage added to its film eligibility guidelines:

“With regard to Generative Artificial Intelligence and other digital tools used in the making of the film, the tools neither help nor harm the chances of achieving a nomination. The Academy and each branch will judge the achievement, taking into account the degree to which a human was at the heart of the creative authorship when choosing which movie to award.”

As The New York Times notes, the organization almost went further by requiring filmmakers to disclose whether they used AI in creating a movie. The mention of AI is a first for the Academy’s rules, as the Times writes, and a significant one given the lengthy Hollywood actor and writer strikes that started in 2023 and were prompted, in part, by the rise of the technology and its perceived threat to creative workers in the industry.

The Academy didn’t just address AI with the new rule changes. Another new rule states that members are only eligible to participate in the final round of voting if they’ve watched all of the films being considered for a given category. But as the Times notes, it’s an honor-system requirement: voters self-certify that they did so and don’t have to prove it beyond that.
  • TOWARDSDATASCIENCE.COM
    How to Get Performance Data from Power BI with DAX Studio
Introduction

To put things straight: I will not discuss how to optimize DAX code today. More articles will follow, concentrating on common mistakes and how to avoid them. But before we can understand the performance metrics, we need to understand the architecture of the Tabular model in Power BI. The same architecture applies to Tabular models in SQL Server Analysis Services.

Any Tabular model has two engines:

- Storage Engine
- Formula Engine

These two have distinct properties and fulfill different tasks in a Tabular model. Let’s investigate them.

Storage Engine

The Storage Engine is the interface between the DAX query and the data stored in the Tabular model. It takes any given DAX query and sends queries to the VertiPaq storage engine, which stores the data in the data model. The Storage Engine uses a language called xmSQL to query the data model. This language is based on standard SQL but has fewer capabilities and supports only simple arithmetic operators (+, -, /, *, =, <>, and IN). To aggregate data, xmSQL supports SUM, MIN, MAX, COUNT, and DCOUNT (distinct count). It also supports GROUP BY, WHERE, and JOINs. It helps to have a basic understanding of SQL queries when you try to understand xmSQL. If you don’t know SQL, it will be helpful to learn the basics when digging deeper into analyzing poorly performing DAX code.

The most important fact is that the Storage Engine is multi-threaded. Therefore, when the Storage Engine executes a query, it will use multiple CPU cores to speed up query execution. Lastly, the Storage Engine can cache queries and their results. Consequently, repeated execution of the same query is faster because the result can be retrieved from the cache.

Formula Engine

The Formula Engine is the DAX engine. All functions that the Storage Engine cannot execute are executed by the Formula Engine.
Usually, the Storage Engine retrieves the data from the data model and passes the result to the Formula Engine. This operation is called materialization, as the data is stored in memory to be processed by the Formula Engine. As you can imagine, it is crucial to avoid large materializations.

The Storage Engine can call the Formula Engine when an xmSQL query contains functions that the Storage Engine cannot execute. This operation is called CallbackDataID and should be avoided if possible.

Crucially, the Formula Engine is single-threaded and has no cache. This means:

- No parallelism by using multiple CPU cores
- No re-use of results when the same query is executed repeatedly

This means we want to offload as many operations as possible to the Storage Engine. Unfortunately, it is impossible to directly define which part of our DAX code is executed by which engine. We must avoid specific patterns to ensure that the correct engine completes the work in the least amount of time. And this is another story that can fill entire books. But how can we see how much time is used by each engine?

Getting the performance data

We need DAX Studio on our machine to get performance metrics. The download link for DAX Studio is in the References section below. If you cannot install the software, you can get a portable version of DAX Studio from the same site. Download the ZIP file and unpack it in any local folder. Then you can start DAXStudio.exe, and you get all features without limitations.

But first, we need to get the DAX query from Power BI.
First, we need to start Performance Analyzer in Power BI Desktop:

Figure 1 – Start Performance Analyzer in Power BI Desktop (Figure by the Author)

As soon as we see the Performance Analyzer pane, we can start recording the performance data and the DAX query for all visuals:

Figure 2 – Start recording of Performance data and DAX query (Figure by the Author)

First, we must click on Start Recording, then click on “Refresh Visuals” to restart the rendering of all visuals on the current page. We can click on one of the rows in the list and notice that the corresponding visual is also activated. When we expand one of the rows in the report, we see a few rows and a link to copy the DAX query to the clipboard.

Figure 3 – Select the Visual and copy the query (Figure by the Author)

As we can see, Power BI needed 80,606 milliseconds to complete the rendering of the Matrix visual. The DAX query alone took 80,194 milliseconds. This is a very poorly performing measure used in this visual.

Now we can start DAX Studio. If we have DAX Studio installed on our machine, we will find it in the External Tools ribbon:

Figure 4 – Start DAX Studio as an External Tool (Figure by the Author)

DAX Studio will automatically connect to the Power BI Desktop file. If we have to start DAX Studio manually, we can connect to the Power BI file manually as well:

Figure 5 – Manually connect DAX Studio to Power BI Desktop (Figure by the Author)

After the connection is established, an empty query is opened in DAX Studio. In the bottom part of the DAX Studio window, you will see a Log section showing what happens. But before pasting the DAX query from Power BI Desktop, we have to start Server Timings in DAX Studio (top right corner of the DAX Studio window):

Figure 6 – Start Server Timings in DAX Studio (Figure by the Author)

After pasting the query into the empty editor, we have to enable the “Clear on Run” button and execute the query.
Figure 7 – Enabling “Clear on Run” Feature (Figure by the Author)

“Clear on Run” ensures the Storage Engine cache is cleared before executing the query. Clearing the cache before measuring performance metrics is best practice, as it ensures a consistent starting point for the measurement. After executing the query, we get a Server Timings page at the bottom of the DAX Studio window:

Figure 8 – Server Timings Window in DAX Studio (Figure by the Author)

Now we see a lot of information, which we will explore next.

Interpreting the data

On the left side of Server Timings, we see the execution timings:

Figure 9 – Execution Timings (Figure by the Author)

Here we see the following numbers:

- Total – The total execution time in milliseconds (ms)
- SE CPU – The sum of the CPU time spent by the Storage Engine (SE) to execute the query. Usually, this number is greater than the Total time because of parallel execution on multiple CPU cores
- FE – The time spent by the Formula Engine (FE) and its percentage of the total execution time
- SE – The time spent by the Storage Engine (SE) and its percentage of the total execution time
- SE Queries – The number of Storage Engine queries needed for the DAX query
- SE Cache – The use of the Storage Engine cache, if any

As a rule of thumb: the larger the percentage of Storage Engine time compared to Formula Engine time, the better.

The middle section shows a list of Storage Engine queries:

Figure 10 – List of Storage Engine queries (Figure by the Author)

This list shows how many SE queries have been executed for the DAX query and includes some statistical columns:

- Line – Index line. Usually, we will not see all the lines, but we can show them all by clicking the Cache and Internal buttons in the top right corner of the Server Timings pane. We will not find them very useful, though, as they are an internal representation of the visible queries.
Sometimes it can be helpful to see the Cache queries to understand which part of the query has been accelerated by the SE cache.
- Subclass – Normally “Scan”
- Duration – Time spent for each SE query
- CPU – CPU time spent for each SE query
- Par. – Parallelism of each SE query
- Rows and KB – Size of the materialization by the SE query
- Waterfall – Timing sequence of the SE queries
- Query – The beginning of each SE query

In this case, the first SE query returned 12,527,422 rows to the Formula Engine (the number of rows in the entire fact table), using 1 GB of memory. This is not good, as large materializations like these are performance killers. This is a clear sign that we made a big mistake in our DAX code.

Lastly, we can read the actual xmSQL code:

Figure 11 – Storage Engine Query Code (Figure by the Author)

Here we can see the xmSQL code and try to understand the problem with the DAX query. In this case, we see a highlighted CallbackDataID. DAX Studio highlights every CallbackDataID in the query text and shows in bold any query in the query list that contains one.

We can see that, in this case, an IF() function is pushed to the Formula Engine (FE), as the SE cannot process this function. But the SE knows that the FE can do it, so it calls the FE for each row in the result: in this case, over 12 million times. As we can see from the timing, this takes a lot of time.

Now we know that we have written bad DAX code and that the SE calls the FE many times to execute a DAX function. We also know that we use 1 GB of RAM to execute the query. Moreover, the parallelism is only 1.9 times, which could be much better.

What it should look like

The DAX query contains only the query created by Power BI Desktop. But in most cases, we need the code of the measure.
DAX Studio offers a feature called “Define Measures” to get the DAX code of the measure:

1. Add one or two blank lines in the query
2. Place the cursor on the first (empty) line
3. Find the measure in the data model
4. Right-click on the measure and click on Define Measure

Figure 12 – Define Measure in DAX Studio (Figure by the Author)

5. If our measure calls another measure, we can click on Define Dependent Measures instead. In this case, DAX Studio extracts the code of all measures used by the selected measure

The result is a DEFINE statement followed by one or more MEASURE statements containing the DAX code of our guilty measure. After optimizing the code, I executed the new query and took the Server Timings to compare them to the original data:

Figure 13 – Comparing slow and fast DAX code (Figure by the Author)

Now the entire query took only 55 ms, and the SE created a materialization of only 19 rows. The parallelism is at 2.6 times, which is better than the 1.9 times before. It looks like the SE didn’t need that much processing power to increase parallelism. This is a very good sign. Looking at these numbers, the optimization worked very well.

Conclusion

When we have a slow visual in a Power BI report, we need some information. The first step is to use Performance Analyzer in Power BI Desktop to see where time is spent rendering the result of the visual. When we see that executing the DAX query takes much time, we need DAX Studio to find the problem and try to fix it.

I didn’t cover any methods to optimize DAX in this article, as that wasn’t my aim. But now that I have laid the foundation for getting and understanding the performance metrics available in DAX Studio, I can write further articles showing how to optimize DAX code, what you should avoid, and why. I’m looking forward to the journey with you.
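The Server Timings arithmetic discussed above (FE time as the remainder of the total, parallelism as SE CPU over SE wall-clock time) can be sketched in Python. The helper and the numbers below are illustrative only, not measurements from this article:

```python
def server_timing_summary(total_ms, se_ms, se_cpu_ms):
    """Derive the quantities DAX Studio's Server Timings pane reports.

    total_ms:  total query duration
    se_ms:     wall-clock time spent in the Storage Engine (SE)
    se_cpu_ms: summed CPU time across SE threads; can exceed se_ms
               because the SE is multi-threaded
    """
    fe_ms = total_ms - se_ms  # the Formula Engine accounts for the rest
    return {
        "FE ms": fe_ms,
        "FE %": round(100 * fe_ms / total_ms, 1),
        "SE %": round(100 * se_ms / total_ms, 1),
        "SE parallelism": round(se_cpu_ms / se_ms, 1),  # the "x N.N" factor
    }

# Illustrative numbers only (not this article's measurements):
print(server_timing_summary(total_ms=1000, se_ms=400, se_cpu_ms=760))
# {'FE ms': 600, 'FE %': 60.0, 'SE %': 40.0, 'SE parallelism': 1.9}
```

With these made-up inputs, 60% of the time sits in the single-threaded Formula Engine, which is exactly the pattern the rule of thumb above warns against.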
References

Download DAX Studio for free here: https://www.sqlbi.com/tools/dax-studio/

Free SQLBI tools training: DAX Tools Video Course – SQLBI. SQLBI offers DAX optimization training as well.

I use the Contoso sample dataset, as in my previous articles. You can download the ContosoRetailDW dataset for free from Microsoft here. The Contoso data can be freely used under the MIT License, as described here.

The post How to Get Performance Data from Power BI with DAX Studio appeared first on Towards Data Science.
  • WWW.YOUTUBE.COM
    ChatGPT Opens A Research Lab…For $2!
  • WWW.YOUTUBE.COM
    NVIDIA’s New AI: The Age of Real Time Game Making Is Here!