WWW.FXGUIDE.COM
Google's NotebookLM: everyone has a podcast about AI, even AI itself
NotebookLM is a personal AI research assistant powered by Google's LLM, Gemini 1.5 Pro. If you upload a document or PDF to NotebookLM, it can not only summarise the information, as Llama 3 or ChatGPT does, but also produce a podcast-like discussion between two people about the content. It is incredible. (See our tests below.)

I uploaded a 40-page, highly technical academic paper, and within a few minutes, there was an 11-minute discussion about it, with jokes, slang, and apparent interest in the topic. The voices sound natural and conversational. If you have a 20-minute commute, you could upload a couple of complex SIGGRAPH papers and listen to your own personal podcast about the papers on your way to work. It is a jaw-dropping next step in conversational agents and LLMs.

Will it replace podcasting? Well, as brilliant as it is, it does not generate insightful opinions about the value or applicability of the technology; it does not discuss whether this will be successful or require further development before it is adopted. It only discusses the PDF from the perspective of the paper itself, augmented by the LLM's general understanding of the topic. It is also not guaranteed to be accurate, it does not have any real understanding of the content, and it is not significantly more clever than ChatGPT or any other LLM.

It seems to be much like RAG + LLM. Retrieval-augmented generation (RAG) is an AI framework for improving the quality of LLM-generated responses by grounding the model on external sources of knowledge. Vectors are produced that help focus the LLM's knowledge, so the system can use a general LLM but still give localised, specialised responses.
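To make the RAG idea concrete, here is a minimal sketch in Python. It is not NotebookLM's implementation, which this article only guesses at; it simply illustrates the general pattern of turning text into vectors, retrieving the passages closest to a question, and grounding a prompt in them. Everything in it is an assumption made for illustration: the function names (`embed`, `cosine`, `retrieve`, `build_prompt`), the sample `CHUNKS`, and the use of plain word counts as a stand-in for a learned embedding model. No LLM is actually called.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector.

    Assumption: a real RAG system would use a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are closest to the question's vector."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Ground the prompt in retrieved text; a real system would send this to an LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


# Hypothetical chunks standing in for an uploaded document.
CHUNKS = [
    "The paper proposes a neural renderer for digital human faces.",
    "Training used 40 hours of captured facial performance data.",
    "The authors thank the conference reviewers for their feedback.",
]

if __name__ == "__main__":
    print(build_prompt("How many hours of training data were captured?", CHUNKS, k=1))
```

The point of the design is that the model never needs to "know" the document in advance: only the retrieved passages are placed in front of it, which is what keeps the responses localised to the uploaded material.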
In the case of NotebookLM, the grounded output is then filtered and presented via a conversation between two inferred commentators.

Test Drive (What NotebookLM said about this article)

Summary: The source discusses a new AI tool called NotebookLM, which uses a large language model (LLM) to summarise and discuss scientific research papers in a conversational format. It compares this tool to other AI frameworks like RAG (Retrieval-Augmented Generation) and explores potential impacts on the VFX industry. While recognising the potential for disruption, the source argues that these technologies may create new opportunities by enabling technical artists to better understand complex subjects and lead to the creation of novel visual experiences. The author emphasizes the need for VFX professionals to adapt and leverage these advancements to ensure their continued relevance and value.

Audio Test Drive (This is NotebookLM discussing the article)

Here is a NotebookLM conversation audio version. Note that it made a mistake in the first minute regarding SIGGRAPH, but this software is labelled as Experimental.

https://www.fxguide.com/wp-content/uploads/2024/09/NotebookMLFXG.m4a

Test Drive (What NotebookLM said about my 40-page academic article)

https://www.fxguide.com/wp-content/uploads/2024/09/ISR.m4a

Impact for VFX?

The two voices sound remarkably natural, insanely so. Given the current trajectory of AI, we can only be a few beats away from you uploading audio and having voice-cloned versions so that these base responses could sound like you, your partner, or your favourite podcaster. The technology is presented by Google as an AI collaborative virtual research assistant. After all, the rate of essential advances coming out in this field alone makes keeping up to date feel impossible, so a little AI help sounds sensible, if not necessary.

So why does this matter for VFX?
Is this the dumbing down of knowledge into Knowledge McNuggets, or is it a way to bridge complex topics so anyone can gain introductory expertise on even the most complex subject?

Apart from the obvious use of making complex subjects more accessible for technical artists, how does this impact VFX? I would argue that this, or the latest advances from Runway's video-to-video or Sora's GenAI, all provide massive disruption, but they also invite our industry's creativity for technical problem-solving. GenAI videos are not engaging dramas or brilliant comedies. Video inference is hard to direct and complex to piece into a narrative. And NotebookLM will be hard-pushed to be as engaging as any two good people on a podcast. But they are insanely clever new technologies, so they invite people like VFX artists to make the leap from technology demos to sticky, engaging real-world use cases. My whole career, I have seen tech at conferences and then said to friends later, "I can't wait to see how ILM/Framestore/Wētā FX will use that tech and make something brilliant to watch." As an industry, we are suffering massive reductions in production volume that are hurting many VFX communities. I don't think this is due to AI, but in parallel to those structural issues, we need to find ways to make this tech useful. At the moment, it is stunningly surprising and often cool, but how do we use this to create entirely new viewer experiences that people want?

It is not an easy problem to solve, but viewed as input technology and not the final solution, many of these new technologies could create jobs. I don't believe AI will generate millions of new Oscar-level films, but I also don't believe it will be the death of our industry. Five years ago, it was predicted we'd all be in self-driving cars by now. It has not happened.
Four years ago, radiologists were all going to be out of a job, and so it goes.

If we assume NotebookLM is both an incredibly spectacular jump in technology and not going to replace humans, what could you use it for? What powerful user experiences could you build with it? Theme park and location-based entertainment? AVP Sport agents/avatars? A new form of gaming? A dog-friendly training tool?

AI is producing incredible affordances in visual and creative domains. Why can't the visual effects industry be the basis of a new Visual AI industry that takes this tech and really makes it useful for people?