VENTUREBEAT.COM
L'Oréal Cell BioPrint analyzes your skin in five minutes
L'Oréal Groupe announced the L'Oréal Cell BioPrint at CES 2025, a hardware device that provides customized skin analysis in just five minutes.
-
WWW.MARKTECHPOST.COM
VITA-1.5: A Multimodal Large Language Model that Integrates Vision, Language, and Speech Through a Carefully Designed Three-Stage Training Methodology

The development of multimodal large language models (MLLMs) has brought new opportunities in artificial intelligence. However, significant challenges persist in integrating visual, linguistic, and speech modalities. While many MLLMs perform well with vision and text, incorporating speech remains a hurdle. Speech, a natural medium for human interaction, plays an essential role in dialogue systems, yet the differences between modalities (spatial versus temporal data representations) create conflicts during training. Traditional systems relying on separate automatic speech recognition (ASR) and text-to-speech (TTS) modules are often slow and impractical for real-time applications.

Researchers from NJU, Tencent Youtu Lab, XMU, and CASIA have introduced VITA-1.5, a multimodal large language model that integrates vision, language, and speech through a carefully designed three-stage training methodology. Unlike its predecessor, VITA-1.0, which depended on external TTS modules, VITA-1.5 employs an end-to-end framework, reducing latency and streamlining interaction. The model incorporates vision and speech encoders along with a speech decoder, enabling near real-time interactions. Through progressive multimodal training, it addresses conflicts between modalities while maintaining performance. The researchers have also made the training and inference code publicly available, fostering innovation in the field.

Technical Details and Benefits

VITA-1.5 is built to balance efficiency and capability. It uses vision and audio encoders, employing dynamic patching for image inputs and downsampling techniques for audio. The speech decoder combines non-autoregressive (NAR) and autoregressive (AR) methods to ensure fluent and high-quality speech generation.

The training process is divided into three stages:

1. Vision-Language Training: This stage focuses on vision alignment and understanding, using descriptive captions and visual question answering (QA) tasks to establish a connection between the visual and linguistic modalities.
2. Audio Input Tuning: The audio encoder is aligned with the language model using speech-transcription data, enabling effective audio input processing.
3. Audio Output Tuning: The speech decoder is trained with text-speech paired data, enabling coherent speech outputs and seamless speech-to-speech interactions.

These strategies effectively address modality conflicts, allowing VITA-1.5 to handle image, video, and speech data seamlessly. The integrated approach enhances its real-time usability, eliminating common bottlenecks in traditional systems.

Results and Insights

Evaluations of VITA-1.5 on various benchmarks demonstrate its robust capabilities. The model performs competitively in image and video understanding tasks, achieving results comparable to leading open-source models. For example, on benchmarks like MMBench and MMStar, VITA-1.5's vision-language capabilities are on par with proprietary models like GPT-4V. Additionally, it excels in speech tasks, achieving low character error rates (CER) in Mandarin and low word error rates (WER) in English. Importantly, the inclusion of audio processing does not compromise its visual reasoning abilities. The model's consistent performance across modalities highlights its potential for practical applications.

Conclusion

VITA-1.5 represents a thoughtful approach to resolving the challenges of multimodal integration. By addressing conflicts between vision, language, and speech modalities, it offers a coherent and efficient solution for real-time interactions. Its open-source availability ensures that researchers and developers can build upon its foundation, advancing the field of multimodal AI. VITA-1.5 not only enhances current capabilities but also points toward a more integrated and interactive future for AI systems.

Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project. The post VITA-1.5: A Multimodal Large Language Model that Integrates Vision, Language, and Speech Through a Carefully Designed Three-Stage Training Methodology appeared first on MarkTechPost.
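The three-stage progressive schedule described in the article can be pictured as a freeze/unfreeze plan: each stage updates only the modules it targets while the rest stay fixed. This is a minimal illustrative sketch, not the authors' actual code; the module names, stage labels, and data descriptions are assumptions made for clarity.

```python
# Illustrative freeze/unfreeze plan for a VITA-1.5-style three-stage schedule.
# Module and stage names are hypothetical stand-ins, not the released code.

STAGES = [
    # (stage name, modules updated in this stage, data used)
    ("vision-language", {"vision_encoder", "llm"}, "captions + visual QA"),
    ("audio-input",     {"audio_encoder"},         "speech-transcription pairs"),
    ("audio-output",    {"speech_decoder"},        "text-speech pairs"),
]

def trainable_modules(stage_name):
    """Return the set of modules unfrozen in the given stage."""
    for name, modules, _ in STAGES:
        if name == stage_name:
            return modules
    raise ValueError(f"unknown stage: {stage_name}")

def run_schedule(train_fn):
    """Run the stages in order, passing each one its unfrozen modules and data."""
    for name, modules, data in STAGES:
        train_fn(stage=name, unfrozen=modules, data=data)
```

In a real training loop, `train_fn` would set `requires_grad` on the listed modules and train on the stage's data; the point of the staged order is that speech alignment is learned only after the vision-language connection is in place, which is how the article says modality conflicts are avoided.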
-
WWW.MARKTECHPOST.COM
AutoGraph: An Automatic Graph Construction Framework based on LLMs for Recommendation

Enhancing user experiences and boosting retention with recommendation systems is an effective and ever-evolving strategy used by many industries, such as e-commerce, streaming services, and social media. These systems must analyze complex relationships between users, items, and contextual factors to suggest precisely what the user might want. However, existing recommendation systems are static, relying on substantial historical data to build connections effectively. In cold-start scenarios, which are heavily prevalent, mapping the relationships becomes impossible, weakening these systems even further. Researchers from Shanghai Jiao Tong University and Huawei Noah's Ark Lab have introduced AutoGraph to address these issues. This framework automatically builds graphs that incorporate dynamic adjustments and leverages LLMs for better contextual understanding.

Graph-based recommendation systems are commonly employed. Current systems, however, require people to manually define the features and their connections in a graph, which consumes much time. Rules are also set beforehand, limiting how these graphs can adapt. Incorporating unstructured data, which potentially holds rich semantic information about user preferences, is another significant issue. Therefore, there is a need for a new method that can resolve data sparsity, the failure to capture nuanced relationships, and the inability to adjust to user preferences in real time.

AutoGraph is an innovative framework that enhances recommendation systems by leveraging Large Language Models (LLMs) and Knowledge Graphs (KGs). Its methodology rests on three components:

1. Utilization of pre-trained LLMs: The framework leverages pre-trained LLMs to analyze user input. It can draw relationships from the analysis of natural language, even those that are apparently hidden.
2. Knowledge graph construction: After relationship extraction, the LLMs generate graphs, which can be seen as structured representations of user preferences. Algorithms then optimize these graphs, removing less relevant connections to maximize the quality of the graph as a whole.
3. Integration with Graph Neural Networks (GNNs): The final step is to integrate the constructed knowledge graph with standard Graph Neural Networks. GNNs can provide more accurate recommendations by using both node features and graph structure, making them sensitive to personal preferences as well as broader trends among users.

To evaluate the proposed framework's efficacy, the authors benchmarked it against traditional recommendation techniques on e-commerce and streaming-service datasets. There was a significant gain in recommendation precision, showing that the framework is capable of giving relevant recommendations. The method also scaled better to large datasets and demonstrated reduced computational requirements compared to traditional graph-construction approaches. Process automation, along with the use of advanced algorithms, lowered resource usage without compromising the quality of the results.

The AutoGraph framework represents a significant leap forward in recommendation systems. Automating graph construction with LLMs addresses long-standing challenges in scalability, adaptability, and contextual awareness. The framework's success demonstrates the transformative potential of integrating LLMs into graph-based systems, setting a new benchmark for future research and applications in personalized recommendations. By automating the construction of dynamic, context-aware recommendation graphs, AutoGraph opens new avenues for personalized user experiences in diverse domains. This innovation highlights the growing role of LLMs in addressing real-world challenges, revolutionizing how we approach recommendation systems.

Check out the Paper. All credit for this research goes to the researchers of this project.

Afeerah Naseem is a consulting intern at Marktechpost. She is pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is passionate about data science and fascinated by the role of artificial intelligence in solving real-world problems.
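The extract, build, and prune steps described above can be illustrated with a toy sketch: an LLM (stubbed out here) emits weighted relation triples, the triples become a graph, and low-confidence edges are pruned before the graph would be handed to a GNN. The triples, scores, and threshold are invented stand-ins; the real framework prompts a pre-trained LLM and optimizes the graph with learned criteria.

```python
from collections import defaultdict

def extract_triples(text):
    """Stub standing in for an LLM call; returns (head, relation, tail, score).

    Hypothetical fixed output for illustration; a real system would prompt
    a pre-trained LLM with the input text.
    """
    return [
        ("user_1", "likes", "item_a", 0.9),
        ("user_1", "viewed", "item_b", 0.4),
        ("item_a", "similar_to", "item_b", 0.7),
    ]

def build_graph(triples, min_score=0.5):
    """Build an adjacency map, keeping only edges above the confidence threshold."""
    adj = defaultdict(list)
    for head, rel, tail, score in triples:
        if score >= min_score:
            adj[head].append((rel, tail, score))
    return dict(adj)

graph = build_graph(extract_triples("user interaction log ..."))
# The low-confidence "viewed" edge (0.4) is pruned; "likes" and "similar_to" survive.
```

The resulting adjacency map is the kind of structure a GNN layer would consume: node features plus the pruned edge set, so that both personal preferences and item-item similarity can propagate during message passing.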
-
WWW.IGN.COM
CES 2025: Hyperkin Unveils DualSense-Style Xbox Controller Called The Competitor

As part of CES 2025, peripheral manufacturer Hyperkin revealed a new version of The Competitor, its wired pro-style gamepad for Xbox consoles and PC. It's geared towards competitive-minded players (hence the name), and it shares similarities with the PlayStation 5's DualSense, which we consider one of the best controllers for PC as well.

Like the previous model of The Competitor, a major design shift from traditional Xbox controllers is the stick placement: the symmetrical positioning of the analog sticks mimics what you find on PlayStation controllers. Whether or not that's a good thing will boil down to preference, but a notable improvement over the stock Xbox gamepad is that the sticks are Hall Effect as opposed to regular analog. Hall Effect parts are magnetic and create a consistent resistance, but more importantly, they have better durability and aren't prone to stick drift (which can happen on the Switch Joy-Con and PlayStation 5 DualSense). The triggers are also Hall Effect to help with precision when applying pressure. These are features found on the previous iteration of The Competitor, so the big change is in the ergonomics, which draw from the DualSense, especially in the white-black color scheme.

The directional pad clearly mimics its PlayStation inspiration, with each direction separated and the buttons pointed inward. Both the triggers and the bumpers are shaped accordingly, and there's a mic mute button at the lower center as well. It also comes with two programmable back buttons, a major feature of third-party competitive controllers and high-end first-party offerings like the DualSense Edge and Xbox Elite Series 2.

Hyperkin has historically been known for its catalog of retro gaming accessories and systems, but it has also been making peripherals for current platforms. When it comes to Xbox, Hyperkin made a splash by bringing back The Duke (the original Xbox controller) and The Duchess (the Xbox S controller) for modern systems. Hyperkin has yet to reveal a release date or price for the updated version of The Competitor.

We'll be getting our hands on this new model of The Competitor from Hyperkin at CES 2025, so stay tuned for our impressions. If you're looking to upgrade your gamepad, be sure to check out our current roundup of the best controllers for Xbox, PC, and PS5. For all the important gaming news at the biggest tech convention of the year, check out our roundup of everything you need to know about CES 2025.
-
WWW.CNET.COM
This Water Bottle Cap Is Like a Personal SodaStream You Can Take on the Go. It Sparkles

When the water went into the bottle, it was flat, tasting like boring old tap water. When it came out of the water bottle, it sparkled. Not in that highly carbonated, almost spicy way a fresh can of seltzer tastes, but a milder sizzle, like that same can after you've been sipping from it for 10 minutes. And here's the most interesting part: the transformation happened in the bottle, not in a big countertop soda machine.

The Roam SodaTop is like a handheld SodaStream. The cap is where the magic happens. A tight seal and a small CO2 cartridge are the key elements in turning boring still water into something a little more exciting. You can screw the top on a water bottle Roam sells itself, or on a compatible bottle (say, a HydroFlask).

Roam expects to start selling the SodaTop directly from its website in the first quarter of 2025, for $50. The CO2 cartridges are expected to cost about 70 cents apiece.

Handheld carbonation

The process is simple. You insert a CO2 cartridge into the SodaTop's special lid, release the carbon dioxide into the bottle, let out the excess gas, and drink. Each cartridge is good for a single liter of water.

Roam says its stainless steel water bottles are designed to keep drinks cold for 24 hours. If you already have a SodaStream, Roam also has an adapter cap, which will sell for $13, that lets you carbonate your drinks from your countertop machine.

Flavor is in the air

The only thing you should carbonate in the bottle is water, but that doesn't mean you can't add flavor. Flavored syrups or other drink mixes can be added after the CO2 if you want something that tastes like, well, something. But Roam is also developing its own ways to add variety.

CEO Sunjay Guleria said the company is working on flavored CO2 cartridges, which will contain the essence of a flavor and allow you to make your own zero-calorie sparkling water with just plain water and the cartridge. Imagine ginger, lemon, yuzu, grapefruit, or white peach flavors: natural or flavored sparkling water fresh in your own bottle. That's refreshing.
-
WWW.CNET.COM
Today's NYT Connections: Sports Edition Hints and Answers for Jan. 6, #105

Looking for the most recent regular Connections answers? Click here for today's Connections hints, as well as our daily answers and hints for The New York Times Mini Crossword, Wordle and Strands puzzles.

Connections: Sports Edition has a lot of clues today that look like random groupings of letters, not real words. They sure look strange out of context. Read on for hints and answers for today's Connections: Sports Edition puzzle.

For now, the game is in beta, which means the Times is testing it out to see if it's popular before adding it to the site's Games app. You can play it daily for free for now, and then we'll have to see if it sticks around.

Read more: NYT Has a Connections Game for Sports Fans. I Tried It

Hints for today's Connections: Sports Edition groups

Here are four hints for the groupings in today's Connections: Sports Edition puzzle, ranked from the easiest yellow group to the tough (and sometimes bizarre) purple group.

Yellow group hint: Ouch!
Green group hint: Like HR and RBI
Blue group hint: Lambeau, or U.S. Bank
Purple group hint: Hoops helper

Answers for today's Connections: Sports Edition groups

Yellow group: Worn after an injury
Green group: Baseball stat abbreviations
Blue group: NFL stadiums
Purple group: NBA coaches

Read more: Wordle Cheat Sheet: Here Are the Most Popular Letters Used in English Words

What are today's Connections: Sports Edition answers?

The completed NYT Connections: Sports Edition puzzle for Jan. 6, 2025. NYT/Screenshot by CNET

The yellow words: The theme is worn after an injury. The four answers are brace, cast, sling and splint.
The green words: The theme is baseball stat abbreviations. The four answers are AB, LOB, WAR and WHIP.
The blue words: The theme is NFL stadiums. The four answers are Allegiant, NRG, SoFi and Soldier.
The purple words: The theme is NBA coaches. The four answers are Finch, Lue, Nurse and Rivers.
-
WWW.IAMAG.CO
The Art of Karl Sisson
-
WWW.IAMAG.CO
The Art of Henu Caulfield Joo
-
WWW.VG247.COM
It makes total sense that Elden Ring: Nightreign won't be bringing back one of the Souls games' most recognisable features

From Demon's Souls all the way to Elden Ring, one feature in particular has come to define FromSoftware's games, but we're fine letting it go for Nightreign.

Image credit: Bandai Namco

News by Sherif Saed, Contributing Editor. Published on Jan. 6, 2025

When Elden Ring: Nightreign was announced just a few weeks ago at The Game Awards, much of the discussion had a lot to do with how surprising the reveal was. This is, after all, an online co-op game from a studio mostly known for its single-player experiences.

While many of FromSoftware's games allow players to team up in co-op, the games are chiefly designed to be played solo. There was, however, one big mechanic that made those journeys a little less solitary, and it's not one we're going to see again in Nightreign.

One of the most beloved features in most Souls and Soulslike games made by FromSoftware is the ability to leave messages for other players. The mechanic has been utilised in many different ways, not all of them useful. Typically, players could leave messages to admire a certain view, comment on a moment that took place at the spot where the message was left, point players to hidden treasures or illusory walls, or even just troll and mislead them.

The messaging system only consists of certain words, so players had to get a little clever with their phrasing, which was responsible for giving us plenty of jokes, many of which were naughty. Unfortunately, that entire experience won't be found in Elden Ring: Nightreign.

No funny business! | Image credit: FromSoftware

Game director Junya Ishizaki confirmed this in a recent IGN Japan interview, and his reasoning actually makes a lot of sense. Ishizaki explained that the relatively short session time of 40 minutes simply doesn't allow for players to stop and read or write messages. Nightreign is designed to offer a condensed RPG experience that's made up almost entirely of combat, so there's very little use for messages.

Indeed, considering the one-and-done style of the game, persistent messages that get liked and disliked by other players don't make much sense. What is returning, however, are player ghosts, which is something, at least. Another notable absence is that resting at a Site of Grace will not cause defeated (regular) enemies to respawn, which removes an element of risk from the decision to rest, so it'll be interesting to see how that factors into the bigger picture in terms of challenge.

These are all interesting design decisions, and they suggest FromSoftware may have been inspired by the many mods and randomisers the community has created for its various games over the years.

Nightreign is due out sometime this year on PC, PS4, PS5, Xbox One, and Xbox Series X/S. Our next big Nightreign reveal will likely arrive with the network test that's being held in February.