GAMERANT.COM
Best Christmas Moments In Harry Potter
Buckle up! With blankets, hot chocolate, and a little bit of mischief, Christmas is coming! Nothing feels more Christmas-y than re-watching the entire Harry Potter saga. Although it is not strictly a saga about Christmas, watching the golden trio's attempts to defeat the Dark Lord when the weather gets chilly has become a staple in many households.
-
GAMEDEV.NET
Introducing My Game! [Devlog #1 - 12/24]
Production of my game, Cellar Worlds, has officially begun, and this first devlog shares details about the game in the making.

Cellar Worlds is an Action/Adventure RPG inspired by the 3DS game Ever Oasis.

In a small town in the middle of nowhere, a dark force transformed everyone's cellars into doorways to other worlds. Monsters from those worlds infested the town and drove people from their homes. You, being stubborn and not wanting to leave your dear home, fight b
-
WWW.TECHRADAR.COM
Obscure Chinese PC vendor has the world's first Qualcomm PC out of the gate: QS1 Pro runs Windows 11 Pro, has Wi-Fi 7 and up to 2TB SSD
This obscure Chinese PC vendor has the world's first Qualcomm PC on offer.
-
WWW.FASTCOMPANY.COM
TikTok is full of bogus, potentially dangerous medical advice
TikTok is the new doctor's office, quickly becoming a go-to platform for medical advice. Unfortunately, much of that advice is sketchy at best.

A new report by the healthcare software firm Tebra found 45% of medical advice on TikTok to be false or misleading. Some categories were worse offenders than others: TikTok videos about alternative medicine have the most inaccuracies, with 67% of posts flagged as misleading. (See: putting onions in your socks to cure a cold, or sticking garlic cloves up your nose for a sinus infection.) Women's health and general health topics weren't much better, with 54% of advice in each category being inaccurate.

Mental health content on TikTok had the lowest misinformation rate at 31%. Wellness and self-care videos were slightly worse at 37%, while advice about chronic illness was false or misleading 39% of the time. More views also don't equate to more reliable information: videos with more than 5 million views were found to be 14% more likely to spread false information than those with fewer than 1 million views.

Among the misleading claims on TikTok, the three most common are quick-fix weight-loss tricks, misinformation about vaccines' long-term effects on fertility, and cure-all daily supplements. While some creators use scare tactics to discourage actions like wearing masks, getting vaccinated, or using birth control, others, posing as medical experts, cash in by promoting diets, supplements, and treatments that are ineffective at best and harmful at worst.

With 17% of Americans trusting TikTok as much as they do doctors, and 7% trusting the platform even more than they do medical professionals, the consequences are potentially serious. Given that nearly half of U.S. TikTok users are under 30, the app becomes a perfect storm for misleading advice targeting a young and impressionable audience. There's also no easy way to verify whether these so-called experts have the credentials they claim, leaving users to rely on unvetted information.

Consumers who blindly follow unverified health advice online are setting themselves up for trouble. The best advice? Trust your instincts. If a health claim sounds too good to be true, it probably is.
-
WWW.YANKODESIGN.COM
This Custom Tiny Home Is Like A Luxurious Mansion On Wheels
Modern Tiny Living designed a custom tiny home called Serenity. The tiny house features an extremely clever layout that deserves to be appreciated. The tiny home maker essentially created a mansion on wheels, taking micro-home living to a whole new level. It serves as a comfortable dwelling and a dream home for the owner, who wanted to have outdoor adventures and a home office. Serenity merges cozy accommodation with a private home office and an outdoor shower, making it a unique and well-equipped home. It offers the perks of tiny homes on wheels, as well as the comfort of a full-time premium residence.

Design: Modern Tiny Living

The house measures a generous 28 feet. It isn't exactly a super tiny home, but it isn't an extremely large one either. It is a customized and modified version of Modern Tiny Living's original Point model, but with an extra eight feet, making it spacious indeed. It features an additional storage unit at the end, which makes it look even longer, while an arched front section supports a custom-designed elevated social area. The home is equipped with only one loft bedroom, which holds a king-sized bed and includes multiple large windows. You can access the bedroom through a sturdy staircase with extra-wide treads and a full-size handrail. The room also includes a safety railing, which is an extension of the handrail.

The bedroom also includes a tiny door that leads to a small corridor placed above the home office. The bedroom and the home office are therefore connected, which is quite unusual and fascinating. The home office is equipped with a concrete desk and plenty of storage, including built-in shelves. It can also be accessed from the outside, so one can enter the office from either inside or outside the house. The opposite side of the house accommodates an elevated social area, which serves as a seating area, a guest bed, and a storage solution. This feature is quite unique, and also includes a row of windows for some surreal views.

This social section can accommodate a large group of people, and it also includes spacious drawers all around. There is some sub-floor storage in the middle as well, and a built-in bookcase provides another storage section. The sofa can be converted into a guest bed if needed. The kitchen is quite spacious and is equipped with a premium concrete countertop that doubles as a modern snack bar. It has plenty of storage in the form of cabinets, drawers, and overhead cupboards. There is also space for full-size appliances. This mansion on wheels was priced at around $105,000 five years ago, and it showcases how clever customization can turn an ordinary tiny home into a luxurious abode.
-
VENTUREBEAT.COM
Maximum Entertainment divests Merge Games assets to Silver Lining
Maximum Entertainment announced it has divested assets of the former Merge Games to new publisher Silver Lining Interactive.
-
WWW.MARKTECHPOST.COMThis AI Paper by The Data Provenance Initiative Team Highlights Challenges in Multimodal Dataset Provenance, Licensing, Representation, and Transparency for Responsible DevelopmentThe advancement of artificial intelligence hinges on the availability and quality of training data, particularly as multimodal foundation models grow in prominence. These models rely on diverse datasets spanning text, speech, and video to enable language processing, speech recognition, and video content generation tasks. However, the lack of transparency regarding dataset origins and attributes creates significant barriers. Using training data that is geographically and linguistically skewed, inconsistently licensed, or poorly documented introduces ethical, legal, and technical challenges. Understanding the gaps in data provenance is essential for advancing responsible and inclusive AI technologies.AI systems face a critical issue in dataset representation and traceability, which limits the development of unbiased and legally sound technologies. Current datasets often rely heavily on a few web-based or synthetically generated sources. These include platforms like YouTube, which accounts for a significant share of speech and video datasets, and Wikipedia, which dominates text data. This dependency results in datasets failing to represent underrepresented languages and regions adequately. In addition, the unclear licensing practices of many datasets create legal ambiguities, as more than 80% of widely used datasets carry some form of undocumented or implicit restrictions despite only 33% being explicitly licensed for non-commercial use.Attempts to address these challenges have traditionally focused on narrow aspects of data curation, such as removing harmful content or mitigating bias in text datasets. However, such efforts are typically limited to single modalities and lack a comprehensive framework to evaluate datasets across modalities like speech and video. 
Platforms hosting these datasets, such as HuggingFace or OpenSLR, often lack the mechanisms to ensure metadata accuracy or enforce consistent documentation practices. This fragmented approach underscores the urgent need for a systematic audit of multimodal datasets that holistically considers their sourcing, licensing, and representation.To close this gap, researchers from the Data Provenance Initiative conducted the largest longitudinal audit of multimodal datasets, examining nearly 4,000 public datasets created between 1990 and 2024. The audit spanned 659 organizations from 67 countries, covering 608 languages and nearly 1.9 million hours of speech and video data. This extensive analysis revealed that web-crawled and social media platforms now account for most training data, with synthetic sources also rapidly growing. The study highlighted that while only 25% of text datasets have explicitly restrictive licenses, nearly all content sourced from platforms like YouTube or OpenAI carries implicit non-commercial constraints, raising questions about legal compliance and ethical use.The researchers applied a meticulous methodology to annotate datasets, tracing their lineage back to sources. This process uncovered significant inconsistencies in how data is licensed and documented. For instance, while 96% of text datasets include commercial licenses, over 80% of their source materials impose restrictions that are not carried forward in the datasets documentation. Similarly, video datasets highly depended on proprietary or restricted platforms, with 71% of video data originating from YouTube alone. Such findings underscore the challenges practitioners face in accessing data responsibly, particularly when datasets are repackaged or re-licensed without preserving their original terms.Notable findings from the audit include the dominance of web-sourced data, particularly for speech and video. 
YouTube emerged as the most significant source, contributing nearly 1 million hours to each speech and video content, surpassing other sources like audiobooks or movies. Synthetic datasets, while still a smaller portion of overall data, have grown rapidly, with models like GPT-4 contributing significantly. The audit also revealed stark geographical imbalances. North American and European organizations accounted for 93% of text data, 61% of speech data, and 60% of video data. In comparison, regions like Africa and South America collectively represented less than 0.2% across all modalities.Geographical and linguistic representation remains a persistent challenge despite nominal increases in diversity. Over the past decade, the number of languages represented in training datasets has grown to over 600, yet measures of equality in representation have shown no significant improvement. The Gini coefficient, which measures inequality, remains above 0.7 for geographical distribution and above 0.8 for language representation in text datasets, highlighting the disproportionate concentration of contributions from Western countries. For speech datasets, while representation from Asian countries like China and India has improved, African and South American organizations continue to lag far behind.The research provides several critical takeaways, offering valuable insights for developers and policymakers:Over 70% of speech and video datasets are derived from web platforms like YouTube, while synthetic sources are becoming increasingly popular, accounting for nearly 10% of all text data tokens.While only 33% of datasets are explicitly non-commercial, over 80% of source content is restricted. This mismatch complicates legal compliance and ethical use.North American and European organizations dominate dataset creation, with African and South American contributions at less than 0.2%. 
Linguistic diversity has grown nominally but remains concentrated in many dominant languages.
GPT-4, ChatGPT, and other models have significantly contributed to the rise of synthetic datasets, which now represent a growing share of training data, particularly for creative and generative tasks.
The lack of transparency and persistent Western-centric biases call for more rigorous audits and equitable practices in dataset curation.

In conclusion, this comprehensive audit sheds light on the growing reliance on web-crawled and synthetic data, the persistent inequalities in representation, and the complexities of licensing in multimodal datasets. By identifying these challenges, the researchers provide a roadmap for creating more transparent, equitable, and responsible AI systems. Their work underscores the need for continued vigilance and measures to ensure that AI serves diverse communities fairly and effectively. This study is a call to action for practitioners, policymakers, and researchers to address the structural inequities in the AI data ecosystem and prioritize transparency in data provenance.

All credit for this research goes to the researchers of this project.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.
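The Gini coefficients cited in the article (above 0.7 for geography, above 0.8 for languages) can be made concrete with a small calculation. The following is a minimal sketch of the textbook Gini formula applied to per-group counts; the article does not say exactly how the audit computed its figures, and the sample counts below are invented for illustration only.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means every group contributes equally; values approaching 1.0
    mean contributions are concentrated in very few groups.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Textbook formula on ascending-sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical dataset counts per region (NOT taken from the audit).
hypothetical_counts = [900, 50, 30, 15, 5]
print(f"Gini for the hypothetical counts: {gini(hypothetical_counts):.2f}")
```

With one region holding 90% of the hypothetical datasets, the coefficient lands in the same range the article reports, which gives a feel for how concentrated the real distributions are.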
-
TOWARDSAI.NET
AI in Medical Imaging: A Life-Saving Revolution or Ethical Minefield?
Last Updated on December 24, 2024 by Editorial Team
Author(s): Mukundan Sankar
Originally published on Towards AI.

Artificial intelligence (AI) is shaking up nearly every aspect of how we work, including the very core of medical imaging. Picture a machine that analyzes a CT scan and spots early signs of cancer before even the most skilled human eye can. Sounds impossible, doesn't it?

But behind the glossy headlines and the marvels of technology lies a darker, messier reality. We need to talk about this now!

Because what's the cost of these radical shifts that AI brings? And I'm not just talking dollars here. I'm talking about the ethics of AI in medical imagery, where lives are literally on the line. Let me break it down, because this isn't just an issue for tech nerds and medical professionals. This is about all of us, and it's happening right now.

AI's impact can be felt in every field, including medical imaging, which it is revolutionizing in ways we couldn't have imagined a decade ago. Machines now accurately read and analyze X-rays, MRIs, and CT scans. For example, a recent UCLA study reported that AI detected prostate cancer with an 84% accuracy rate, while human doctors achieved 67%.
-
TOWARDSAI.NET
TAI 131: OpenAI's o3 Passes Human Experts; LLMs Accelerating With Inference Compute Scaling
Author(s): Towards AI Editorial Team
Originally published on Towards AI.

What happened this week in AI by Louie

OpenAI wrapped up its "12 Days of OpenAI" campaign and saved the best till last with the reveal of its o3 and o3-mini reasoning models. These models are successors to the o1 series and are debatably the largest step-change improvement yet in LLM capabilities on complex tasks, for the first time eclipsing human experts in many domains. The o3 release drowned out the otherwise significant launch of Google Gemini's 2.0 Flash Thinking Mode model, its first reasoning model (in the style of o1/o3), which, unlike OpenAI's models, doesn't hide its thinking tokens.

There is a huge amount to unpack in the o3 release: the model sailed past human expert scores on many key advanced benchmarks, including coding, mathematics, and PhD science. Perhaps most noteworthy was the breakthrough on the ARC-AGI benchmark (where LLMs have traditionally failed and only achieved average scores even with heavy scaffolding and brute force). For example, o3 (low efficiency) achieved 87.5%, vs. 32% for o1 just a week earlier and 5% for GPT-4o in May. This score is considered human-level, further fueling debates over whether o3 edges closer to Artificial General Intelligence (AGI). Some of the best scores do come at a huge cost, however: o3 in low-efficiency mode (1,024 samples) costs around $3,400 per task, 170x the $20 for o3 in high-efficiency mode (6 samples, which achieved 75.7%), and vs. ~$3 for o1.

On the GPQA Diamond test, designed for PhD-level science questions, o3 scored 87.7%, compared to the 78% achieved by o1. For context, PhD holders with internet access typically score between 34% (outside their specialty) and 81% (within their domain). In coding, o3's Elo rating of 2727 on Codeforces puts it in the 99.95th percentile of competitive programmers, far exceeding the reach of most human professionals. Mathematics is another area where o3 shines, achieving 96.7% accuracy on the American Invitational Mathematics Examination (AIME), up from o1's 83.3% and just 13.4% for 4o only months earlier.

This release didn't only come with a huge cost (a 1,000x escalation for some tasks) but also the promise of huge cost savings! Due to success with model distillation and other techniques, o3-mini outperforms the much larger o1 model released just last week on many coding and maths tasks. For example, o3-mini with medium compute achieved a much stronger Codeforces Elo of 1997 vs. 1891 for o1, but at what we eyeball as a ~70-80% lower total cost.

How do the models work? OpenAI still hasn't disclosed that they use reinforcement learning to improve the models' reasoning during training. However, employees have posted that they are still "just" LLMs and use autoregression. We think the model is trained to be highly efficient at chain-of-thought reasoning, exploring the most likely paths and realizing when it has made a mistake. We think the rapid progress in just 3 months between o1 and o3 is likely primarily from using synthetic data from o1's full chain-of-thought thinking tokens to add to the reinforcement learning dataset used for training. On the other hand, we expect the initial o1 mostly used a smaller set of human-expert-commissioned reasoning examples (which are missing from pre-training because people almost never type out their full internal monologue and reasoning process and instead skip to the answers!). It is also possible that o3 was built using a different, more advanced base foundation model (o1 likely used 4o), perhaps GPT-4.5 or a checkpoint of the rumored Orion or GPT-5 model, leading to additional benefits.

One interesting note on the new regime of inference-time compute scaling is that OpenAI appears to be scaling thinking tokens both in series (up to ~100k reasoning tokens in its context window) and in parallel, with 6 (high efficiency) or 1,024 (low efficiency) samples used in the ARC-AGI evaluation. It is unclear how the best answer is chosen from these; it could be simple majority voting, but more likely there is complexity and extra secret sauce here in how the best samples are automatically and rapidly searched, evaluated, and chosen. We think it is possible some form of this parallel scaling could also be taking place in the o1-Pro model (available within the $200/month ChatGPT Pro).

[Figure: OpenAI models' rapid breakthroughs on complex benchmarks this year. Source: Towards AI, OpenAI disclosures.]

The models have not yet been released, and the rollout schedule is still dependent on safety testing. o3-mini is slated for release in late January 2025, with o3 following shortly after. Researchers can apply for early access to test the models, with an application deadline of January 10th, 2025. Pricing has also yet to be announced.

Why should you care?

So what does this all mean? LLMs can now perform to human expert standards at many tasks, and these breakthroughs were achieved at an accelerating pace. Will the inference-time compute scaling paradigm continue to deliver new generations every 3 months, relative to the 1-2 years for the training-time scaling regime? How will these models perform in the real world beyond their benchmarks? Will o3 models rapidly begin to transform the global economy and disrupt huge numbers of jobs, or is the cost too large a bottleneck to adoption? On which tasks will it be worth spending 170x more compute for incrementally better performance (as with ARC-AGI)? Is this model AGI already? Do you need to find a new career?

While we don't think this model is AGI yet (which has wildly differing definitions in any case), we think this model is hugely significant and should be on the front page of all newspapers. It suggests that deep learning and the LLM paradigm don't have any obvious limits. Far from the slowdown and failures of new model generations covered in the media, progress is faster than it has ever been on the most complex benchmarks. My key takeaway is that if we can develop a benchmark or generate a few or a few hundred detailed reasoning examples for a task category of human work, we can solve it together with extra synthetic reasoning data. (This doesn't yet apply to physical labor, but AI-based robotics are also rapidly progressing!) The price of o3 will be a large barrier initially, but we expect large improvements in the cost and particularly the efficiency of running parallel samples. The o3-mini also appears to be a game changer; however, the huge cost savings will likely come at the cost of more narrow capabilities.

To achieve products with high enough reliability and affordability for mass adoption, we still think a large amount of work will be needed from LLM Developers to optimize and customize these models to specific industries and niche tasks, including gathering industry-specific data, creating reasoning data, and creating your own evaluations. With Google Gemini also joining the reasoning model race this week and with open-source reasoning models from Alibaba Qwen and Deepseek in China, we expect competition to drive affordability and developer customization options for these models.
OpenAI has already announced it will release reinforcement learning-based reasoning fine-tuning options, and we think, eventually, there will also be reasoning model distillation options to customize larger models into smaller forms. So there is no better time to convert to become an LLM Developer with our own 80+ lesson Python course and learn to harness these models!Hottest News1. OpenAI Announces OpenAI o3OpenAI announced OpenAI o3, the latest model in its o-Model Reasoning Series. Building on its predecessors, o3 showcases huge leaps in mathematical and scientific reasoning, prompting discussions about its capabilities and constraints.2. xAI Raises $6B Series CElon Musks xAI announced it raised $6 billion in a Series C funding round, bringing its value to more than $40 billion. The company said the funding would be allocated to products and infrastructure, including its Grok AI model and the multibillion-dollar supercomputer site used to train its AI models. The Colossus supercomputer scaled to 100,000 NVIDIA Hopper GPUs in record time and plans to soon add another 100k.3. OpenAI Is Offering 1 Million Free Tokens for GPT-4o and o1A user on X highlighted that OpenAI seems to be offering 1 million free tokens for GPT-4o and o1 if you share your API usage with them for training. Users can get up to 10 million tokens per day on traffic shared with OpenAI on smaller models. This is similar to Google Geminis free tier strategy for its API, where data can be used for training. We think the race for user data has become even more critical given the success of reasoning models where OpenAI could use thinking tokens from user o1 model prompts to expand its reinforcement learning data sets.4. Google Releases Its Own Reasoning AI ModelGoogle has released Gemini 2.0 Flash Thinking Mode, an experimental model trained to generate the thinking process the model goes through as part of its response. Thinking models are available in Google AI Studio and through the Gemini API.5. 
Microsoft AI Research Open-Sources PromptWizardResearchers from Microsoft Research India have developed and open-sourced PromptWizard, an innovative AI framework for optimizing prompts in black-box LLMs. This framework employs a feedback-driven critique-and-synthesis mechanism to iteratively refine prompt instructions and in-context examples, enhancing task performance. PromptWizard operates through two primary phases: a generation phase and a test-time inference phase.6. The Technology Innovation Institute in Abu Dhabi Released the Falcon 3 Family of ModelsThe UAE government-backed Technology Innovation Institute (TII) has announced the launch of Falcon 3, a family of open-source small language models (SLMs) designed to run efficiently on lightweight, single GPU-based infrastructures. Falcon 3 features four model sizes 1B, 3B, 7B, and 10B with base and instruction variants. According to the Hugging Face leaderboard, the models are already outperforming or closely matching popular open-source counterparts in their size class, including Metas Llama and category leader Qwen-2.5.7. Salesforce Drops Agentforce 2.0Salesforce announced Agentforce 2.0: the newest version of Agentforce, the first digital labor platform for enterprises. This release introduces a new library of pre-built skills and workflow integrations for rapid customization, the ability to deploy Agentforce in Slack, and advancements in agentic reasoning and retrieval-augmented generation (RAG).8. Patronus AI Open Sources Glider: A 3B State-of-the-Art Small Language Model (SLM) JudgePatronus AI has introduced Glider, a general-purpose 3.8B evaluation model. This open-source evaluator model provides quantitative and qualitative feedback for text inputs and outputs. It acts as a fast, inference-time guardrail for LLM systems, offering detailed reasoning chains and highlighting key phrases to enhance interpretability. 
Glider is built upon the Phi-3.5-mini-instruct base model and has been fine-tuned on diverse datasets spanning 685 domains and 183 evaluation criteria.Five 5-minute reads/videos to keep you learning1. Alignment Faking in Large Language ModelsAlignment faking is where someone appears to share our views or values but is, in fact, only pretending to do so. A new paper from Anthropics Alignment Science team, in collaboration with Redwood Research, provides the first empirical example of a large language model engaging in alignment faking without having been explicitly trained or instructed to do so.2. AI Safety on a Budget: Your Guide to Free, Open-Source Tools for Implementing Safer LLMsThis blog shares some free AI safety tools. It shares everything you need to know, from guardrails that steer chatbots away from disaster to datasets that help identify toxic content. It also provides insights into the AI safety landscape and how to navigate it, especially on a budget.3. Fine-Tuning LLMs for RAGThis video explains why and when you should fine-tune your LLM in a RAG system. This concept is useful for todays AI engineers playing with LLMs.4. The Real Reason Your Companys AI Isnt Working (Hint: Its Not the Technology)The underlying reason many companies struggle to make AI tools work is not the technology itself. The real challenge lies in organizational structures, cultural resistance, a lack of proper training, and insufficient time allocated for exploration. This article presents some thoughts on addressing these issues, such as investing in leadership support, encouraging cultural change, offering tailored training sessions, and fostering an environment of experimentation.5. Introducing ReACT LLM Agents: A Secret to More Capable AIA ReACT agent is a special type of AI agent that uses both Reasoning and Acting to solve the tasks or problems we assign. 
This article explores this concept, presents use case examples, and explains how it has the potential to make AI more capable.Repositories & ToolsAnthropic Cookbook provides code and guides designed to help developers build with Claude.Genesis is a physics platform for general-purpose robotics/embodied AI/physical AI applications.Picotron is a minimalist repository for pre-training Llama-like models with 4D Parallelism.Helicone is an open-source LLM observability platform.Top Papers of The Week1. Qwen2.5 Technical ReportThis report introduces Qwen2.5, a comprehensive series of LLMs designed to meet diverse needs. Compared to previous iterations, Qwen 2.5 has significantly improved during both the pre-training and post-training stages. The pre-training dataset has been scaled from the previous 7 trillion tokens to 18 trillion tokens, and the post-training implements intricate supervised finetuning with over 1 million samples and multistage reinforcement learning.2. Byte Latent Transformer: Patches Scale Better Than TokensThis paper introduces the Byte Latent Transformer (BLT), a new byte-level LLM architecture that matches tokenization-based LLM performance at scale with significant improvements in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve as the primary units of computation. Patches are segmented based on the entropy of the next byte, allocating more compute and model capacity where increased data complexity demands it.3. Deliberative Alignment: Reasoning Enables Safer Language ModelsThis paper introduces deliberative alignment, a training paradigm that directly teaches reasoning LLMs the text of human-written and interpretable safety specifications. It trains them to reason explicitly about these specifications before answering. 
Open AI used deliberative alignment to align OpenAIs o-series models, enabling them to use chain-of-thought (CoT) reasoning to reflect on user prompts, identify relevant text from OpenAIs internal policies, and draft safer responses.4. Fully Open Source Moxin-7B Technical ReportThis paper introduces Moxin 7B, a fully open-source LLM developed in accordance with the Model Openness Framework (MOF). The MOF is a ranked classification system that evaluates AI models based on model completeness and openness, adhering to the principles of open science, open source, open data, and open access. Experiments show that the model performs better in zero-shot evaluation than popular 7B models.5. RAGBench: Explainable Benchmark for Retrieval-Augmented Generation SystemsThis paper introduces RAGBench, a comprehensive, large-scale RAG benchmark dataset of 100k examples. It covers five unique industry-specific domains and various RAG task types. RAGBench examples are sourced from industry corpora, such as user manuals, making it particularly relevant for industry applications.6. CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language ModelsThis paper presents an improved version of CosyVoice (streaming speech synthesis model), CosyVoice 2, which incorporates comprehensive and systematic optimizations. It introduces finite-scalar quantization to improve the codebook utilization of speech tokens and streamlines the model architecture to allow direct use of a pre-trained LLM. Additionally, it also uses a chunk-aware causal flow matching model to support various synthesis scenarios.Quick Links1. OpenAI brings ChatGPT to your landline. Call 18002428478, and OpenAIs AI-powered assistant will respond as of Wednesday afternoon. The experience is more or less identical to Advanced Voice Mode. ChatGPT responds to the questions users ask over the phone and can handle tasks such as translating a sentence into a different language.2. 
Google is expanding Gemini's latest in-depth research mode to 40 more languages. The company launched the in-depth research mode earlier this month, allowing Google One AI premium plan users to unlock an AI-powered research assistant.
3. GitHub has launched GitHub Copilot Free, an accessible version of its popular AI-powered coding assistant with limits. The new free tier for VS Code aims to expand the AI-powered code completion assistant's reach to a broader audience of developers, namely those with only light usage needs and tighter budgets.

Who's Hiring in AI
Applied AI Finetuning Engineer @Anthropic (Multiple US locations)
Generative AI for Test Case Generation Master Thesis Opportunity @IBM (Frankfurt/Germany)
Generative AI Engineer @CAI (Remote)
AI Strategist @Navy Federal Credit Union (Multiple US locations)
New College Grad, Hardware Integration Engineer @Western Digital (San Jose, CA, USA)
Software Development Engineer @Siemens Digital Industries Software (New Cairo, Al Qahirah, Egypt)
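The newsletter above speculates that, when o3 draws 6 or 1,024 parallel samples, the final answer "could be simple majority voting". As a rough illustration of that one idea only, here is a minimal sketch: `sample_answer` is a hypothetical stand-in for a model call, its 60% per-sample accuracy is an invented number, and nothing here reflects how OpenAI actually selects answers. The last lines redo the cost arithmetic using the per-task figures quoted in the newsletter.

```python
import random
from collections import Counter


def majority_vote(answers):
    """Return the most common answer and the share of samples that agree with it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


def sample_answer(rng, correct="42", p_correct=0.6):
    """Hypothetical noisy sampler standing in for one model completion."""
    if rng.random() < p_correct:
        return correct
    return rng.choice(["41", "43", "44"])


rng = random.Random(0)  # fixed seed so the run is repeatable
for n_samples in (6, 1024):  # sample counts quoted for the ARC-AGI runs
    answers = [sample_answer(rng) for _ in range(n_samples)]
    winner, share = majority_vote(answers)
    print(f"{n_samples} samples: winner={winner}, agreement={share:.2f}")

# Cost figures quoted in the newsletter (USD per task).
low_eff_cost, high_eff_cost = 3400, 20
print("cost ratio:", low_eff_cost / high_eff_cost)
print("cost per sample:", low_eff_cost / 1024, "vs", high_eff_cost / 6)
```

Dividing each quoted cost by its sample count gives roughly the same per-sample price in both modes, which suggests the 170x gap comes almost entirely from the number of parallel samples.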