• Turmoil at OpenAI: what's next for the creator of ChatGPT?
    www.theverge.com
The world's hottest AI company went through three CEOs in under a week and ended up with the same one it had at the start. So what happened, and what's next?

On November 17th, 2023, OpenAI's nonprofit board abruptly announced that co-founder and CEO Sam Altman was out. The shake-up came just shy of one year after the launch of ChatGPT, which quickly became one of the fastest-growing apps in history and initiated an industry-wide race to build generative AI.

Over a period of just a few days, the CEO job shuffled between CTO Mira Murati and former Twitch boss Emmett Shear. Meanwhile, hundreds of OpenAI employees said they would leave for jobs at Microsoft, OpenAI's lead investor, unless the board reinstated Altman. In the end, Altman returned, along with co-founder Greg Brockman and a revamped board of directors.

On March 8th, after an independent investigation into his sudden firing, OpenAI reinstated Altman as a member of the board, along with three other additions.

That same month, OpenAI co-founder Elon Musk filed a lawsuit against OpenAI, claiming that the company's pursuit of profit has led it to abandon its founding nonprofit mission: to develop artificial general intelligence (AGI) technology that will benefit humanity.

All of the news and updates about OpenAI continue below.

Highlights

Jan 21 (Kylie Robison): Microsoft is letting OpenAI get its own AI compute now
Image: Cath Virginia / The Verge
Microsoft and OpenAI announced Tuesday that they have adjusted their partnership so that OpenAI can access competitors' compute. The new agreement "includes changes to the exclusivity on new capacity, moving to a model where Microsoft has a right of first refusal (ROFR)," Microsoft says.
To further support OpenAI, Microsoft has approved OpenAI's ability to build additional capacity, primarily for research and training of models.
Read Article >

Jan 21 (Richard Lawler): OpenAI and SoftBank are starting a $500 billion AI data center company
Image: The White House (YouTube)
A plan to build a system of data centers for artificial intelligence has been revealed in a White House press conference, with Masayoshi Son, Sam Altman, and Larry Ellison joining Donald Trump to announce The Stargate Project. Their companies (SoftBank, OpenAI, and Oracle, respectively), along with MGX, are listed as initial equity funders for $500 billion in investments over the next four years, building new AI infrastructure for OpenAI in the United States. According to a statement from OpenAI, Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the initial tech partners, with a buildout currently underway starting in Texas as other sites across the country are evaluated. It also says that Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system.
Read Article >

Dec 27, 2024 (Emma Roth): OpenAI announces plan to transform into a for-profit company
Image: The Verge
OpenAI has laid out plans to become a for-profit company. In a blog post published on Friday, OpenAI's board said it will replace the company's existing structure with one that puts control into the hands of its for-profit arm. Going into 2025, OpenAI plans to become a Public Benefit Corporation (PBC), a for-profit company meant to operate for the good of society. This division will run and control OpenAI's operations and business, while OpenAI's nonprofit will retain a stake in the business but lose its oversight role.
Read Article >

Dec 21, 2024 (Wes Davis): "So far, the vibes are off."
We reported in October that OpenAI could launch its GPT-4 successor, codenamed Orion, this month.
Now, The Wall Street Journal reports the behind-schedule model's training is a struggle that's racking up enormous costs. When it will be ready is apparently a feel thing. WSJ writes: "It's up to company executives to decide whether the model is smart enough to be called GPT-5 based in large part on gut feelings or, as many technologists say, vibes." So far, the vibes are off.
The Next Great Leap in AI Is Behind Schedule and Crazy Expensive [WSJ]

Dec 14, 2024 (Alex Heath): Meta asks the government to block OpenAI's switch to a for-profit
Mark Zuckerberg. Image: Cath Virginia / The Verge; Getty Images
Meta is asking California Attorney General Rob Bonta to block OpenAI's planned transition from a nonprofit to a for-profit entity. In a letter sent to Bonta's office this week, Meta says that OpenAI "should not be allowed to flout the law by taking and reappropriating assets it built as a charity" and using them for "potentially enormous private gains."
Read Article >

Dec 12, 2024 (Kylie Robison): Inside the launch and future of ChatGPT
Image: Cath Virginia / The Verge, Getty Images
As winter descended on San Francisco in late 2022, OpenAI quietly pushed a new service dubbed ChatGPT live with a blog post and a single tweet from CEO Sam Altman. The team labeled it a "low-key research preview"; they had good reason to set expectations low. "It couldn't even do arithmetic," says Liam Fedus, OpenAI's head of post-training.
"It was also prone to hallucinating, or making things up," adds Christina Kim, a researcher on the mid-training team.
Read Article >

Dec 4, 2024 (Alex Heath): Sam Altman lowers the bar for AGI
Image: Cath Virginia / The Verge; Getty Images
Nearly two years ago, OpenAI said that artificial general intelligence, or AGI (the thing the company was created to build), could "elevate humanity" and give everyone "incredible new capabilities." Now, OpenAI CEO Sam Altman is trying to lower expectations.
Read Article >

Nov 18, 2024 (Kylie Robison): Inside Elon Musk's messy breakup with OpenAI
Image: Cath Virginia / The Verge, Getty Images
As OpenAI was ironing out a new deal with Microsoft in 2016, one that would nab the young startup critical compute to build what would become ChatGPT, Sam Altman needed the blessing of his biggest investor, Elon Musk. "$60MM of compute for $10MM, and input from us on what they deploy in the cloud," Altman messaged Musk in September 2016, according to newly revealed emails. Microsoft wanted OpenAI to provide feedback on and promote (in tech circles, "evangelize") Microsoft AI tools like Azure Batch. Musk hated the idea, saying it made him feel "nauseous."
Read Article >

Oct 25, 2024 (Tom Warren): Microsoft prepares for OpenAI's next model as their relationship strains
Image: Microsoft
Microsoft is getting ready to host OpenAI's next model, just as reports emerge describing unprecedented tension in their complex relationship. We just exclusively revealed that Orion, OpenAI's next model, is set to be released by the end of the year.
A source familiar with Microsoft's AI plans tells me that engineers inside the company have been preparing to host OpenAI's Orion model in recent weeks.
Read Article >

Oct 25, 2024 (Kylie Robison and Tom Warren): OpenAI plans to release its next big AI model by December
Image: Cath Virginia / The Verge; Getty Images
OpenAI plans to launch Orion, its next frontier model, by December, The Verge has learned. Unlike the release of OpenAI's last two models, GPT-4o and o1, Orion won't initially be released widely through ChatGPT. Instead, OpenAI is planning to grant access first to companies it works closely with in order for them to build their own products and features, according to a source familiar with the plan.
Read Article >

Oct 24, 2024 (Kylie Robison): Departing OpenAI leader says no company is ready for AGI
Image: The Verge
Miles Brundage, OpenAI's senior adviser for the readiness of AGI (aka human-level artificial intelligence), delivered a stark warning as he announced his departure on Wednesday: no one is prepared for artificial general intelligence, including OpenAI itself. "Neither OpenAI nor any other frontier lab is ready [for AGI], and the world is also not ready," wrote Brundage, who spent six years helping to shape the company's AI safety initiatives.
"To be clear, I don't think this is a controversial statement among OpenAI's leadership, and notably, that's a different question from whether the company and the world are on track to be ready at the relevant time."
Read Article >

Oct 20, 2024 (Wes Davis): Former OpenAI CTO Mira Murati's next move: another AI startup.
Murati is seeking venture capital funds for a new AI startup with its own proprietary models, Reuters reported Friday. Barret Zoph, an OpenAI researcher who left the same day as Murati, may join the venture, according to unnamed sources cited by the outlet.
Former OpenAI technology chief Mira Murati to raise capital for new AI startup, sources say [Reuters]

Sep 27, 2024 (Kylie Robison): OpenAI was a research lab; now it's just another tech company
Photo by Jack Guez / AFP via Getty Images
Here's the thing about asking investors for money: they want to see returns. OpenAI launched with a famously altruistic mission: to help humanity by developing artificial general intelligence. But along the way, it became one of the best-funded companies in Silicon Valley. Now, the tension between those two facts is coming to a head.
Read Article >

Sep 25, 2024 (Jay Peters and Kylie Robison): OpenAI CTO Mira Murati is leaving
Screenshot: YouTube
OpenAI CTO Mira Murati is leaving the company. "I'm stepping away because I want to create the time and space to do my own exploration," she wrote in a post on X. "For now, my primary focus is doing everything in my power to ensure a smooth transition, maintaining the momentum we've built."
Read Article >

Sep 21, 2024 (Alex Cranz): Jony Ive confirms he's working on a new device with OpenAI
Photo by Jerod Harris / Getty Images for Vox Media
Jony Ive has confirmed that he's working with OpenAI CEO Sam Altman on an AI hardware project. The confirmation came today as part of a profile of Ive in The New York Times, nearly a year after the possibility of a collaboration between Altman and the longtime Apple designer was first reported on. There aren't a lot of details on the project.
Ive reportedly met Altman through Brian Chesky, the CEO of Airbnb, and the venture is being funded by Ive and the Emerson Collective, Laurene Powell Jobs' company. The Times reports it could raise $1 billion in funding by the end of the year but makes no mention of Masayoshi Son, the SoftBank CEO rumored last year to have invested $1 billion in the project.
Read Article >

Sep 14, 2024 (Wes Davis): Step four: Profit.
CEO Sam Altman told employees in a company-wide meeting that OpenAI's complicated corporate structure (a for-profit endeavor under the umbrella of a nonprofit) is set to change, likely sometime next year, reports Fortune. The reconfiguring, which has been rumored before, would reportedly shift the company away from being controlled by a nonprofit. OpenAI told the outlet that the nonprofit "is core to our mission and will continue to exist."
Sam Altman told OpenAI staff the company's non-profit corporate structure will change next year [Fortune]

Aug 30, 2024 (Elizabeth Lopatto): OpenAI searches for an answer to its copyright problems
You mean it's all copyright? Image: Cath Virginia / The Verge, Getty Images
The huge leaps in OpenAI's GPT model probably came from sucking down the entire written web. That includes entire archives of major publishers such as Axel Springer, Condé Nast, and The Associated Press, without their permission. But for some reason, OpenAI has announced deals with many of these conglomerates anyway. At first glance, this doesn't entirely make sense. Why would OpenAI pay for something it already had? And why would publishers, some of whom are lawsuit-style angry about their work being stolen, agree?
Read Article >

Aug 29, 2024 (Emma Roth): ChatGPT's weekly users have doubled in less than a year
Illustration: The Verge
OpenAI says that more than 200 million people use ChatGPT each week, as first reported by Axios.
OpenAI spokesperson Taya Christianson confirmed the number to The Verge; it is now double the 100 million weekly active users OpenAI reported last November. Additionally, Christianson says that 92 percent of Fortune 500 companies are using OpenAI's products, while API usage has doubled following the release of the company's cheaper and smarter model, GPT-4o mini.
Read Article >

Aug 6, 2024 (Alex Heath): Another OpenAI co-founder departs.
John Schulman is leaving to work on alignment at Anthropic, OpenAI's chief rival. In a reply post on X, CEO Sam Altman thanked Schulman and said he laid out "a significant fraction of what became OpenAI's initial strategy." In his new job, Schulman will work closely with Jan Leike, another senior leader who recently left OpenAI for Anthropic due to concerns that safety had taken a backseat to business priorities.

Aug 5, 2024 (Jess Weatherbed): Elon Musk is suing OpenAI and Sam Altman again
Image: The Verge
Elon Musk has revived his complaint against OpenAI after dropping a previous lawsuit, again alleging that the ChatGPT maker and two of its founders, Sam Altman and Greg Brockman, breached the company's founding mission to develop artificial intelligence technology "to benefit humanity." The new lawsuit, filed in federal court in Northern California on Monday, says that Altman and Brockman "assiduously manipulated" Musk into co-founding their "spurious non-profit venture" by promising that OpenAI would be safer and more transparent than profit-driven alternatives. The suit claims that assurances about OpenAI's nonprofit structure were "the hook for Altman's long con."
Read Article >

Aug 3, 2024 (Alex Heath): Microsoft now lists OpenAI as a competitor.
CNBC spotted the update this week in Microsoft's risk factors with the SEC. These are managed by lawyers to help shield companies from shareholder lawsuits and are generally pretty conservative.
Still, the change feels like a sign of how OpenAI and its largest investor are drifting apart. Relatedly, I couldn't help but notice the number of times Microsoft execs mentioned OpenAI during their earnings call this week: zero.
Microsoft says OpenAI is now a competitor in AI and search [CNBC]

Jul 10, 2024 (Tom Warren): Microsoft and Apple ditch OpenAI board seats amid regulatory scrutiny
Image: The Verge
Microsoft has dropped its seat as an observer on the board of OpenAI, less than eight months after securing the non-voting seat. Apple was reportedly planning to join OpenAI's nonprofit board, but now the Financial Times reports that Apple will no longer join the board. OpenAI confirmed that Microsoft has given up its seat in a statement to The Verge, following reports from Axios and the Financial Times that Microsoft's deputy general counsel Keith Dolliver wrote a letter to OpenAI late on Tuesday.
Read Article >

Jul 2, 2024 (Emma Roth): Apple's Phil Schiller is reportedly joining OpenAI's board
Illustration by Kristen Radtke / The Verge
Apple has chosen App Store chief and former marketing head Phil Schiller to represent the company on OpenAI's nonprofit board, according to a report from Bloomberg. Schiller will reportedly get an observer role, meaning he can attend board meetings but can't vote or act as a director. Joining the board will allow Schiller to learn more about the inner workings of OpenAI as Apple works to build ChatGPT into iOS and macOS later this year. The integration will allow the AI-supercharged Siri to punt more advanced queries to ChatGPT if users grant permission. As previously reported by Bloomberg, no money is currently involved in the partnership, though Apple is expected to get a percentage of ChatGPT subscriptions made through its platforms down the road.
Read Article >

Jun 13, 2024 (Jay Peters): Former head of NSA joins OpenAI board
Photo: NSA
OpenAI has appointed Paul M. Nakasone, a retired US Army general and former head of the National Security Agency (NSA), to its board of directors, the company announced on Thursday. Nakasone, who was nominated to lead the NSA by former President Donald Trump, directed the agency from 2018 until February of this year. Before Nakasone left the NSA, he wrote an op-ed supporting the renewal of Section 702 of the Foreign Intelligence Surveillance Act, the surveillance program that was ultimately reauthorized by Congress in April.
Read Article >

Jun 12, 2024 (Alex Heath): OpenAI's business is booming.
The company is on track to make about $3.4 billion in revenue this year, which is about double what it brought in last year, according to a new report by The Information. CEO Sam Altman reportedly told employees that $200 million of that revenue is the cut OpenAI gets from Microsoft selling its models through Azure. That means the vast majority of OpenAI's revenue is coming from ChatGPT subscriptions and its own developer platform.
OpenAI's Annualized Revenue Doubles to $3.4 Billion Since Late 2023 [The Information]
  • Google expects to spend $75 billion this year on the AI race
    www.theverge.com
Google parent company Alphabet expects to invest approximately $75 billion in capital expenditures in 2025, according to a statement from CEO Sundar Pichai in Alphabet's Q4 2024 earnings release.

Capital expenditures have become a hot topic as of late as big tech companies race to build infrastructure to support their growing AI ambitions, and today's announcement from Alphabet is clearly meant to keep the company in that conversation. Alphabet spent $32.3 billion on capital expenditures in 2023, so $75 billion in 2025 would be a big jump. And while Google's press release today doesn't specifically say that the upcoming capital expenditures are all for AI, given the amount of money flowing into AI infrastructure across the industry, it seems likely that a good amount of the expense will go toward benefitting Google's AI work.

AI continues to benefit Google's business as well. Overall revenues are up 12 percent year-over-year to $96.5 billion. Google Cloud revenues are up 10 percent to $12.0 billion, which Google says is "led by growth in Google Cloud Platform (GCP) across core GCP products, AI Infrastructure, and Generative AI Solutions."

During the quarter, the company made some big news about its AI products, including revealing Gemini 2.0, an AI agent called Project Mariner that can complete tasks in a Chrome browser, and its Deep Research tool that can research things on the web for you. It also demoed a new Android XR mixed reality OS.

Alphabet-owned Waymo had a pretty good 2024 overall, though today's earnings report shows that Other Bets, which includes Waymo, had lower revenue and higher losses year-over-year.

In Q4, the Department of Justice also proposed that Google potentially divest itself of Chrome as a remedy for Judge Amit Mehta's August ruling that the company is a monopolist in the search and advertising markets.
The final outcome of those remedies could have a big impact on Google / Alphabet's future.

Alphabet's investor call is happening now, and we'll update this story with anything notable from the call.
  • Fine-Tuning Llama 3.2 3B Instruct for Python Code: A Comprehensive Guide with Unsloth
    www.marktechpost.com
In this tutorial, we'll walk through how to set up and perform fine-tuning on the Llama 3.2 3B Instruct model using a specially curated Python code dataset. By the end of this guide, you'll have a better understanding of how to customize large language models for code-related tasks and practical insight into the tools and configurations needed to leverage Unsloth for fine-tuning.

Installing Required Dependencies

```python
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install "git+https://github.com/huggingface/transformers.git"
!pip install -U trl
!pip install --no-deps trl peft accelerate bitsandbytes
!pip install torch torchvision torchaudio triton
!pip install xformers
!python -m xformers.info
!python -m bitsandbytes
```

These commands install and update all the necessary libraries (such as Unsloth, Transformers, and xFormers) needed for fine-tuning the Llama 3.2 3B Instruct model on Python code. Finally, we run diagnostic commands to verify the successful installation of xFormers and BitsAndBytes.

Essential Imports

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
import torch
from datasets import load_dataset
```

We import classes and functions from Unsloth, TRL, and Transformers for model training and fine-tuning. Also, we load a Python code dataset with Hugging Face's `load_dataset` to prepare training samples.

Loading the Python Code Dataset

```python
max_seq_length = 2048
# Save the dataset on your user profile on HF, then load it from your user ID
dataset = load_dataset("user/Llama-3.2-Python-Alpaca-143k", split="train")
```

We set the sequence length to 2048 tokens for the fine-tuned model and load a custom Python code dataset from Hugging Face.
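The trainer configured later in this tutorial consumes a single "text" field per example. As a rough sketch (the column names `instruction`, `input`, and `output` are assumptions inferred from the dataset's Alpaca-style name, not confirmed from its dataset card), an Alpaca-style record could be flattened into that field like this:

```python
# Hypothetical flattening of an Alpaca-style record into one "text" string.
# Field names are assumptions; adjust to the actual dataset schema.
def format_alpaca(example):
    prompt = f"### Instruction:\n{example['instruction']}\n\n"
    if example.get("input"):  # optional context field, skipped when empty
        prompt += f"### Input:\n{example['input']}\n\n"
    prompt += f"### Response:\n{example['output']}"
    return {"text": prompt}

record = {
    "instruction": "Write a function that squares a number.",
    "input": "",
    "output": "def square(x):\n    return x * x",
}
print(format_alpaca(record)["text"])
```

If the dataset does not already ship a prepared "text" column, a function like this could be applied with `dataset.map(format_alpaca)` before training.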
Ensure you have the dataset stored under your username for proper access.

Initializing the Llama 3.2 3B Model

```python
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)
```

We load the Llama 3.2 3B Instruct model in 4-bit format using the Unsloth library, which reduces memory usage. To handle longer text inputs, we also set the maximum sequence length to 2048.

Configuring LoRA with Unsloth

```python
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,  # Supports any, but = 0 is optimized
    bias = "none",     # Supports any, but = "none" is optimized
    # "unsloth" uses 30% less VRAM and fits 2x larger batch sizes
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,   # Rank-stabilized LoRA is supported
    loftq_config = None,  # As is LoftQ
    max_seq_length = max_seq_length,
)
```

We apply LoRA (Low-Rank Adaptation) to our 4-bit loaded model, specifying the rank (`r`), alpha (`lora_alpha`), and dropout settings. Setting `use_gradient_checkpointing = "unsloth"` enables more efficient memory usage and allows training with longer context lengths. Additional LoRA options like `use_rslora` and `loftq_config` are available for more advanced fine-tuning techniques but are disabled here for simplicity.
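For intuition about what LoRA does, the low-rank update can be sketched in plain NumPy. This is a toy illustration of the general technique, not Unsloth's implementation, and the dimensions are made up; it shows why only a small fraction of parameters become trainable:

```python
import numpy as np

# LoRA freezes the base weight W and learns a low-rank update B @ A,
# scaled by alpha / r, so only r * (d_in + d_out) parameters train
# instead of d_in * d_out.
d_out, d_in, r, alpha = 64, 64, 16, 16
rng = np.random.default_rng(3407)
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                 # B starts at zero: no change at step 0

W_effective = W + (alpha / r) * B @ A    # the weight the model actually uses

full_params = d_out * d_in               # 4096 for this toy layer
lora_params = r * (d_in + d_out)         # 2048 trainable parameters
print(f"trainable params: {lora_params} vs {full_params} for full fine-tuning")
```

Because `B` is initialized to zero, the adapted layer is exactly the base layer before training begins, which is why LoRA fine-tuning starts from the pretrained model's behavior.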
Finally, we set the maximum sequence length to match our earlier configuration.

Mounting Google Drive

```python
from google.colab import drive
drive.mount("/content/drive")
```

We import the Google Colab drive module to enable access to Google Drive from within the Colab environment.

Setting Up and Running the Training Loop

```python
trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    tokenizer = tokenizer,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        # num_train_epochs = 1,  # Set this for one full training run.
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "/content/drive/My Drive/Llama-3.2-3B-Instruct-bnb-4bit",
    ),
)
trainer.train()
```

We create an instance of SFTTrainer with our loaded model, tokenizer, and Python code dataset, specifying the text field for training. The TrainingArguments define key hyperparameters such as batch size, learning rate, maximum training steps, and hardware-specific settings like fp16 or bf16. In this example, we set the output directory to Google Drive to conveniently store checkpoints and logs. Finally, we invoke the `trainer.train()` method to begin the fine-tuning process.

Saving the Fine-Tuned Model

```python
model.save_pretrained("lora_model")      # Local saving
tokenizer.save_pretrained("lora_model")
```

We save the LoRA-trained model and its tokenizer to a local folder named `lora_model`. This allows you to load and use the fine-tuned model later without repeating the training process.

In conclusion, throughout this tutorial, we demonstrated how to fine-tune the Llama 3.2 3B Instruct model on a Python code dataset using the Unsloth library, LoRA, and efficient 4-bit quantization.
By leveraging the provided scripts, you can train a smaller, memory-efficient model that excels at both generating and understanding Python code. In the process, we showcased the integration of Unsloth for optimized memory usage, LoRA for flexible model adaptation, and Hugging Face tools for dataset handling and training. This setup enables you to build and customize language models tailored to specific code-related tasks, improving accuracy and resource efficiency.

Download the Colab Notebook here. All credit for this research goes to the researchers of this project.
  • NYU Researchers Introduce WILDCHAT-50M: A Large-Scale Synthetic Dataset for Efficient LLM Post-Training
    www.marktechpost.com
    Large language model (LLM) post-training focuses on refining model behavior and enhancing capabilities beyond their initial training phase. It includes supervised fine-tuning (SFT) and reinforcement learning to align models with human preferences and specific task requirements. Synthetic data is crucial, allowing researchers to evaluate and optimize post-training techniques. However, open research in this domain is still in its early stages, facing data availability and scalability limitations. Without high-quality datasets, analyzing the performance of different fine-tuning strategies and assessing their effectiveness in real-world applications becomes difficult.One of the primary challenges in this field is the scarcity of large-scale, publicly available synthetic datasets suitable for LLM post-training. Researchers must access diverse conversational datasets to conduct meaningful comparative analyses and improve alignment strategies. The lack of standardized datasets limits the ability to evaluate post-training performance across different models. Moreover, large-scale data generation costs and computational requirements are prohibitive for many academic institutions. These factors create barriers to improving model efficiency and ensuring fine-tuned LLMs generalize well across tasks and user interactions.Existing approaches to synthetic data collection for LLM training rely on a combination of model-generated responses and benchmark datasets. Datasets, such as WildChat-1M from Allen AI and LMSys-Chat-1M, provide valuable insights into synthetic data usage. However, they are often restricted in scale and model diversity. Researchers have developed various techniques to assess synthetic data quality, including LLM judge-based evaluations and efficiency metrics for runtime and VRAM usage. 
Despite these efforts, the field still lacks a comprehensive and publicly accessible dataset that allows for large-scale experimentation and optimization of post-training methodologies.Researchers from New York University (NYU) introduced WILDCHAT-50M, an extensive dataset designed to facilitate LLM post-training. The dataset builds upon the WildChat collection and expands it to include responses from over 50 open-weight models. These models range from 0.5 billion to 104 billion parameters, making WILDCHAT-50M the largest and most diverse public dataset of chat transcripts. The dataset enables a broad comparative analysis of synthetic data generation models and is a foundation for further improving post-training techniques. By making WILDCHAT-50M publicly accessible, the research team aims to bridge the gap between industry-scale post-training and academic research.The dataset was developed by synthesizing chat transcripts from multiple models, each participating in over one million multi-turn conversations. The dataset comprises approximately 125 million chat transcripts, offering an unprecedented scale of synthetic interactions. The data collection process took place over two months using a shared research cluster of 128 H100 GPUs. This setup allowed researchers to optimize runtime efficiency and ensure a diverse range of responses. The dataset also served as the basis for RE-WILD, a novel supervised fine-tuning (SFT) mix that enhances LLM training efficiency. Through this approach, researchers successfully demonstrated that WILDCHAT-50M could optimize data usage while maintaining high levels of post-training performance.The effectiveness of WILDCHAT-50M was validated through a series of rigorous benchmarks. The RE-WILD SFT approach, based on WILDCHAT-50M, outperformed the Tulu-3 SFT mixture developed by Allen AI while using only 40% of the dataset size. 
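To make the shape of the data concrete, a multi-turn transcript in a collection like this can be pictured as a list of alternating turns tagged with the model that generated the responses. The field names below are hypothetical illustrations, not the actual WILDCHAT-50M schema:

```python
# Hypothetical record layout for one multi-turn chat transcript.
# Field names ("model", "conversation", "role", "content") are assumptions.
transcript = {
    "model": "example-model-7b",
    "conversation": [
        {"role": "user", "content": "Explain gradient checkpointing."},
        {"role": "assistant", "content": "It trades extra compute for lower memory."},
        {"role": "user", "content": "When is it worth it?"},
        {"role": "assistant", "content": "For long contexts or very large models."},
    ],
}

def count_turns(record):
    """Count assistant responses, i.e. completed exchanges, in a transcript."""
    return sum(1 for m in record["conversation"] if m["role"] == "assistant")

print(count_turns(transcript))  # 2
```

A dataset of roughly 125 million such transcripts, spanning responses from more than 50 models, is what enables the comparative analyses described below.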
The evaluation included multiple performance metrics, with specific improvements in response coherence, model alignment, and benchmark accuracy. The dataset's ability to enhance runtime efficiency was also highlighted, with throughput efficiency analyses indicating substantial improvements in token processing speed. Further, models fine-tuned using WILDCHAT-50M demonstrated significant enhancements in instruction-following capabilities and overall chat performance across various evaluation benchmarks.

This research underscores the importance of high-quality synthetic data in LLM post-training and presents WILDCHAT-50M as a valuable resource for optimizing model alignment. By providing a large-scale, publicly available dataset, the researchers have enabled further advancements in supervised fine-tuning methodologies. The comparative analyses conducted in this study offer key insights into the effectiveness of different data generation models and post-training strategies. Moving forward, the introduction of WILDCHAT-50M is expected to support a broader range of academic and industrial research efforts, ultimately contributing to the development of more efficient and adaptable language models.

Check out the Paper, Dataset on Hugging Face, and GitHub Page. All credit for this research goes to the researchers of this project.
  • DeepSeek-TS+: A Unified Framework for Multi-Product Time Series Forecasting
    towardsai.net
    DeepSeek-TS+: A Unified Framework for Multi-Product Time Series Forecasting. February 4, 2025. Author(s): Shenggang Li. Originally published on Towards AI. Leveraging state-space-enhanced Multi-Head Latent Attention and Group Relative Policy Optimization (GRPO) for adaptive forecasting.

I was impressed by DeepSeek's technology: its efficient Multi-Head Latent Attention (MLA) and Group Relative Policy Optimization (GRPO) techniques inspired me to apply them to multi-product time series forecasting.

In our approach, we extend MLA into what we call MLA-Mamba, allowing the latent features to evolve dynamically over time using a state-space model with non-linear activations. This gives our model an adaptive memory that adjusts to trends, much like a sales team adapting its strategy during market surges. At the same time, GRPO introduces a smart decision-making process that continuously refines forecasts by comparing predictions against a baseline, similar to a manager tweaking forecasts on the fly. This dynamic adjustment helps our model respond effectively to sudden changes in sales patterns.

We compare our approach with classical ARMA models and standard GRU-based networks. While ARMA handles linear trends and GRUs capture temporal dependencies, our DeepSeek-TS framework is designed to model complex inter-product relationships and adapt to non-linear dynamics, resulting in more accurate and robust forecasts. In the following sections, we break down the technical details of our extended MLA (MLA-Mamba) and GRPO frameworks, and demonstrate how their… Read the full blog for free on Medium.
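The group-relative scoring idea behind GRPO, applied to forecasting as the post suggests, can be sketched minimally: several candidate forecasts for the same product are rewarded by closeness to the realized value, and each candidate's advantage is its reward's deviation from the group mean, scaled by the group standard deviation. The negative-absolute-error reward is an assumption for illustration; the post does not spell out DeepSeek-TS+'s exact reward.

```python
# Hedged sketch of a GRPO-style group-relative advantage for forecasts.
# Reward = -|forecast - actual| is an illustrative assumption, not the
# article's actual formulation.
import statistics

def group_relative_advantages(candidates, actual):
    rewards = [-abs(f - actual) for f in candidates]   # higher = closer
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0          # guard: all-equal group
    return [(r - mu) / sigma for r in rewards]

# Three candidate forecasts for the same product's next-step sales:
adv = group_relative_advantages([95.0, 100.0, 120.0], actual=102.0)
best = adv.index(max(adv))
print(best)  # index 1: the forecast closest to the actual value wins
```

Because advantages are computed relative to the group rather than an absolute target, the update signal stays well scaled even when the baseline forecast quality drifts, which is the property that makes the approach attractive for non-stationary sales data.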
  • Anime Vanguards Update 3.5 Adds Starting Questline and Boss Rush Multiplayer Support
    www.ign.com
    Anime Vanguards update 3.5 is here with four new units, multiplayer functionality for boss rush, a starting questline, and a collection of bug and quality-of-life adjustments for Roblox players. Developer Kitawari unveiled everything in its Fate Boss Event patch notes, pulling back the curtain with even more changes following the arrival of update 3.0 late last month. Topping the list of new content are four units (Sokora, Saber, Lilia, and Isdead), which can be found across different in-game challenges. There's also a Saber boss event, which comes with its own rewards, including two Secret units, trait rerolls, gold and gems, and more that you'll have to log in to see for yourself.

The Boss Rush overhaul includes multiplayer functionality for Roblox players looking to take on challenging tasks with friends, as well as normal and elite difficulty modes, tournament support, and boss event cycle changes. The Anime Vanguards Sand Village map has also gotten a full-on facelift that gives it a new design and enhanced visuals, and Worldlines have been reset, with additional resets coming every .5 update. Another new feature is one Kitawari says it added to address fan feedback: a starting player questline. Anime Vanguards update 3.5 brings more fundamental changes alongside bug fixes.

"We've acknowledged how overwhelming the early game may be for our new players," Kitawari said. "To help with this, all new players will be greeted with a uniquely created questline which will guide them through the various elements of the game, and reward them with a Mythic Vogita Super upon quest end to start their journey in Anime Vanguards! We intend to create a great experience for all, and will continue looking into the various ways we can improve the gameplay."

Anime Vanguards update 3.5 brings plenty to keep players fighting off waves of enemies, but it's only Kitawari's latest content delivery. Last month, the team rolled out update 3.0, which focused on more winter-themed goodies to enjoy but also introduced a brand-new lobby and the Portals game mode. These changes and more are here to bolster the Roblox experience around the one-year anniversary of its original launch in January 2024. There's currently no word regarding when Anime Vanguards update 4.0 will arrive, but at this rate, players may only have to wait a few weeks to learn more. For everything else on Kitawari's anime tower defense game, you can see all active codes here. Finally, you can see the full update 3.5 patch notes below.

Anime Vanguards Update 3.5 Patch Notes

Fate Boss Event

Features
- NEW! 4 NEW UNITS! This update is coming out with 4 new units! These units will be found in:
  - New Boss Rush: Sokora, Sokora (Angra Mainyu)
  - New Secret Evolution: Saber (Alternate), Saber (Black Tyrant)
  - New Mythic Banner Unit: Lilia, Lilia and Berserker
  - New Worldlines Floor 50 Reward: Isdead, Isdead (Romantic)
- NEW! Saber (Alternate) Boss Event: Dark Sokora and the formidable Tyrant Saber (Alternate) have appeared! Dark Sokora plans to fuse with the malevolent Angra Mainyu; do you have what it takes to stop them? Boss Rush rewards include: 2 new Secret units, trait rerolls, gold & gems, and more!
  How to obtain Saber (Alternate): You can obtain Saber (Alternate) by placing a Saber (King Of Knights) unit next to the A.W.E Boss that remains stationary at enemy spawn. The A.W.E Boss will spawn at Wave 15 and Wave 25, staying for 5 waves each time.
  Upon unit placement or boss spawn, all Saber (King Of Knights) units within range will become corrupted and transform into Saber (A.W.E.), with the interaction having a small chance of dropping the Saber (Alternate) unit, up to a maximum of 3 tries per match (up to 6 while using the Saber Servant mechanic).
- Servants Mechanic: Bring forth a fabled figure of choice to assist you in battle. Each Servant offers unique mechanics, so be sure to read their ability descriptions with care; some may be better suited for your team than others.
  - Unlocking Servants: The powerful Berserker Servant is available for free, while the others must be purchased using the corresponding unit's evolution item. Once acquired, they are permanently available for use in battle. Choose carefully, or acquire them all and unleash their power!
  - Command Seal Activation: You can activate or empower your chosen Servant's mechanic up to 3 times by pressing the red Command Seal button near the inventory. Use them wisely to turn the tide of battle!
- NEW! Boss Rush Overhaul: We've made several changes to make playing boss rushes much more fun and meaningful! These include:
  - Multiplayer Functionality: You're now able to play Boss Rush with your friends, or alternatively use Public Matchmaking to join other players from across the world!
  - Normal & Elite Difficulty: We have split Boss Rushes into two separate difficulties: Normal, a co-op game mode that does not have access to tournaments, and an Elite solo game mode for those of you looking for a challenge and access to the Elite Tournament.
  - Elite Tournament: You're now able to participate in the Boss Rush Tournaments, which will reset alongside the boss rush cycle. Aim to clear the event in the least time possible, and compete against other players in your local bracket! Be careful when the bosses spawn: the waves automatically and immediately skip whenever the boss is alive!
  - Boss Event Cycle Changes: Along with boss events cycling, your boss event currency of that cycle (boss event stock) will now also reset! We will most likely add evolution items and other important items to the legacy shop, making your legacy tokens more valuable and immune to cycle resets.
- NEW! Public Matchmaking: You're now able to find matches with players to play your favorite game modes regardless of what server they are on, through our brand-new Public Lobbies system! To join others, simply press the Public Lobbies button within the fractures UI and select any of the player-hosted matches there. To host your own match, press the Create Match button instead, and then press the Find Match button within the stage you have selected.
- NEW! Worldlines Reset: All players have been reset to Floor 1, and Soburo is no longer obtainable. However, all the corresponding floor rewards have now returned, with Isdead replacing Soburo as the new Floor 50 reward. As previously announced in the Discord server, Worldlines will reset every X.5 update, with the next reset planned for Update 4.5.
- NEW! Sand Village Remake: The Sand Village remake has finally arrived, featuring entirely brand-new designs and enhanced visuals! As one of the last remaining original maps from release, this overhaul finally brings it up to the standard of our latest maps. We hope you enjoy exploring the new and improved Sand Village map as much as we do!
- NEW! Starting Player Questline: We've acknowledged how overwhelming the early game may be for our new players.
  To help with this, all new players will be greeted with a uniquely created questline which will guide them through the various elements of the game, and reward them with a Mythic Vogita Super upon quest end to start their journey in Anime Vanguards! We intend to create a great experience for all, and will continue looking into the various ways we can improve the gameplay for new as well as returning players alike in future updates.

Changes & QoL
- Added a visual timer when a unit gets stunned
- Added visual indicators for enemy spawn and friendly base/end of path during the voting phase
- The Worldlines leaderboard incorrectly had the same rewards as tournaments; this has now been fixed to match the other global leaderboards
- An in-game Elemental Reactions table has been added, making figuring out what each reaction does easier than ever! Furthermore, it comes with a Show Only Elements In-Match setting which, when on, will only show the reactions that can happen between your units and the enemies
- You can now delete multiple portals at once
- We've heard your requests! Sukuno now applies bleed up until upgrade 9, which is then replaced by burn (this is unrelated to the plans stated in Sub Announcements on our server)
- Added amount sliders for purchasing items or summoning units
- Increased game start time from 30 to 60 seconds. This helps players who take a longer time to load in, need more time to strategize and change their lineup, or just want to look around the map!
- Added a NEW! flair to recently obtained units, which goes away after interacting with them, in order to make newly obtained units easier to spot
- Added NEW! text next to world markers in areas where new major content has been added
- Slightly revamped the Unit Traits Index UI, to make it easier to tell what each trait does at a glance
- Unit circle indicators now change to purple when a unit is maxed
- The opening animation for windows is now smoother
- Added a Sasori viewmodel to the Pass icon
- Added background frames to all stat previews when hovering over units, and more!

Bug Fixes
- Fixed Valentine (AU) clones costing Yen to place
- Fixed Rogita 4 (Super) also taking money for his clone, effectively doubling his placement cost
- Emmie & Emmie (Ice Witch) no longer freeze friendly summons
- Fixed Sosora missing a cosmetic on both the Shiny and Non-Shiny variants
- Fixed Padoru familiar buffs stacking
- Fixed being able to buy a mount more than once in the shop
- Fixed being able to apply the same potion effect despite already having one active. It wouldn't stack the effect further, but would still remove all effects after the first one ended
- Fixed having ability auto-use enabled after a match ended spamming you with notifications
- Fixed Soburo's cosmetic idle animation not looping
- Fixed +0% buffs being displayed, e.g. when Haruka Rin buffs a unit after already being maxed
- Fixed the Panda mount being very small in the viewport
- Fixed an issue where mobile players were unable to change inputs in sandbox mode
- ...and many more!

Michael Cripe is a freelance contributor with IGN. He's best known for his work at sites like The Pitch, The Escapist, and OnlySP. Be sure to give him a follow on Bluesky (@mikecripe.bsky.social) and Twitter (@MikeCripe).
  • LEGO Wicked Sets Drop to Their Lowest Ever Prices on Amazon Today
    www.ign.com
    Since its debut on Broadway back in 2003, Wicked has stolen the hearts of adults and kids alike. With the release of the Wicked movie, the fandom has only grown, to the point of becoming big enough that Wicked now has its own line of merchandise. Just a month before the film hit theaters in November, LEGO released a whole line of Wicked sets you can build. As of right now, some of these are discounted on Amazon.

Wicked LEGO sets on sale at Amazon (lowest-ever prices): LEGO Wicked Elphaba & Glinda Figures, and LEGO Wicked Welcome to Emerald City.

Although four LEGO Wicked sets were released in October 2024, only two of them are currently on sale on Amazon. The first is the Elphaba and Glinda figures set, which features 558 pieces and is meant for ages 10 and up. This is the first time this set has been discounted on Amazon, and the price has dropped by 15% today. The second set receiving a price reduction is the much larger Welcome to Emerald City build. This particular set also includes Glinda and Elphaba miniatures as well as The Wizard, Madame Morrible, and Fiyero. It's a 945-piece build, and the city sits at just over 15 inches high.

More Wicked merch to check out: Wicked (4K and Blu-ray), Wicked Collector's Edition, Monopoly Wicked Edition, and the Funko Pop Elphaba ($25.18 at Amazon).

Are There Any Upcoming LEGO Wicked Sets? We don't know yet whether there will be more LEGO Wicked sets on the horizon, but it seems quite likely. The first four sets released last year came out in time for the new movie, and we already know that the next Wicked movie will be released on November 21, 2025. If new sets follow a similar timeline, the potential release of new builds would land sometime in October 2025. Some fans are already speculating about what new sets could be released with the new film, with some sources mentioning a potential 18+ set targeted at adult collectors.

Looking for more LEGO deals? Amazon is also currently running a sale on LEGO flower sets ahead of Valentine's Day.
  • Lore of Eora: The Gods | A Pillars of Eternity and Avowed Visual Guide
    www.youtube.com
    The post Lore of Eora: The Gods | A Pillars of Eternity and Avowed Visual Guide appeared first on Xbox Wire.
  • This could explain why Apple isn't making AR glasses yet
    9to5mac.com
    Last week, Mark Gurman reported that Apple had canceled development of its AR glasses project following some unimpressive demos for executives. With Vision Pro struggling to build its own sales momentum, on the surface AR glasses seemed like the best path forward for Vision products. But here's why that may not be the case, at least not yet.

Vision Pro pointed to a future that only AR glasses could fulfill

When Vision Pro launched last year, reviewers had two common critiques. They said:
- it needed to be lighter, with glasses the ideal form factor
- and it was too expensive

Other than those two issues, the general consensus was that Apple had accomplished some very impressive feats with visionOS and the Vision Pro hardware. To many, the device felt like the future, but with some present constraints holding it back. Apple's AR glasses project seemed like it would be the answer. But now it's been canceled, leaving many of us wondering: why?

Two potential reasons Apple Glasses aren't coming anytime soon

When considering the prospect of Apple Glasses, there are two important factors that will prevent any such product from arriving in the near term: physical limitations, and Apple's other wearables.

The first point is pretty simple. Apple already struggles to make its current AR product, Vision Pro, lightweight enough for comfortable use. AR glasses may be the dream, but are they realistic? Not anytime soon. Once Apple can ship a Vision Pro that weighs significantly less, and doesn't inspire medical expert-designed accessories to make it more comfortable, then and only then can AR glasses truly be on the table.

But some might say: if Apple can't ship proper AR glasses, why not something like a Meta Ray-Ban competitor? I've heard lots of good things about Meta's smart glasses, and similar offerings from competitors. The thing they all have in common, though? Their functionality is largely limited to what Apple's other wearables already offer, or could offer soon.

AirPods and Apple Watch are already massive hits, and together they can do so much of what current smart glasses offer. Siri and ChatGPT are always available with AirPods in your ears; notifications can be announced by Siri or read on your Apple Watch; playback controls, translation, and more are on your wrist; and there are even more capabilities coming. Multiple reports have indicated that Apple is working on adding cameras to future AirPods models, a feature that takes care of one of the only remaining advantages smart glasses have. With existing wearables offering such functionality, there's little reason for Apple to create smart glasses until it can include true AR support.

Apple Glasses: wrap-up

Is a day coming when we all walk around wearing AR glasses? Maybe. But first, tech needs to advance enough to create a compelling product that spurs such a cultural shift. And by all indications, we're years away from such change. A less capable product, like smart glasses, could grow widespread much sooner. But if AirPods and Apple Watch can offer similar functionality, won't most of us opt for those accessories and avoid wearing something on the face?

There's a lot of potential in the AR/VR/headset/glasses space. But for now, Apple punting its AR glasses project makes sense. Ongoing improvements to Vision Pro, AirPods, and Apple Watch will help the company get to the glasses dream eventually; it's just going to take some time.

FTC: We use income-earning auto affiliate links.
  • Opera Air is a new web browser with built-in mindfulness features
    9to5mac.com
    You may be familiar with Opera, a well-known cross-platform web browser. The company is now introducing Opera Air, a new version of the web browser with multiple mindfulness features aimed at helping users relax while they browse the web.

Opera Air offers meditation sessions and break reminders

According to the company, Opera Air is the first web browser built around the concept of mindfulness. While the app lets users access any website just like other web browsers, Opera Air has a minimalist interface and features to help users take care of their mental health. This includes meditation sessions, break reminders, and calming background sounds.

"Web browsers have become powerful super apps that allow you to do anything, from working, studying, entertaining yourself, shopping, or even running other apps. [...] Ultimately, with Opera Air, we want to help you feel better and become more mindful about your environment," the company said in a blog post.

It all starts with a super-clean interface reminiscent of frozen glass. The interface can be customized with different backgrounds and layouts, and users can find the wellness features in the floating toolbar on the left. For instance, Opera Air includes some breathing exercises that, in the company's words, help reduce stress and blood pressure. There are also neck exercises to relieve tension and reduce pain, and Opera recommends that users do at least one session every day. Users will also find meditation sessions and full-body scans for those looking for a longer break.

Another feature is Boosts, a library of sounds designed to stimulate brain waves to boost creativity and relieve stress. Opera explains that Boosts are not simple background sounds, as users can individually adjust the music level, ambient sound, and frequency of the binaural beats. Of course, each user has their own way of meditating and relaxing, so these features may not work for you, but Opera seems to really believe there's demand for a web browser with such features.

Opera Air also includes popular features from the regular version of the web browser, such as Aria AI and a free VPN. You can try it for free, and it's now available to download for both macOS and Windows.