• New Morphing Meerkat Phishing Kit Mimics 114 Brands Using Victims' DNS Email Records
    thehackernews.com
    Mar 27, 2025 | Ravie Lakshmanan | Email Security / Malware
    Cybersecurity researchers have shed light on a new phishing-as-a-service (PhaaS) platform that leverages Domain Name System (DNS) mail exchange (MX) records to serve fake login pages that impersonate about 114 brands. DNS intelligence firm Infoblox is tracking the actor behind the PhaaS, the phishing kit, and the related activity under the moniker Morphing Meerkat. "The threat actor behind the campaigns often exploits open redirects on adtech infrastructure, compromises domains for phishing distribution, and distributes stolen credentials through several mechanisms, including Telegram," the company said in a report shared with The Hacker News. One such campaign leveraging the PhaaS toolkit was documented by Forcepoint in July 2024, where phishing emails contained links to a purported shared document that, when clicked, directed the recipient to a fake login page hosted on Cloudflare R2 with the end goal of collecting and exfiltrating the credentials via Telegram. Morphing Meerkat is estimated to have delivered thousands of spam emails, with the phishing messages using compromised WordPress websites and open redirect vulnerabilities on advertising platforms like Google-owned DoubleClick to bypass security filters. It's also capable of translating phishing content dynamically into over a dozen different languages, including English, Korean, Spanish, Russian, German, Chinese, and Japanese, to target users across the world. In addition to complicating code readability via obfuscation and inflation, the phishing landing pages incorporate anti-analysis measures that block mouse right-clicks as well as the keyboard hotkey combinations Ctrl + S (save the web page as HTML) and Ctrl + U (open the web page source code). But what makes the threat actor truly stand out is its use of DNS MX records obtained from Cloudflare or Google to identify the victim's email service provider (e.g., Gmail, Microsoft Outlook, or
Yahoo!) and dynamically serve fake login pages. In the event that the phishing kit is unable to recognize the MX record, it defaults to a Roundcube login page. "This attack method is advantageous to bad actors because it enables them to carry out targeted attacks on victims by displaying web content strongly related to their email service provider," Infoblox said. "The overall phishing experience feels natural because the design of the landing page is consistent with the spam email's message. This technique helps the actor trick the victim into submitting their email credentials via the phishing web form."
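To make the MX-based fingerprinting described above concrete for defenders, here is a minimal Python sketch of how a mail domain's MX records can be mapped to an email provider. It assumes the lookup goes through the public DNS-over-HTTPS JSON endpoints that Google and Cloudflare offer (the article does not say which interface the kit uses), and the suffix table, function names, and "unrecognized" fallback label are all hypothetical choices made for illustration, not details taken from the Infoblox report.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical suffix-to-provider table, for illustration only.
PROVIDER_BY_MX_SUFFIX = {
    "google.com": "Gmail",
    "googlemail.com": "Gmail",
    "outlook.com": "Microsoft Outlook",
    "yahoodns.net": "Yahoo!",
}

MX_RECORD_TYPE = 15  # numeric DNS record type for MX


def mx_hosts(doh_response: dict) -> list[str]:
    """Extract MX hostnames from a DoH JSON response, lowest preference first."""
    records = []
    for answer in doh_response.get("Answer", []):
        if answer.get("type") != MX_RECORD_TYPE:
            continue
        # MX data looks like "10 mail.example.com."
        preference, host = answer["data"].split()
        records.append((int(preference), host.rstrip(".").lower()))
    return [host for _, host in sorted(records)]


def classify_provider(hosts: list[str], default: str = "unrecognized") -> str:
    """Return the first provider whose suffix matches an MX host."""
    for host in hosts:
        for suffix, provider in PROVIDER_BY_MX_SUFFIX.items():
            if host == suffix or host.endswith("." + suffix):
                return provider
    return default


def lookup_mx(domain: str, endpoint: str = "https://dns.google/resolve") -> dict:
    """Fetch MX records as JSON (network call; not exercised in the example)."""
    query = urlencode({"name": domain, "type": "MX"})
    request = Request(f"{endpoint}?{query}",
                      headers={"accept": "application/dns-json"})
    with urlopen(request, timeout=5) as response:
        return json.load(response)


# Offline example using a hand-written response in the DoH JSON shape.
sample = {"Answer": [
    {"name": "example.com.", "type": 15, "data": "10 alt1.aspmx.l.google.com."},
    {"name": "example.com.", "type": 15, "data": "5 aspmx.l.google.com."},
]}
provider = classify_provider(mx_hosts(sample))
```

A security team could run the same classification over its own mail domains to anticipate which branded lure a recipient would likely be shown; a domain that matches nothing in the table corresponds to the fallback case the article describes.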
  • What FedRAMP Automation Means for CIOs at Government Contractors
    www.informationweek.com
    Carrie Pallardy, Contributing Reporter | March 27, 2025
    The US General Services Administration (GSA) announced plans for an overhaul of the Federal Risk and Authorization Management Program (FedRAMP). The new approach, dubbed FedRAMP 20x, will lean into automation to make authorization simpler, easier, and cheaper while continuously improving security, according to the GSA press release. InformationWeek spoke to four leaders in the private sector about the anticipated changes to FedRAMP, the potential impact, and how CIOs at government contractors can prepare.
    The Changes
    FedRAMP was first established in 2011, about midway through Jonathan Alboum's 11-year government career. He held multiple senior IT positions within the government, including CIO of the United States Department of Agriculture (USDA), before making the switch to the private sector in 2018, giving him exposure to FedRAMP as both buyer and service provider. "Since the inception of the program, GSA has been trying to continue to make it better. I really see these changes as a continuation of those overarching efforts," Alboum, currently the Federal CTO at ServiceNow, tells InformationWeek. ServiceNow provides an AI platform, and it has 100 authority to operate (ATO) letters on file with FedRAMP. FedRAMP 20x has five main goals. The first focuses on automating the validation of FedRAMP security requirements. Under this new framework, more than 80% of requirements could transition to automated validation. The second goal aims to reduce documentation requirements if companies pursuing FedRAMP authorization can demonstrate their existing best practices and security policies. Continuous monitoring is also one of the primary objectives of FedRAMP 20x. The updated model is promising a simple, hands-off approach that leverages secure-by-design principles and automated enforcement. Through FedRAMP, GSA has played a role between contractors and government agencies.
FedRAMP 20x's fourth goal emphasizes more direct relationships. "A major objective is to reduce third-party involvement of the FedRAMP team in favor of more direct agency-provider interactions," Shrav Mehta, CEO of Secureframe, an automated compliance platform, explains in an email interview. Secureframe intends to pursue authorization under the new FedRAMP model. The final goal centers on innovation. Under FedRAMP 20x, companies will undergo automated checks and be able to make changes without additional oversight, provided they follow an approved process for doing so. As is often the case, more automation comes with the possibility of fewer staff. Federal News Network reports that FedRAMP's program management will be staffed by a few federal employees.
    The Potential Impact
    While the FedRAMP authorization process could look quite different with more automation, the underlying intent remains the same. "You're always going to have a set of guardrails, a set of compliance rules that everybody's going to have to play by," says Kevin Orr, federal president for RSA, an identity security solutions company. RSA ID Plus for Government is FedRAMP authorized, and Orr has coached a number of companies through the process. He has seen firsthand how long it can take. "It's anywhere from 18 to 24 months," he shares. "I've been through this four times." Increased automation that cuts down on the amount of paperwork, time, and labor involved in achieving FedRAMP authorization could result in a less expensive endeavor. Today, there are nearly 400 FedRAMP authorized services, according to the FedRAMP marketplace. If the process becomes more efficient and less expensive, more companies might be interested in pursuing authorization. "The byproduct of that could be greater competition. [It] could be greater availability of capabilities that just don't exist today in the government sphere," says Alboum. Continuous monitoring could offer advantages over a manual audit-based approach.
"We develop software and capabilities in a continuous manner. We're constantly improving them. So, a continuous authorization management approach is really much more appropriate," says Alboum. The hope is that continuous monitoring will lead to a more robust cybersecurity posture across the cloud-based tools in use within government agencies. There is optimism among companies that have achieved FedRAMP certification in the past. Sumo Logic, a cloud-native, machine data analytics platform, achieved FedRAMP Ready designation in 2019 and FedRAMP Moderate authorization in 2021. "We need to maintain rigor in how we're evaluating technology to ensure that it's a secure solution for government agencies. But ultimately we're very welcoming of efficiencies gained throughout the process," Seth Williams, the company's field CTO, tells InformationWeek.
    What Comes Next?
    The promise of a less burdensome FedRAMP authorization process is exciting for government contractors, but there are still unknowns. "We're a little bit in the wait and see [mode] because the devil's in the details. Exactly how are we going to do continuous monitoring?" Orr asks. "I don't think anybody really wants the government inside your network telling you what you do. But at the same time, we all stand up and sign up for a security pledge to make the nation a [safer] place. So, somewhere in between is probably the truth, and we'll see what comes out of it." It also remains to be seen how automation is applied and how it works in practice. What will the impact of reduced FedRAMP staffing be? What will more direct relationships between government agencies and contractors look like? The future of FedRAMP is likely going to be shaped with input from industry stakeholders.
FedRAMP working groups will gather input from industry, ensure equal access to information, encourage pilot programs, and provide technical guidance before formal public comment and release, according to the GSA press release. GSA notes that low-impact service offerings will not require agency sponsorship under FedRAMP 20x, but relationship building will still be important as FedRAMP evolves. Some of that connection will be formed within those working groups. And contractors who want to work with government agencies will need to demonstrate the value of their service offerings. "It's one thing to say, 'I want to work with the government,' or 'I have the capability to work with government.' Well, how does it provide value to a government agency?" says Alboum. "Relationships are still going to be very important, especially as we go through this period of significant change." How can government contractors, and companies eager to secure government customers for the first time, prepare? "For government contractors, success will depend on their ability to provide immediate, comprehensive security insights and adapt to more dynamic compliance expectations," says Mehta.
    About the Author: Carrie Pallardy is a freelance writer and editor living in Chicago. She writes and edits in a variety of industries including cybersecurity, healthcare, and personal finance.
  • One Battle After Another Trailer: DiCaprio and PTA Team Up
    screencrush.com
    Leonardo DiCaprio and Paul Thomas Anderson's mysterious film project has been revealed. The first trailer for Anderson's One Battle After Another is here, revealing that the movie is a dark comedy with what looks to be a fair amount of action, including some intense car stunts. While Warner Bros. has said very little about the film's plot, various reports online claim it is inspired by the Thomas Pynchon novel Vineland. (Anderson previously adapted Pynchon's 2009 novel Inherent Vice into a movie starring Joaquin Phoenix in 2014.) The full trailer for One Battle After Another also includes several appearances by Benicio del Toro as DiCaprio's sensei.
    As Anderson and DiCaprio have worked on the project, there have been various reports online about the size of the movie's budget, some claiming it is upwards of $100 million, or maybe even $140 million. Even on the low end, that would make One Battle After Another Anderson's largest movie to date in terms of scale. (The epic There Will Be Blood only cost a reported $25 million to make; PTA's latest feature, Licorice Pizza, supposedly cost $40 million while grossing less than $35 million in theaters.)
    Here is the film's official synopsis:
    From Warner Bros. Pictures and Academy Award-nominated, BAFTA-winning filmmaker Paul Thomas Anderson comes One Battle After Another, starring Academy Award and BAFTA winner Leonardo DiCaprio. Oscar and BAFTA winners Benicio del Toro and Sean Penn also star alongside Regina Hall, Teyana Taylor and Chase Infiniti, as well as Wood Harris and Alana Haim. Anderson directs from his own screenplay, and produces alongside Oscar and BAFTA nominees Adam Somner and Sara Murphy, with Will Weiske executive producing.
    One Battle After Another is scheduled to open in theaters (as well as IMAX) on September 26.
  • OnTheGoSystems: Full-Stack RoR Developer
    weworkremotely.com
    We are looking for expert RoR and React developers who are quick learners and passionate about their work.
    Why Join Us?
    - Stable and self-funded: Founded in 2008, OTGS is proudly profitable and not dependent on external investments.
    - Fully remote global team: Collaborate with talented colleagues worldwide.
    - Industry leaders: Creators of WPML, the #1 multilingual plugin for WordPress, powering over 1.5 million sites.
    - Innovative projects: Currently building PTC (Private Translation Cloud), our next-generation SaaS platform for seamless software localization.
    - Professional growth: Structured mentorship programs for new team members.
    - Excellent benefits: Designed to support your well-being both inside and outside of work.
    - Positive culture: Genuinely kind, collaborative, and professional work environment.
    - Career advancement: Real opportunities for internal promotion and professional development.
    Indicators You're a Great Fit
    - You have strong experience with Ruby on Rails and at least one modern JavaScript framework.
    - You're passionate about writing clean, maintainable, and high-quality code.
    - You have proven experience building and scaling robust systems.
    - You thrive on solving complex challenges independently within a collaborative team environment.
    What You'll Do
    As a valued developer on our team, you'll participate in:
    - Feature planning and implementation
    - Maintenance and optimization
    - Test automation
    Learn more about our PTC project, its current status, technology stack, and upcoming milestones.
    Ready to make a meaningful impact at a company that values innovation and teamwork? Apply today and join our global team!
  • Greyworks: Entrepreneurial Partner / Chief of Staff
    weworkremotely.com
    About the Role
    Working directly with a technical executive (usually) based in New York, NY, you'll handle complex analytical projects, building models, and conducting in-depth research that requires independent thinking and problem-solving. From there, you will drive and support the execution of those research projects. This is not a traditional role but rather a second brain and another set of hands for an executive looking for a sharp, analytical partner. In fact, he has struggled with what to call this role. Claude.ai suggested "Chief of Staff Lite", but that doesn't feel right, either. You have the opportunity to name this role something better.
    Core Responsibilities
    - Partner with the technical executive to discover and execute against new business opportunities
    - Conduct thorough research projects across various business domains
    - Build and maintain financial and operational models
    - Analyze data and synthesize findings into actionable insights
    - Manage special projects requiring critical thinking and independence
    - Support business development initiatives with market research
    - Occasional administrative support
    Ideal Candidate Profile
    - Strong analytical capabilities and comfort with data analysis
    - Experience building financial or operational models
    - Excellent research skills and ability to distill complex information
    - Independent problem-solver who can work with minimal supervision
    - Very strong written and verbal communication skills in English (regional accents welcomed!)
    - Internet-native with a high proficiency using myriad software products
    Work Arrangement
    - Fully remote position open to candidates worldwide
    - Flexible hours with some overlap with Eastern Time (UTC-4)
    - Ideally full-time, but open to part-time arrangements with potential to grow
    - Competitive compensation based on experience, location, and skills
    Why This Role Is Different
    This position offers the chance to work directly with a technical and entrepreneurial executive on substantive projects.
You'll be valued for your analytical capabilities and independent thinking rather than fitting into a job. The right candidate will have significant growth opportunities as the partnership develops. The executive is looking to grow his business interests both inside and outside of the US, and is looking for a diverse set of partners to work with.
    Potential business areas to explore include (but are not limited to):
    - Legacy business open to technology investment and/or modernization (both newco and acquisition)
    - Real estate (commercial, residential, industrial, mixed-use)
    - Industry- and regional-specific vertical SaaS (both newco and acquisition)
    - Video games and electronic entertainment products
    - Content, news, and media
    Application Process
    Complete the job application to the best of your ability. The entrepreneur does not anticipate high candidate volume as he is prioritizing quality over quantity, and expects to hire more than one individual in this role on a rolling basis. All well-written applications will be read and responded to. Following the written application, the next step is a 1-on-1 video interview with the executive to determine mutual fit. (No recruiters or agencies, please.)
  • What is Signal? The messaging app, explained.
    www.technologyreview.com
    MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what's coming next.
    With the recent news that the Atlantic's editor in chief was accidentally added to a group Signal chat for American leaders planning a bombing in Yemen, many people are wondering: What is Signal? Is it secure? If government officials aren't supposed to use it for military planning, does that mean I shouldn't use it either? The answer is: Yes, you should use Signal, but government officials having top-secret conversations shouldn't use Signal. Read on to find out why.
    What is Signal?
    Signal is an app you can install on your iPhone or Android phone, or on your computer. It lets you send secure texts, images, and phone or video chats with other people or groups of people, just like iMessage, Google Messages, WhatsApp, and other chat apps. Installing Signal is a two-minute process; again, it's designed to work just like other popular texting apps.
    Why is it a problem for government officials to use Signal?
    Signal is very secure; as we'll see below, it's the best option out there for having private conversations with your friends on your cell phone. But you shouldn't use it if you have a legal obligation to preserve your messages, such as while doing government business, because Signal prioritizes privacy over ability to preserve data. It's designed to securely delete data when you're done with it, not to keep it. This makes it uniquely unsuited for following public record laws. You also shouldn't use it if your phone might be a target of sophisticated hackers, because Signal can only do its job if the phone it is running on is secure. If your phone has been hacked, then the hacker can read your messages regardless of what software you are running. This is why you shouldn't use Signal to discuss classified material or military plans.
For military communication your civilian phone is always considered hacked by adversaries, so you should instead use communication equipment that is safer: equipment that is physically guarded and designed to do only one job, making it harder to hack.
    What about everyone else?
    Signal is designed from bottom to top as a very private space for conversation. Cryptographers are very sure that as long as your phone is otherwise secure, no one can read your messages. Why should you want that? Because private spaces for conversation are very important. In the US, the First Amendment recognizes, in the right to freedom of assembly, that we all need private conversations among our own selected groups in order to function. And you don't need the First Amendment to tell you that. You know, just like everyone else, that you can have important conversations in your living room, bedroom, church coffee hour, or meeting hall that you could never have on a public stage. Signal gives us the digital equivalent of that: it's a space where we can talk, among groups of our choice, about the private things that matter to us, free of corporate or government surveillance. Our mental health and social functioning require that. So if you're not legally required to record your conversations, and not planning secret military operations, go ahead and use Signal. You deserve the privacy.
    How do we know Signal is secure?
    People often give up on finding digital privacy and end up censoring themselves out of caution. So are there really private ways to talk on our phones, or should we just assume that everything is being read anyway? The good news is: For most of us who aren't individually targeted by hackers, we really can still have private conversations. Signal is designed to ensure that if you know your phone and the phones of other people in your group haven't been hacked (more on that later), you don't have to trust anything else.
It uses many techniques from the cryptography community to make that possible. Most important and well-known is "end-to-end encryption," which means that messages can be read only on the devices involved in the conversation and not by servers passing the messages back and forth. But Signal uses other techniques to keep your messages private and safe as well. For example, it goes to great lengths to make it hard for the Signal server itself to know who else you are talking to (a feature known as "sealed sender"), or for an attacker who records traffic between phones to later decrypt the traffic by seizing one of the phones ("perfect forward secrecy"). These are only a few of many security properties built into the protocol, which is well enough designed and vetted for other messaging apps, such as WhatsApp and Google Messages, to use the same one. Signal is also designed so we don't have to trust the people who make it. The source code for the app is available online and, because of its popularity as a security tool, is frequently audited by experts. And even though its security does not rely on our trust in the publisher, it does come from a respected source: the Signal Technology Foundation, a nonprofit whose mission is to protect free expression and enable secure global communication through open-source privacy technology. The app itself, and the foundation, grew out of a community of prominent privacy advocates. The foundation was started by Moxie Marlinspike, a cryptographer and longtime advocate of secure private communication, and Brian Acton, a cofounder of WhatsApp.
    Why do people use Signal over other text apps? Are other ones secure?
    Many apps offer end-to-end encryption, and it's not a bad idea to use them for a measure of privacy. But Signal is a gold standard for private communication because it is secure by default: Unless you add someone you didn't mean to, it's very hard for a chat to accidentally become less secure than you intended.
That's not necessarily the case for other apps. For example, iMessage conversations are sometimes end-to-end encrypted, but only if your chat has "blue bubbles," and they aren't encrypted in iCloud backups by default. Google Messages are sometimes end-to-end encrypted, but only if the chat shows a lock icon. WhatsApp is end-to-end encrypted but logs your activity, including "how you interact with others using our Services." Signal is careful not to record who you are talking with, to offer ways to reliably delete messages, and to keep messages secure even in online phone backups. This focus demonstrates the benefits of an app coming from a nonprofit focused on privacy rather than a company that sees security as a "nice to have" feature alongside other goals. (Conversely, and as a warning, using Signal makes it rather easier to accidentally lose messages! Again, it is not a good choice if you are legally required to record your communication.) Applications like WhatsApp, iMessage, and Google Messages do offer end-to-end encryption and can offer much better security than nothing. The worst option of all is regular SMS text messages ("green bubbles" on iOS); those are sent unencrypted and are likely collected by mass government surveillance.
    Wait, how do I know that my phone is secure?
    Signal is an excellent choice for privacy if you know that the phones of everyone you're talking with are secure. But how do you know that? It's easy to give up on a feeling of privacy if you never feel good about trusting your phone anyway. One good place to start for most of us is simply to make sure your phone is up to date. Governments often do have ways of hacking phones, but hacking up-to-date phones is expensive and risky and reserved for high-value targets. For most people, simply having your software up to date will remove you from a category that hackers target. If you're a potential target of sophisticated hacking, then don't stop there.
You'll need extra security measures, and guides from the Freedom of the Press Foundation and the Electronic Frontier Foundation are a good place to start. But you don't have to be a high-value target to value privacy. The rest of us can do our part to re-create that private living room, bedroom, church, or meeting hall simply by using an up-to-date phone with an app that respects our privacy.
    Jack Cushman is a fellow of the Berkman Klein Center for Internet and Society and directs the Library Innovation Lab at Harvard Law School Library. He is an appellate lawyer, computer programmer, and former board member of the ACLU of Massachusetts.
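The forward-secrecy property mentioned in the article (an attacker who seizes a phone later should not be able to decrypt previously recorded traffic) can be illustrated with a small Python sketch of a one-way key chain. This is a simplified teaching example, not Signal's implementation: the real protocol also mixes in fresh key-agreement material, and the starting secret, constants, and function name here are assumptions made for illustration.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive (message_key, next_chain_key) from the current chain key.

    HMAC-SHA256 is one-way: knowing next_chain_key does not reveal
    chain_key or the message key that was derived from it.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


# Hypothetical starting secret; in practice it would come from a key exchange.
chain_key = hashlib.sha256(b"hypothetical shared secret").digest()

message_keys = []
for _ in range(3):
    # Each message gets its own key, and the old chain key is overwritten.
    message_key, chain_key = ratchet_step(chain_key)
    message_keys.append(message_key)
```

After the loop, the device holds only the latest `chain_key`. Because each step runs through a one-way function and earlier values are discarded, someone who obtains the current state has no practical way to work backwards to the keys that protected older messages, which is the intuition behind the property.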
  • Anthropic can now track the bizarre inner workings of a large language model
    www.technologyreview.com
    The AI firm Anthropic has developed a way to peer inside a large language model and watch what it does as it comes up with a response, revealing key new insights into how the technology works. The takeaway: LLMs are even stranger than we thought. The Anthropic team was surprised by some of the counterintuitive workarounds that large language models appear to use to complete sentences, solve simple math problems, suppress hallucinations, and more, says Joshua Batson, a research scientist at the company. It's no secret that large language models work in mysterious ways. Few (if any) mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science. But it's not just about curiosity. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can't do. And it would show how trustworthy (or not) they really are. Batson and his colleagues describe their new work in two reports published today. The first presents Anthropic's use of a technique called circuit tracing, which lets researchers track the decision-making processes inside a large language model step by step. Anthropic used circuit tracing to watch its LLM Claude 3.5 Haiku carry out various tasks. The second (titled "On the Biology of a Large Language Model") details what the team discovered when it looked at 10 tasks in particular. "I think this is really cool work," says Jack Merullo, who studies large language models at Brown University in Providence, Rhode Island, and was not involved in the research. "It's a really nice step forward in terms of methods." Circuit tracing is not itself new. Last year Merullo and his colleagues analyzed a specific circuit in a version of OpenAI's GPT-2, an older large language model that OpenAI released in 2019.
But Anthropic has now analyzed a number of different circuits as a far larger and far more complex model carries out multiple tasks. "Anthropic is very capable at applying scale to a problem," says Merullo. Eden Biran, who studies large language models at Tel Aviv University, agrees. "Finding circuits in a large state-of-the-art model such as Claude is a nontrivial engineering feat," he says. "And it shows that circuits scale up and might be a good way forward for interpreting language models." Circuits chain together different parts (or components) of a model. Last year, Anthropic identified certain components inside Claude that correspond to real-world concepts. Some were specific, such as "Michael Jordan" or "greenness"; others were more vague, such as "conflict between individuals." One component appeared to represent the Golden Gate Bridge. Anthropic researchers found that if they turned up the dial on this component, Claude could be made to self-identify not as a large language model but as the physical bridge itself. The latest work builds on that research and the work of others, including Google DeepMind, to reveal some of the connections between individual components. Chains of components are the pathways between the words put into Claude and the words that come out. "It's tip-of-the-iceberg stuff. Maybe we're looking at a few percent of what's going on," says Batson. "But that's already enough to see incredible structure."
    Growing LLMs
    Researchers at Anthropic and elsewhere are studying large language models as if they were natural phenomena rather than human-built software. That's because the models are trained, not programmed. "They almost grow organically," says Batson. "They start out totally random. Then you train them on all this data and they go from producing gibberish to being able to speak different languages and write software and fold proteins."
"There are insane things that these models learn to do, but we don't know how that happened because we didn't go in there and set the knobs." Sure, it's all math. But it's not math that we can follow. "Open up a large language model and all you will see is billions of numbers: the parameters," says Batson. "It's not illuminating." Anthropic says it was inspired by brain-scan techniques used in neuroscience to build what the firm describes as a kind of microscope that can be pointed at different parts of a model while it runs. The technique highlights components that are active at different times. Researchers can then zoom in on different components and record when they are and are not active. Take the component that corresponds to the Golden Gate Bridge. It turns on when Claude is shown text that names or describes the bridge or even text related to the bridge, such as "San Francisco" or "Alcatraz." It's off otherwise. Yet another component might correspond to the idea of "smallness": "We look through tens of millions of texts and see it's on for the word 'small,' it's on for the word 'tiny,' it's on for the word 'petite,' it's on for words related to smallness, things that are itty-bitty, like thimbles, you know, just small stuff," says Batson. Having identified individual components, Anthropic then follows the trail inside the model as different components get chained together. The researchers start at the end, with the component or components that led to the final response Claude gives to a query. Batson and his team then trace that chain backwards.
    Odd behavior
    So: What did they find? Anthropic looked at 10 different behaviors in Claude. One involved the use of different languages. Does Claude have a part that speaks French and another part that speaks Chinese, and so on? The team found that Claude used components independent of any language to answer a question or solve a problem and then picked a specific language when it replied. Ask it "What is the opposite of small?"
in English, French, and Chinese and Claude will first use the language-neutral components related to smallness and opposites, and only then pick a language for its reply. Anthropic also looked at how Claude solved simple math problems. The team found that the model seems to have developed its own internal strategies that are unlike those it will have seen in its training data. Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95. And yet if you then ask Claude how it worked that out, it will say something like: "I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95." In other words, it gives you a common approach found everywhere online rather than what it actually did. Yep! LLMs are weird. (And not to be trusted.)
    [Figure: The steps that Claude 3.5 Haiku used to solve a simple math problem were not what Anthropic expected; they're not the steps Claude claimed it took either. Credit: Anthropic]
    This is clear evidence that large language models will give reasons for what they do that do not necessarily reflect what they actually did. But this is true for people too, says Batson: "You ask somebody, 'Why did you do that?' And they're like, 'Um, I guess it's because I was...' You know, maybe not. Maybe they were just hungry and that's why they did it." Biran thinks this finding is especially interesting. Many researchers study the behavior of large language models by asking them to explain their actions. But that might be a risky approach, he says: "As models continue getting stronger, they must be equipped with better guardrails. I believe, and this work also shows, that relying only on model outputs is not enough."
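The arithmetic in the 36 + 59 example can be checked with a toy Python sketch that shows why a rough magnitude estimate and an exact last digit are together enough to pin down the answer. This is only an illustration of that arithmetic: the function names and the width of the "ish" window are assumptions, and the code makes no claim about how Claude's internal components actually compute anything.

```python
def ones_digit_path(a: int, b: int) -> int:
    """Exact last digit of a + b, computed from the operands' last digits only."""
    return (a % 10 + b % 10) % 10


def combine(rough_sum: int, ones_digit: int) -> int:
    """Pick the value near rough_sum whose last digit is ones_digit.

    Any ten consecutive integers contain exactly one number with a given
    last digit, so a rough estimate that is off by a few units still
    determines a single candidate.
    """
    for candidate in range(rough_sum - 4, rough_sum + 6):
        if candidate % 10 == ones_digit:
            return candidate
    raise ValueError("ones_digit must be between 0 and 9")


last_digit = ones_digit_path(36, 59)  # 6 + 9 = 15, so the answer ends in 5
answer = combine(92, last_digit)      # the "92ish" estimate from the article
```

With the article's values, the window around 92 runs from 88 to 97, and the only number in it ending in 5 is 95, matching the result described in the text.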
A third task that Anthropic studied was writing poems. The researchers wanted to know if the model really did just wing it, predicting one word at a time. Instead they found that Claude somehow looked ahead, picking the word at the end of the next line several words in advance. For example, when Claude was given the prompt "A rhyming couplet: He saw a carrot and had to grab it," the model responded, "His hunger was like a starving rabbit." But using their microscope, they saw that Claude had already hit upon the word "rabbit" when it was processing "grab it." It then seemed to write the next line with that ending already in place. This might sound like a tiny detail. But it goes against the common assumption that large language models always work by picking one word at a time in sequence. "The planning thing in poems blew me away," says Batson. "Instead of at the very last minute trying to make the rhyme make sense, it knows where it's going." "I thought that was cool," says Merullo. "One of the joys of working in the field is moments like that. There's been maybe small bits of evidence pointing toward the ability of models to plan ahead, but it's been a big open question to what extent they do." Anthropic then confirmed its observation by turning off the placeholder component for "rabbitness." Claude responded with "His hunger was a powerful habit." And when the team replaced "rabbitness" with "greenness," [...] Anthropic also explored why Claude sometimes made stuff up, a phenomenon known as hallucination. "Hallucination is the most natural thing in the world for these models, given how they're just trained to give possible completions," says Batson. "The real question is, 'How in God's name could you ever make it not do that?'" The latest generation of large language models, like Claude 3.5 and Gemini and GPT-4o, hallucinate far less than previous versions, thanks to extensive post-training (the steps that take an LLM trained on the internet and turn it into a usable chatbot).
But Batson's team was surprised to find that this post-training seems to have made Claude refuse to speculate as a default behavior. When it did respond with false information, it was because some other component had overridden the "don't speculate" component. This seemed to happen most often when the speculation involved a celebrity or other well-known entity. It's as if the amount of information available pushed the speculation through, despite the default setting. When Anthropic overrode the "don't speculate" component to test this, Claude produced lots of false statements about individuals, including claiming that Batson was famous for inventing the Batson principle (he isn't).
Still unclear
Because we know so little about large language models, any new insight is a big step forward. "A deep understanding of how these models work under the hood would allow us to design and train models that are much better and stronger," says Biran. But Batson notes there are still serious limitations. "It's a misconception that we've found all the components of the model or, like, a God's-eye view," he says. "Some things are in focus, but other things are still unclear, a distortion of the microscope." And it takes several hours for a human researcher to trace the responses to even very short prompts. What's more, these models can do a remarkable number of different things, and Anthropic has so far looked at only 10 of them. Batson also says there are big questions that this approach won't answer. Circuit tracing can be used to peer at the structures inside a large language model, but it won't tell you how or why those structures formed during training. "That's a profound question that we don't address at all in this work," he says. But Batson sees this as the start of a new era in which it is possible, at last, to find real evidence for how these models work: "We don't have to be, like: 'Are they thinking? Are they reasoning? Are they dreaming? Are they memorizing?' Those are all analogies. But if we can literally see step by step what a model is doing, maybe now we don't need analogies."
  • The Bulgarian Pavilion will explore the paradox of artificial intelligence and human intervention
    worldarchitecture.org
    The Bulgarian Pavilion has revealed its theme for the 2025 Venice Architecture Biennale. The exhibition, titled Pseudonature, is an experimental installation that traverses the boundaries between simulation and reality, technology and nature. The idea, which was conceived by architect Iassen Markov, examines sustainability in a world where artificial intelligence and human intervention are increasingly mediating natural processes.
[Image: Pseudonature, 2025, concept image, courtesy of Iassen Markov]
A Climate Paradox: When the Sun Creates Snow
A snow-covered courtyard in the middle of summer in Venice is the central contradiction of Pseudonature. Artificial snowfall is created by solar-powered snow-making machinery, but the panels that power it are buried by the snow. The sun drives the system and erases its own creation, acting as both a creator and a destroyer. The delicate balance of sustainable technology is exposed by this self-regulating cycle: solar strength increases system efficiency, but snow accumulation slows the production of energy, requiring constant adaptation to changing conditions. The project raises important questions, such as whether we have any influence over nature or if we are only surviving on the brink of its ever-shifting forces.
[Image: The curator, Iassen Markov, portrait image, courtesy of Iassen Markov]
Interior: A Reimagined Traditional Space
Within the pavilion, an abstract interpretation of the traditional Bulgarian odaya (living room) invites visitors into a realm of contemplation, dialogue, and collective reflection. The odaya, once the heart of the Bulgarian home, has always functioned as a place for dialogue, engagement, and the sharing of thoughts.
It is now redefined as a nexus where natural, artificial, and collective intelligence converge to forge new prospects for the future. A virtual fireplace, created by AI but lacking warmth, flickers at the center, highlighting the artificial nature of this reconstructed environment. Artist Rosie Eisor's handcrafted carpet intertwines tradition and digital aesthetics, strengthening the conversation between the organic and synthetic. Just as the delicate equilibrium outside is in constant flux among sunshine, snowfall, and energy, the odaya mirrors the similarly complex struggle to achieve balance among various types of intelligence. This tension is depicted on Eisor's carpet by three mythological creatures engaged in combat and connected by a serpent, symbolizing the complex and at times contradictory relationship among human, artificial, and environmental intelligence.
[Image: Iassen Markov, photo by Zlatimir Arakliev, courtesy of Iassen Markov]
Catalog: Radical Recipes for a Better Climate
In addition to the physical installation, Pseudonature showcases Radical Recipes for a Better Climate, which serves as an intellectual cornerstone of the Bulgarian Pavilion. This catalog collects various "recipes" from architects, designers, and scientists: unique approaches to sustainable adaptation. These inputs are combined into a collective recipe produced by artificial intelligence, demonstrating how AI can integrate human insight into a unified vision for resilience. The outcome is a dynamic interaction between individual know-how and machine-generated synthesis, investigating novel methods to find one's way in an unpredictable climate future.
The atmosphere of the project is further amplified by a series of curated image sequences. They combine speculative visuals with documentary fragments to evoke the fragility and potential of our developing relationship with nature and technology.
[Image: Pseudonature, 2025, concept image, courtesy of Iassen Markov]
Beyond Installation: A Platform for Reflection
The installation's design intentionally brings to mind a construction site: bricks, aluminum profiles, and polycarbonate elements highlight the artificial aspect of the climate paradox. This unrefined, do-it-yourself look lays bare the temporary and at times careless manner in which we affect natural systems, emphasizing the conflict between control and unpredictability. Outdoors, physical measures disturb natural equilibria, exposing the delicate relationship between technology and the environment. Within, the area transforms into a zone of reflection, and the task of reestablishing balance turns into a cognitive and communal undertaking. Pseudonature makes one point clear: our capacity to navigate and resolve the contradictions we generate starts from within, just as we shape the external world.
Pseudonature: A Call to Rethink Our Role
According to the organizers, Pseudonature is more than just an architectural experiment; it invites us to rethink our role in shaping the world. They said that as the lines between nature and technology become increasingly blurred, this project exposes the paradoxes we create and the delicate balances we must maintain.
Sustainability now involves not only preservation but also adaptation, reinvention, and a dynamic interaction between human creativity, artificial intelligence, and natural forces.
[Image: Pseudonature, 2025, concept image, courtesy of Iassen Markov]
The 2025 Venice Architecture Biennale will take place from May 10 to November 23, 2025 at the Giardini, the Arsenale and various venues in Venice, Italy. Besides Bulgaria's contribution, other contributions at the Venice Architecture Biennale include the Romanian Pavilion's "Human Scale" exhibition, the Luxembourg Pavilion's "Sonic Investigations" exhibition, the Albanian Pavilion's "Building Architecture Culture" exhibition, the Turkey Pavilion's "Grounded" exhibition, the Pavilion of the United Arab Emirates's "Pressure Cooker" exhibition, and the Finland Pavilion's "The Pavilion - Architecture of Stewardship" exhibition. Find out all exhibition news on WAC's Venice Architecture Biennale page.
Exhibition facts
Exhibition name: Pseudonature
Commissioner: Alexander Staynov
Curator: Iassen Markov
Exhibitors: Technobeton, Rosie Eisor
Venue: Sala Tiziano-Centro Culturale Don Orione Artigianelli, Fondamenta Delle Zattere Ai Gesuati 919
The top image in the article: Pseudonature, 2025, concept image, courtesy of Iassen Markov.
via Bulgarian Pavilion
  • 'Pop the Balloon' Dating Show Is Coming to Netflix in April
    www.cnet.com
    Made famous on TikTok and YouTube, the hit viral series will air live.
  • Prime Members: Save Up to 50% on Alpaka Bags During Amazon's Big Spring Sale
    www.cnet.com
    Alpaka bags are known for their rugged construction and durability. That quality can be had for up to half off if you're an Amazon Prime member.