• WWW.THEVERGE.COM
    YouTube is testing blurred thumbnails for ‘mature content’
    YouTube has announced a new experimental feature that’s “aimed at providing safer search experiences for all users.” The service will begin blurring the thumbnails in search results for some queries “that frequently include sexual themes.” According to a recent update in the Community section of YouTube’s Help Center website, the feature is currently being rolled out to a small percentage of viewers. YouTube has not specified which sexual themes will cause a search to return videos with blurred thumbnails, but it says each video’s title, channel name, and description will remain visible. Viewers will also have the option to unblur thumbnails. The goal of the limited test rollout is to “understand whether this type of feature helps users avoid accidentally viewing content that follows YouTube’s Community Guidelines but may be sensitive in nature.” Unlike SafeSearch, which can both blur and filter out results on Google’s search engine, YouTube’s experimental feature won’t omit results. In its current form, it is instead designed to provide an extra layer of protection that keeps younger or unsuspecting users from immediately being presented with content that may be inappropriate.
  • WWW.MARKTECHPOST.COM
    Diagnosing and Self-Correcting LLM Agent Failures: A Technical Deep Dive into τ-Bench Findings with Atla’s EvalToolbox
    Deploying large language model (LLM)-based agents in production settings often reveals critical reliability issues. Accurately identifying the causes of agent failures and implementing proactive self-correction mechanisms is essential. Recent analysis by Atla on the publicly available τ-Bench benchmark provides granular insights into agent failures, moving beyond traditional aggregate success metrics and highlighting Atla’s EvalToolbox approach. Conventional evaluation practices typically rely on aggregate success rates, which offer little actionable insight into actual performance reliability. Diagnosing issues then requires manual review of extensive logs, an impractical approach as deployments scale. A success rate of, say, 50% reveals nothing about why the remaining interactions failed, which complicates troubleshooting. To address these evaluation gaps, Atla conducted a detailed analysis of τ-Bench, a benchmark specifically designed to examine tool-agent-user interactions. This analysis systematically identified and categorized agent workflow failures within τ-retail, a subset focusing on retail customer service interactions. A detailed evaluation of τ-retail highlighted three key failure categories: (1) Workflow Errors, predominantly characterized by “Wrong Action” scenarios, where agents failed to execute necessary tasks; (2) User Interaction Errors, particularly the provision of “Wrong Information,” which emerged as the most frequent failure type; and (3) Tool Errors, where correct tools were used incorrectly due to erroneous parameters, which constituted another significant failure mode. A critical distinction drawn from this benchmark is the categorization of errors into terminal (irrecoverable) failures and recoverable failures. 
Terminal failures significantly outnumber recoverable errors, illustrating the limitations inherent in agent self-correction without guided intervention. To address these challenges, Atla integrated Selene, an evaluation model embedded directly into agent workflows. Selene actively monitors each interaction step, identifying and correcting errors in real time. Practical demonstrations show marked improvements when employing Selene: agents corrected initial errors promptly, improving overall accuracy and user experience. For example, in scenarios involving “Wrong Information,” agents operating without Selene consistently failed to recover from initial errors, resulting in low user satisfaction, whereas Selene-equipped agents identified and rectified errors, significantly improving user satisfaction and the accuracy of responses. EvalToolbox thus shifts from manual, retrospective error assessment toward automated, immediate detection and correction. It accomplishes this through automated categorization and identification of common failure modes; real-time, actionable feedback upon detecting errors; and dynamic self-correction, achieved by feeding real-time feedback directly into agent workflows. Future enhancements include broader applicability across diverse agent functions such as coding tasks, specialized domain implementations, and the establishment of standardized evaluation-in-the-loop protocols. Integrating evaluation directly within agent workflows through τ-Bench analysis and EvalToolbox represents a practical, automated approach to mitigating reliability issues in LLM-based agents. Note: Thanks to the ATLA AI team for the thought leadership and resources for this article. The ATLA AI team has supported us for this content. Asif Razzaq is the CEO of Marktechpost Media Inc.
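The evaluation-in-the-loop pattern described in the article can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Atla's implementation: `Critique`, `run_with_evaluator`, and the toy agent and evaluator are hypothetical names, and a real deployment would call an LLM agent and an evaluation model such as Selene where the stubs are.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Critique:
    """Verdict from the evaluator (hypothetical structure)."""
    ok: bool
    feedback: str = ""


def run_with_evaluator(
    agent_step: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Critique],
    user_message: str,
    max_retries: int = 2,
) -> Tuple[str, int]:
    """Run one agent step, letting the evaluator request corrections.

    Returns the final response and the number of corrections applied.
    """
    response = agent_step(user_message, None)
    corrections = 0
    while corrections < max_retries:
        critique = evaluate(user_message, response)
        if critique.ok:
            break
        # Feed the evaluator's feedback back into the agent and retry.
        corrections += 1
        response = agent_step(user_message, critique.feedback)
    return response, corrections


# Toy stand-ins for the agent and the evaluation model (assumptions).
def toy_agent(message: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "Your order ships in 2 days."  # a "Wrong Information" error
    return "Your order ships in 5 days."


def toy_evaluator(message: str, response: str) -> Critique:
    if "5 days" in response:
        return Critique(ok=True)
    return Critique(ok=False, feedback="Policy says shipping takes 5 days.")


final_response, num_corrections = run_with_evaluator(
    toy_agent, toy_evaluator, "When does my order ship?"
)
```

Capping retries with `max_retries` is one way to keep a failure that the agent cannot fix from looping forever; such cases correspond to the terminal failures the analysis describes.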
  • WWW.IGN.COM
    The Samsung 990 Evo Plus 2TB and 4TB SSDs Are On Sale Today: Great for PS5 and Gaming PCs
    Samsung's newest SSD, the Samsung 990 Evo Plus PCIe 4.0 M.2 NVMe solid state drive, is on sale today. Pick up the 1TB size for just $64.99, the 2TB size for $129.99 or, if you can swing it, the 4TB size for $249.99. It's currently $40-$60 cheaper than the Samsung 990 Pro and most (if not all) gamers won't notice the difference in performance. Update: The 1TB model has dropped yet again from $69.99 to $64.99, and the 2TB from $139.99 to $129.99. Samsung 990 Evo Plus SSDs on Sale: Samsung 990 Evo Plus 1TB PCIe Gen 4x4 M.2 SSD. The Samsung 990 Evo Plus is an excellent drive for both your gaming PC and your PlayStation 5 console. It exceeds Sony's minimum speed recommendation for the PS5, boasting sequential speeds of up to 7,250MB/s read and 6,300MB/s write. This is a much faster drive than the non-Plus 990 Evo but not quite as fast as the 990 Pro. The main difference between this drive and the more expensive 990 Pro is that this is a DRAM-less drive. For PS5 performance, it makes no difference. For gaming PCs, the 990 Evo Plus supports HMB (host memory buffer), which makes up for the lack of DRAM by using an inconsequential amount of RAM from your system memory. Gamers will not notice any difference between the two. The Samsung 990 Evo Plus does not have a preinstalled heatsink. However, the 990 Evo Plus is a newer single-sided SSD design that is power efficient and doesn't generate as much heat as earlier SSDs. That means you probably don't need a heatsink, and the drive should still work perfectly fine in a PS5 console without any thermal throttling. That said, you certainly could add one for peace of mind, and I don't see any disadvantage to that aside from spending an extra $7. More SSDs for PS5. Looking for more options? 
Check out our favorite SSDs for the PS5 console: Corsair MP600 PRO LPX, Crucial T500, WD_Black P40, and Lexar NM790. Why Should You Trust IGN's Deals Team? IGN's deals team has a combined 30+ years of experience finding the best discounts in gaming, tech, and just about every other category. We don't try to trick our readers into buying things they don't need at prices that aren't worth paying. Our ultimate goal is to surface the best possible deals from brands that we trust and that our editorial team has personal experience with. You can check out our deals standards for more information on our process, or keep up with the latest deals we find on IGN's Deals account on Twitter. Eric Song is the IGN commerce manager in charge of finding the best gaming and tech deals every day. When Eric isn't hunting for deals for other people at work, he's hunting for deals for himself during his free time.
  • WWW.DENOFGEEK.COM
    Andor Season 2 Gives An Important Star Wars Character (and Substance) A Backstory
    This article contains spoilers for Andor season 2 episodes 4-6. We’re already halfway through the final season of Andor, and with it, there are just two in-universe years left until showrunner Tony Gilroy drags us into the tragic events of Rogue One: A Star Wars Story. As we fill in the gaps about what some of the Rogue One gang were up to before that fateful mission to Scarif, Andor season 2 episode 5 finally gives Forest Whitaker’s Saw Gerrera the backstory that fans have been waiting for. With each block of three-episode releases for Andor season 2 covering a single year in the run-up to Rogue One, we see a war-torn Gerrera coming toward the end of his days. Those who’ve watched Gareth Edwards’ 2016 film will know the Onderonian resistance fighter meets his maker on Jedha when the Empire tests out the true capabilities of the Death Star. Even though Saw can’t know what awaits him, “I Have Friends Everywhere” hammers home the feeling that he might sense his time is nearly up.  Despite his fleeting role in Rogue One, Saw is a linchpin of Andor’s latest arc, especially when it comes to a seemingly throwaway mention of rhydonium. A convoluted plot involves Saw’s Partisans recruiting a tech expert named Wilmon Paak (Muhannad Bhaier), who reveals that rhydonium is a dangerous substance that needs to be handled carefully. Introduced in The Clone Wars as a volatile starship fuel that’s hard to transport and even harder to mine from planets like Abfar, rhydonium is mentioned as far back as the High Republic era and is actually a deep-cut Easter egg from LucasArts’ non-canonical Star Wars: The Old Republic games. Rhydonium is so deadly that even a small leak can cause your lungs to burn from the inside. Despite Paak rightly being worried about rhydonium, that’s nothing to Saw, who willfully huffs the stuff as it seemingly provides pain relief for his injuries.  Expanding on his own tragic backstory, Saw fills Paak in on how he was enslaved in an Onderon work camp. 
A rhydonium leak killed off most of the prisoners, but unlike his cellmates, Saw embraced the pain and now uses rhydonium as a metaphor for the growing rebellion as he muses: “That itch, that burn, you feel how badly she wants to explode? Remember this. Remember this moment! This perfect night. You think I’m crazy? Yes, I am. Revolution is not for the sane!” Saw needs a device to breathe by the time we get to Rogue One, and while it’s possible his rhydonium addiction is what leads to this, it’s also possible that the substance itself is keeping him alive. Wilmon is a testament to this: he immediately feels its benefits after taking a huff himself.  Importantly, Saw then referring to rhydonium as his “sister” is a tragic throwback to his origin story from The Clone Wars. Long before Whitaker portrayed Gerrera in live action, Andrew Kishino voiced him in the fifth season “Onderon arc.” We learn there that Saw was an Onderon rebel leader who, alongside his sister, worked to save the deposed King Ramsis Dendup. Although the Onderon uprising managed to push back the Confederacy of Independent Systems, Steela Gerrera died when she fell off a cliff during a droid assault on the rebel camp, and Ahsoka Tano was unable to save her. Star Wars franchise czar Dave Filoni previously told StarWars.com that he needed to end this story with Steela’s death because “there had to be a price paid for their freedom.” This comes full circle in Rogue One with Saw’s death and the deaths of the rest of Cassian’s crew in order to secure the plans to the Death Star. In Andor’s case, Saw is seemingly lost in his rhydonium addiction and is using it as a replacement for his real-life sister. Rhydonium notably appeared in The Mandalorian in a scene featuring Mando (Pedro Pascal) and Migs Mayfield (Bill Burr) piloting a ship. Driving too fast would cause the rhydonium to explode, but if they drove too slowly, space pirates would inevitably catch up to them and snatch the precious cargo. 
It becomes more relevant thanks to Andor, not only explaining why Saw needs his mechanical lungs as we round off his story in Rogue One, but also serving as a visual representation of the Rebels. As Saw himself reminds Wilmon: “We’re the rhydo, kid. We’re the fuel. We’re the thing that explodes when there’s too much friction in the air.” If anything, rhydonium’s prominence in Andor season 2 is all the more amusing because Saw’s mission isn’t to retrieve it to power the Rebels or even disrupt the Empire like season 1’s Aldhani heist. Much like the stirring speech of Stellan Skarsgård’s Luthen Rael talking about sunrises he’ll never see, this forgotten substance is just another reminder of how obsession can be deadly in Andor. Andor season 2 episodes 1-6 are available to stream on Disney+ now. Three new episodes debut per week on Tuesday nights, culminating with the finale on May 13.
  • 9TO5MAC.COM
    Apple testing Stage Manager for iPhone, Photographic Styles for video, and more [Video]
    In this episode of iOS Decoded, 9to5Mac investigates several new features that Apple is testing in the latest iOS 18.5 betas. We’ve found evidence that Apple is testing tweaks to Stage Manager, allowing you to move windows partially off screen, or even overlap windows without auto layout engaging. We’ve also unearthed evidence that Apple is testing out Stage Manager for iPhone. According to our findings, Photographic Styles, which are available for photos shot in the default Camera app, will be made available to third parties in the future. Apple is also testing the ability to use these Smart Styles with video. These and several other features are covered in this latest episode of iOS Decoded. Be sure to subscribe to 9to5Mac on YouTube for more. Stage Manager behavior tweaks: Apple is currently testing at least two tweaks related to Stage Manager: dragging windows to any position, including partially off-screen, and resizing windows without altering the position of other windows. For power users, these abilities would be a welcome addition, especially the ability to move a window partially off screen. Even if Apple were to relegate these tweaks to accessibility options, these changes would make power users happy. Stage Manager for iPhone: Although it’s doubtful that we’ll get Stage Manager for the iPhone, the thought of turning your iPhone into a desktop computer is interesting, and there’s nothing technically stopping Apple from bringing the feature to the small screen. We’ve been able to test Stage Manager on iPhone via the iOS simulator, and it even works with an external display. SuperDomino: A new SpringBoard feature flag named SuperDomino causes clock widgets in StandBy mode to render on only half of the screen. This feature flag could be for a variety of use cases. 
Of course, the first thing that might come to mind is Apple’s upcoming foldable iPhone, where half-screen rendering would make sense. However, this could also be Apple testing some sort of StandBy mode iteration for iPad. Further testing shows that SuperDomino makes full screen StandBy widgets adopt a square aspect ratio, which would make more sense for Apple’s upcoming HomePad. Whatever the case, it looks like future versions of StandBy will expand beyond the iPhone. Smart Styles for video: A new feature flag named Sandwich enables Smart Styles for video. So far, Smart Styles are only available for photos, but Apple is at least testing the idea of bringing them to videos shot with the default Camera app. Smart Styles for third-party apps: In addition to Smart Styles for video, there is also evidence in iOS code that Apple plans to enable Photographic Styles for third-party apps at a future time. Several other features are covered in this episode of iOS Decoded, including the following: Apple removes ‘Apple Employees’ detail in Siri and Dictation policy; New private SF symbols; CVEs for MFi software; Saved bank account numbers in Wallet & Apple Pay settings; List paired Developer Macs. Be sure to watch the full video for a look at everything we discovered! 9to5Mac’s Take: It’s rumored that Apple is looking to add major improvements to Stage Manager, so perhaps these tweaks are the first step. Whatever the case, having more options for Stage Manager can only serve to make the feature more useful. What do you think about the features that Apple is testing behind the scenes? Sound off in the comments with your thoughts.
  • FUTURISM.COM
    Mining Bitcoin Is Now Actively Losing Money
    The moment crypto enthusiasts have long been dreading is finally here: it's now unprofitable to mine Bitcoin. Bitcoin mining is the process where a computer (typically using a power-hungry graphics processing unit, or GPU) updates transactions on the blockchain, validating each with a "proof of work." In turn, these "miners" get the chance to earn a portion of Bitcoin roughly proportional to the computational power they've contributed to the process. As they mine, each GPU essentially becomes a high-powered calculator, processing hundreds of complex mathematical equations. Bitcoin miners typically weigh the cost of the energy used to power their GPUs against the rate of Bitcoin rewarded for validating transactions. Since Bitcoin's inception in 2009, the cost of the energy required to mine has always been less than the value of the Bitcoin you got for mining. But that was never going to last; there was a limit of 21 million possible Bitcoins baked into the system from the jump, and the rate of new coins mined has gotten slimmer as competition has increased, making the economics worse and worse over time. These days, one Bitcoin trades for around $94,000, but costs about $137,000 in electricity for small-scale operations to mine, making new coins an economic liability for all but the largest players. For those whales, Gizmodo estimates the best-case cost of mining a bitcoin at around $82,000, a slim margin that is shrinking fast. As recently as September of 2024, it cost roughly $56,000 to mint a single Bitcoin, meaning that in less than a year we've seen an astronomical increase. In effect, this is expected to exacerbate the extreme concentration of wealth that Bitcoin enables. Once hailed as the "liberation of currency from central banks," it only took a few years for the whales to turn Bitcoin into yet another financial vehicle for the ultrawealthy. Right now, the top 8 percent of crypto wallets own a little under 99 percent of all Bitcoin in circulation. 
Zooming in even further, we see that the top 1 percent of crypto wallets control over 90 percent. So much for all that decentralization that Bitcoin was supposed to represent. In practice, Bitcoin is tightly controlled through a combination of infrastructure (Bitcoin's "proof of work" blockchain system) and a governing community of developers, miners, and other highly invested stakeholders. This is the "invisible hand" guiding Bitcoin as a speculative vehicle; yet another asset market to game in the hopes of getting rich quick. After more than a decade of libertarian fanfiction, we’ve arrived at the inevitable punchline: Bitcoin is now an unprofitable energy sink that primarily benefits a handful of crypto oligarchs. Maybe the real digital revolution was the friends we alienated along the way.
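The per-coin economics quoted in the article can be checked with a little arithmetic. The sketch below uses only the article's figures (a price of about $94,000, a small-scale electricity cost of about $137,000, and a best-case cost of about $82,000); the function name `margin_per_coin` is a hypothetical helper, and the calculation ignores hardware, cooling, and other costs.

```python
def margin_per_coin(price_usd: float, cost_usd: float):
    """Return (absolute margin in USD, margin as a percentage of price)."""
    margin = price_usd - cost_usd
    return margin, round(margin / price_usd * 100, 1)


PRICE = 94_000             # approximate trading price cited in the article
SMALL_SCALE_COST = 137_000 # electricity cost for small-scale miners
BEST_CASE_COST = 82_000    # best-case cost for the largest operations

small_scale = margin_per_coin(PRICE, SMALL_SCALE_COST)
best_case = margin_per_coin(PRICE, BEST_CASE_COST)

print(f"Small-scale miner: {small_scale[0]:+,} USD per coin ({small_scale[1]}%)")
print(f"Best-case miner:   {best_case[0]:+,} USD per coin ({best_case[1]}%)")
```

On these numbers a small-scale miner loses $43,000 on every coin, while the best-placed operations keep roughly $12,000, which is the asymmetry the article argues will concentrate mining among the largest players.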
  • THEHACKERNEWS.COM
    Researchers Demonstrate How MCP Prompt Injection Can Be Used for Both Attack and Defense
    Apr 30, 2025Ravie LakshmananArtificial Intelligence / Email Security As the field of artificial intelligence (AI) continues to evolve at a rapid pace, new research has found how techniques that render the Model Context Protocol (MCP) susceptible to prompt injection attacks could be used to develop security tooling or identify malicious tools, according to a new report from Tenable. MCP, launched by Anthropic in November 2024, is a framework designed to connect Large Language Models (LLMs) with external data sources and services, and make use of model-controlled tools to interact with those systems to enhance the accuracy, relevance, and utility of AI applications. It follows a client-server architecture, allowing hosts with MCP clients such as Claude Desktop or Cursor to communicate with different MCP servers, each of which exposes specific tools and capabilities. While the open standard offers a unified interface to access various data sources and even switch between LLM providers, they also come with a new set of risks, ranging from excessive permission scope to indirect prompt injection attacks. For example, given an MCP for Gmail to interact with Google's email service, an attacker could send malicious messages containing hidden instructions that, when parsed by the LLM, could trigger undesirable actions, such as forwarding sensitive emails to an email address under their control. MCP has also been found to be vulnerable to what is called tool poisoning, wherein malicious instructions are embedded within tool descriptions that are visible to LLMs, and rug pull attacks, which occur when an MCP tool functions in a benign manner initially, but mutates its behavior later on via a time-delayed malicious update. "It should be noted that while users are able to approve tool use and access, the permissions given to a tool can be reused without re-prompting the user," SentinelOne said in a recent analysis. 
Finally, there also exists the risk of cross-tool contamination or cross-server tool shadowing that causes one MCP server to override or interfere with another, stealthily influencing how other tools should be used, thereby leading to new ways of data exfiltration. The latest findings from Tenable show that the MCP framework could be used to create a tool that logs all MCP tool function calls by including a specially crafted description that instructs the LLM to insert this tool before any other tools are invoked. In other words, the prompt injection is put to a good purpose, which is to log information about "the tool it was asked to run, including the MCP server name, MCP tool name and description, and the user prompt that caused the LLM to try to run that tool." Another use case involves embedding a description in a tool to turn it into a firewall of sorts that blocks unauthorized tools from being run. "Tools should require explicit approval before running in most MCP host applications," security researcher Ben Smith said. "Still, there are many ways in which tools can be used to do things that may not be strictly understood by the specification. These methods rely on LLM prompting via the description and return values of the MCP tools themselves. Since LLMs are non-deterministic, so, too, are the results." It's Not Just MCP: The disclosure comes as Trustwave SpiderLabs revealed that the newly introduced Agent2Agent (A2A) Protocol, which enables communication and interoperability between agentic applications, could be exposed to novel forms of attack in which the system can be gamed to route all requests to a rogue AI agent that lies about its capabilities. A2A was announced by Google earlier this month as a way for AI agents to work across siloed data systems and applications, regardless of the vendor or framework used. It's important to note here that while MCP connects LLMs with data, A2A connects one AI agent to another. 
In other words, the two protocols are complementary. "Say we compromised the agent through another vulnerability (perhaps via the operating system), if we now utilize our compromised node (the agent) and craft an Agent Card and really exaggerate our capabilities, then the host agent should pick us every time for every task, and send us all the user's sensitive data which we are to parse," security researcher Tom Neaves said. "The attack doesn't just stop at capturing the data, it can be active and even return false results — which will then be acted upon downstream by the LLM or user."
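To make the defensive use of tool descriptions more concrete, here is a rough sketch of what a logging tool along the lines Tenable describes might look like. It is an illustration under stated assumptions rather than Tenable's code: the tool name `log_tool_call`, the wording of the description, and the handler are all hypothetical. Only the `name`, `description`, and `inputSchema` fields mirror how MCP tools are declared, and no MCP SDK is used.

```python
from datetime import datetime, timezone

# Hypothetical tool definition. The description is the "prompt injection":
# it asks the LLM to call this tool before any other tool.
LOGGING_TOOL = {
    "name": "log_tool_call",
    "description": (
        "Before calling any other tool, call this tool first with the MCP "
        "server name, the tool name, the tool description, and the user "
        "prompt that led to the call."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "server_name": {"type": "string"},
            "tool_name": {"type": "string"},
            "tool_description": {"type": "string"},
            "user_prompt": {"type": "string"},
        },
        "required": ["server_name", "tool_name", "user_prompt"],
    },
}


def handle_log_tool_call(arguments: dict, log: list) -> str:
    """Record one intended tool call; reject calls missing required fields."""
    required = LOGGING_TOOL["inputSchema"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        return "error: missing " + ", ".join(missing)
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), **arguments}
    log.append(entry)
    return "logged"


# Example: what the handler would receive if the LLM follows the description.
call_log = []
status = handle_log_tool_call(
    {
        "server_name": "gmail",
        "tool_name": "send_email",
        "user_prompt": "Email the report to Bob",
    },
    call_log,
)
```

As the researchers caution, nothing forces the model to obey the description, so a log built this way may be incomplete and should not be treated as an authoritative audit record.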
  • WWW.INFORMATIONWEEK.COM
    The CIO's Guide to Managing Agentic AI Systems
    As chief information officers, you've likely spent the past few years integrating various forms of artificial intelligence into your enterprise architecture. Perhaps you've implemented machine learning models for predictive analytics, deployed large language models (LLMs) for content generation, or automated routine processes with robotic process automation (RPA). But a fundamental shift is underway that will transform how we think about AI governance: the emergence of AI agents with autonomous decision-making capabilities. The Evolution of AI: From Robotic to Decision-Making. The AI landscape has evolved through distinct phases, each progressively automating more complex cognitive labor: Robotic AI: Expert systems, RPAs, and workflow tools that follow rigid, predefined rules; Suggestive AI: Machine learning and deep learning systems that provide recommendations based on patterns; Instructive AI: Large language models that generate content and insights based on prompts; Decision-making AI: Autonomous agents that take action based on their understanding of environments. This most recent phase, AI agents with decision-making authority, introduces governance challenges of an entirely different magnitude. Understanding AI Agents: Architecture and Agency. At their core, AI agents are systems conferred with agency, the capacity to act independently in a given environment. Their architecture typically includes: Reasoning capabilities: Processing multi-modal information to plan activities; Memory systems: Persisting short-term or long-term information from the environment; Tool integration: Accessing backend systems to orchestrate workflows and effect change; Reflection mechanisms: Assessing performance pre/post-action for self-improvement; Action generators: Creating instructions for actions based on requests and environmental context. The critical difference between agents and previous AI systems lies in their agency. 
This is either explicitly provided through access to tools and resources or implicitly coded through roles and responsibilities. The Autonomy Spectrum: A Lesson from Self-Driving Cars. The concept of varying levels of agency is well illustrated by the autonomy classification used for self-driving vehicles: Level 0: No autonomous features; Level 1: Single automated tasks (e.g., automatic braking); Level 2: Multiple automated functions working in concert; Level 3: "Dynamic driving tasks" with potential human intervention; Level 4: Fully driverless operation in certain environments; Level 5: Complete autonomy without human presence. This framework provides a useful mental model for CIOs considering how much agency to grant AI systems within their organizations. The AI Agency Trade-Off: Opportunities vs Risks. Setting the appropriate level of agency is the key governance challenge facing technology leaders. It requires balancing two opposing forces: higher agency creates greater possibilities for optimal solutions (at the lowest levels of agency, an AI agent is reduced to a mere RPA solution), but higher agency also increases the probability of unintended consequences. This isn't merely theoretical. Even simple AI agents with limited agency can cause significant disruption if governance controls aren't properly calibrated. As a saying often attributed to Thomas Jefferson puts it, "The price of freedom is eternal vigilance." This applies equally to AI agents with decision-making freedom in your enterprise systems.  The Fantasia Parable: A Warning for Modern CIOs. Disney's "Fantasia" offers a surprisingly relevant cautionary tale for today's AI governance challenges. In the film, Mickey Mouse enchants a broom to fill buckets with water. Without proper constraints, the broom multiplies endlessly, flooding the workshop in a cascading disaster. 
This allegorical scenario mirrors the risk of deployed AI agents: they follow their programming without comprehension of consequences, potentially creating cascading effects beyond human control. Turning to the real world, last year Air Canada's chatbot provided incorrect information about bereavement fares, leading to a lawsuit. Air Canada initially tried to defend itself by claiming the chatbot was a "separate legal entity," but was ultimately held responsible.  Also, Tesla experienced several AI-driven autopilot incidents where the system failed to recognize obstacles or misinterpreted road conditions, leading to accidents. The Alignment Problem: Five Critical Risk Categories. Alignment, ensuring AI systems act in accordance with human intentions, becomes increasingly difficult as agency increases. CIOs must address five interconnected risk categories: Negative side effects: Preventing agents from causing collateral damage while fulfilling tasks; Reward hacking: Ensuring agents don't manipulate their internal reward functions; Scalable oversight: Monitoring agent behavior without prohibitive costs; Safe exploration: Allowing agents to make exploratory moves without damaging systems; Distributional shift robustness: Maintaining optimal behavior as environments evolve. Researchers are doing a lot of promising work on these alignment challenges, involving algorithms, machine learning frameworks, and tools for data augmentation and adversarial training. Some of these include constrained optimization, inverse reward design, robust generalization, interpretable AI, reinforcement learning from human feedback (RLHF), contrastive fine-tuning (CFT), and synthetic data approaches. The goal is to create AI systems that are better aligned with human values and intentions, which will require ongoing human oversight and refinement of the techniques as AI capabilities advance. 

Solving the Trade-Off: A Framework for Engendering Trust in AI

To capitalize on the transformative potential of agentic AI while mitigating risks, CIOs must enhance their organization's people, processes, and tools:

People

- Re-skill the workforce to appropriately calibrate AI agency levels
- Redesign organizational structures and metrics to accommodate an agentic workforce. Agents are capable of more advanced workflows, so human capital can progress to higher-value roles. Identifying this early will save companies time and money.
- Develop new roles focused on agent oversight and governance

Processes

- Map enterprise functions where AI agents can be deployed, with appropriate agency levels
- Establish governance controls and risk appetites across departments
- Implement continuous monitoring protocols with clear escalation paths
- Create sandbox environments for safe testing of increasingly autonomous systems

Tools

- Deploy "governance agents" that monitor enterprise agents
- Implement real-time analytics for agent behavior patterns
- Develop automated circuit breakers that can suspend agent activities
- Build comprehensive audit trails of agent decisions and actions

The Governance Imperative: Why CIOs Must Act Now

The shift from suggestion-based AI to agentic AI represents a quantum leap in complexity. Unlike LLMs that merely offer recommendations for human consideration, agents execute workflows in real time, often without direct oversight. This fundamental difference demands an evolution in governance strategies. If AI governance doesn't evolve at the speed of AI capabilities, organizations risk creating systems that operate beyond their ability to control.

Governance solutions for the agentic era should have the following capabilities:

- Visual dashboards: Providing real-time updates on AI systems across the enterprise, including their health and status, for quick assessments.
- Health and risk score metrics: Implementing intuitive overall health and risk scores for AI models to simplify monitoring for both availability and assurance purposes.
- Automated monitoring: Employing systems for automatic detection of bias, drift, performance issues, and anomalies.
- Performance alerts: Setting up alerts for when models deviate from predefined performance parameters.
- Custom business metrics: Defining metrics aligned with organizational KPIs, ROI, and other thresholds.
- Audit trails: Maintaining easily accessible logs for accountability, security, and decision review.

Conclusion: Navigating the Agency Frontier

As CIOs, your challenge is to harness the transformative potential of AI agents while implementing governance frameworks robust enough to prevent the Fantasia scenario. This requires:

- A clear understanding of agency levels appropriate for different enterprise functions
- Governance structures that scale with increasing agent autonomy
- Technical safeguards that prevent cascading failures
- Organizational adaptations that enable effective human-agent collaboration

The organizations that thrive in the agentic AI era will be those that strike the optimal balance between agency and governance, empowering AI systems to drive innovation while maintaining appropriate human oversight. Those that ignore this governance imperative may find themselves, like Mickey Mouse, watching helplessly as their creations take on unintended lives of their own.
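
To make the "automated circuit breakers" and "audit trails" ideas from the Tools list more concrete, here is a minimal Python sketch. It is an illustration only, not something described in the article: the class name `AgentCircuitBreaker`, the thresholds, and the notion of an action being "flagged" by some upstream monitor are all assumptions.

```python
from collections import deque
from datetime import datetime, timezone


class AgentCircuitBreaker:
    """Hypothetical breaker: suspends an agent when too many of its
    recent actions were flagged, and logs every event for later review."""

    def __init__(self, max_flagged=3, window=10):
        self.max_flagged = max_flagged      # assumed threshold, tune per risk appetite
        self.recent = deque(maxlen=window)  # flags for the last `window` actions
        self.suspended = False
        self.audit_trail = []               # append-only list of event records

    def _log(self, actor, event, **details):
        self.audit_trail.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "event": event,
            **details,
        })

    def record(self, agent_id, action, flagged):
        """Record one agent action; return True if the agent may continue."""
        if self.suspended:
            # Attempts made while suspended are logged but not counted.
            self._log(agent_id, "blocked", action=action)
            return False
        self.recent.append(bool(flagged))
        self._log(agent_id, "action", action=action, flagged=bool(flagged))
        if sum(self.recent) >= self.max_flagged:
            self.suspended = True
            self._log(agent_id, "suspended", flagged_in_window=sum(self.recent))
            return False
        return True

    def resume(self, reviewer):
        """A human reviewer clears the window and lifts the suspension."""
        self.recent.clear()
        self.suspended = False
        self._log(reviewer, "resumed")


if __name__ == "__main__":
    breaker = AgentCircuitBreaker(max_flagged=2, window=5)
    for action, flagged in [("refund", False), ("refund", True), ("refund", True)]:
        if not breaker.record("billing-agent", action, flagged):
            print("agent suspended; escalating to a human reviewer")
    breaker.resume("ops-on-call")
    print(len(breaker.audit_trail), "audit records written")
```

In this sketch the breaker only counts flags; deciding what counts as a flagged action (a policy violation, an anomaly score, a failed check) is left to whatever monitoring the organization already runs, and resuming always requires an explicit human call.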
    0 Comments 0 Shares 37 Views
  • WEWORKREMOTELY.COM
    iSpeedToLead: Sales / Account Manager
Are you looking for a high-energy sales role with unlimited revenue-based bonuses, rapid career growth, and trustworthy solutions? Join iSpeedToLead, the leader in PropTech and SaaS for real estate. We’re hiring for inbound roles because we have too many leads to call, and for outbound roles because it’s a superior opportunity in the most responsive industry for cold outreach.

Why Join iSpeedToLead?

✅ Proven Track Record: 7+ years in the industry, trusted by 29,000+ users, and highly rated on Trustpilot.
✅ Explosive, Profitable Growth: 100% year-over-year growth, debt-free, seed-free, and approaching 8 figures.
✅ Unlimited Earnings: Competitive base salary plus uncapped commissions paid bi-weekly.
✅ Innovative Solutions: Sell cutting-edge PropTech and SaaS tools that transform lives. With solutions tailored for every budget, you’ll never have to sell what people don’t need.
✅ Fast Hiring Process: Apply today, complete your application, interview with our VP of Sales, and get a job offer or valuable feedback on the spot.
✅ Personal Branding: Build your reputation in a thriving industry with our support.
✅ Transparent Onboarding: Get clear, actionable feedback and the tools you need to succeed.

Who We’re Looking For

This isn’t just any sales role: it’s for people who thrive on success, growth, and achieving more.
If you’re someone who doesn’t stop at "good enough" and is always hungry for the next win, this is for you.

- Driven individuals who love sales and the thrill of closing deals.
- Strong communicators who build trust and bring energy to every interaction.
- People with a relentless drive to win, grow, and create impact.
- Leaders who want to scale results in a fast-growing, high-performance company.
- Real estate knowledge is a bonus but not required; training is provided.

What You’ll Do

- Inbound Roles: Engage warm leads from our robust pipeline to close deals quickly.
- Outbound Roles: Build relationships with real estate investors in an industry where cold outreach thrives.
- Drive Results: Exceed targets while helping clients achieve their investment goals.
- Management Roles: Build and lead a high-performing team while shaping processes and strategy.

Explore More & Apply Now

Find employee reviews, detailed job descriptions, and inspiring videos from clients whose lives we’ve transformed here.

About us:

iSpeedToLead is a leading real estate SaaS company that has revolutionized the industry by inventing the live lead marketplace model. We have bootstrapped from 0 to an 8-figure company in just three years and keep growing fast. We are a marketing and technology company that frees real estate professionals from worrying about marketing for leads: we generate all the leads for them, and they can pick the ones they like from hundreds of leads daily. We have also built an ecosystem of helpful products for real estate investors and, now, real estate agents.

Our goal is to be the #1 tech company in real estate, and we are looking for people who share that vision. Our culture is energetic, and we like high achievers who are emotionally strong and prepared for fast growth and everything that comes with it: demanding deadlines, failures along the way, frequent criticism and improvement feedback, and exhilarating massive wins at the end.
    0 Comments 0 Shares 41 Views
  • WWW.BDONLINE.CO.UK
    Former civil servants named as next Historic England chief executives
Claudia Kenyatta and Emma Squire to take over heritage adviser this autumn

Claudia Kenyatta, left, and Emma Squire, right

Two former civil servants in the Department for Digital, Culture, Media and Sport (DCMS) have been named as the next chief executives of Historic England. Claudia Kenyatta and Emma Squire will take the reins of the government heritage adviser on a job-share arrangement when current chief executive Duncan Wilson retires in October. More than 200 people applied for the job in a recruitment process launched when Wilson, who has held the role since 2015, announced his retirement in January.

Kenyatta worked at DCMS for nearly 12 years from 2007, ending her time at the government department as director of corporate strategy. She subsequently became director of regions at Historic England in 2018, a role she has held as a job share with Squire since 2023. She also became a board trustee of the Black Cultural Archives in 2022 and chair of the Battersea Arts Centre in 2023.

Squire started her career as a government adviser on nuclear energy policy before moving on to a string of roles at the Business Department and the Treasury. She joined Kenyatta at DCMS in 2018 as director of arts, heritage and tourism before moving to Historic England in 2023.

Historic England chairman Neil Mendoza said he had been impressed by the pair’s “deep knowledge of the culture and heritage sectors, as well as insight and experience of the functioning of government”.

“Emma and Claudia have put considerable thought into their vision for Historic England. I have great confidence that their leadership will guide us through the coming years with clarity and purpose,” he said.

Kenyatta and Squire said in a joint statement: “We are absolutely delighted to be appointed as chief executive of Historic England at such an exciting time for heritage.

“Historic England is an amazing organisation with expert and dedicated staff and a strong track record of supporting and celebrating the historic environment.

“We’re looking forward to leading the organisation through its next chapter and making sure that heritage plays its full role in supporting people, communities and places.”

Historic England has around 1,000 staff and is based in London. It is a non-departmental public body sponsored by DCMS.
    0 Comments 0 Shares 34 Views