-
How a stubborn computer scientist accidentally launched the deep learning boom (arstechnica.com)

Deep learning: "You've taken this idea way too far," a mentor told Prof. Fei-Fei Li.

Timothy B. Lee | Nov 11, 2024 7:00 am

Credit: Aurich Lawson | Getty Images

During my first semester as a computer science graduate student at Princeton, I took COS 402: Artificial Intelligence. Toward the end of the semester, there was a lecture about neural networks. This was in the fall of 2008, and I got the distinct impression, both from that lecture and the textbook, that neural networks had become a backwater.

Neural networks had delivered some impressive results in the late 1980s and early 1990s. But then progress stalled. By 2008, many researchers had moved on to mathematically elegant approaches such as support vector machines.

I didn't know it at the time, but a team at Princeton, in the same computer science building where I was attending lectures, was working on a project that would upend the conventional wisdom and demonstrate the power of neural networks. That team, led by Prof. Fei-Fei Li, wasn't working on a better version of neural networks. They were hardly thinking about neural networks at all. Rather, they were creating a new image dataset that would be far larger than any that had come before: 14 million images, each labeled with one of nearly 22,000 categories.

Li tells the story of ImageNet in her recent memoir, The Worlds I See. As she worked on the project, she faced plenty of skepticism from friends and colleagues.

"I think you've taken this idea way too far," a mentor told her a few months into the project in 2007. "The trick is to grow with your field. Not to leap so far ahead of it."

It wasn't just that building such a large dataset was a massive logistical challenge. People doubted that the machine learning algorithms of the day would benefit from such a vast collection of images.

"Pre-ImageNet, people did not believe in data," Li said in a September interview at the Computer History Museum. "Everyone was working on completely different paradigms in AI with a tiny bit of data."

Ignoring negative feedback, Li pursued the project for more than two years. It strained her research budget and the patience of her graduate students. When she took a new job at Stanford in 2009, she took several of those students, and the ImageNet project, with her to California.

ImageNet received little attention for the first couple of years after its release in 2009. But in 2012, a team from the University of Toronto trained a neural network on the ImageNet dataset, achieving unprecedented performance in image recognition. That groundbreaking AI model, dubbed AlexNet after lead author Alex Krizhevsky, kicked off the deep learning boom that has continued to the present day.

AlexNet would not have succeeded without the ImageNet dataset. AlexNet also would not have been possible without a platform called CUDA, which allowed Nvidia's graphics processing units (GPUs) to be used in non-graphics applications. Many people were skeptical when Nvidia announced CUDA in 2006.

So the AI boom of the last 12 years was made possible by three visionaries who pursued unorthodox ideas in the face of widespread criticism.
One was Geoffrey Hinton, a University of Toronto computer scientist who spent decades promoting neural networks despite near-universal skepticism. The second was Jensen Huang, the CEO of Nvidia, who recognized early that GPUs could be useful for more than just graphics.

The third was Fei-Fei Li. She created an image dataset that seemed ludicrously large to most of her colleagues. But it turned out to be essential for demonstrating the potential of neural networks trained on GPUs.

Geoffrey Hinton

A neural network is a network of thousands, millions, or even billions of neurons. Each neuron is a mathematical function that produces an output based on a weighted average of its inputs.

Suppose you want to create a network that can identify handwritten decimal digits like the number two in the red square above. Such a network would take in an intensity value for each pixel in an image and output a probability distribution over the ten possible digits: 0, 1, 2, and so forth.

To train such a network, you first initialize it with random weights. You then run it on a sequence of example images. For each image, you train the network by strengthening the connections that push the network toward the right answer (in this case, a high-probability value for the "2" output) and weakening connections that push toward a wrong answer (a low probability for "2" and high probabilities for other digits). If trained on enough example images, the model should start to predict a high probability for "2" when shown a two, and not otherwise.

In the late 1950s, scientists started to experiment with basic networks that had a single layer of neurons. However, their initial enthusiasm cooled as they realized that such simple networks lacked the expressive power required for complex computations.

Deeper networks, those with multiple layers, had the potential to be more versatile. But in the 1960s, no one knew how to train them efficiently. This was because changing a parameter somewhere in the middle of a multi-layer network could have complex and unpredictable effects on the output.

So by the time Hinton began his career in the 1970s, neural networks had fallen out of favor. Hinton wanted to study them, but he struggled to find an academic home in which to do so. Between 1976 and 1986, Hinton spent time at four different research institutions: Sussex University, the University of California San Diego (UCSD), a branch of the UK Medical Research Council, and finally Carnegie Mellon, where he became a professor in 1982.

Geoffrey Hinton speaking in Toronto in June. Credit: Photo by Mert Alper Dervis/Anadolu via Getty Images

In a landmark 1986 paper, Hinton teamed up with two of his former colleagues at UCSD, David Rumelhart and Ronald Williams, to describe a technique called backpropagation for efficiently training deep neural networks.

Their idea was to start with the final layer of the network and work backward. For each connection in the final layer, the algorithm computes a gradient, a mathematical estimate of whether increasing the strength of that connection would push the network toward the right answer. Based on these gradients, the algorithm adjusts each parameter in the model's final layer.

The algorithm then propagates these gradients backward to the second-to-last layer. A key innovation here is a formula, based on the chain rule from high school calculus, for computing the gradients in one layer based on gradients in the following layer. Using these new gradients, the algorithm updates each parameter in the second-to-last layer of the model. The gradients then get propagated backward to the third-to-last layer, and the whole process repeats once again.

The algorithm only makes small changes to the model in each round of training. But as the process is repeated over thousands, millions, billions, or even trillions of training examples, the model gradually becomes more accurate.
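For readers who want to see those mechanics spelled out, here is a minimal sketch of such a training loop in Python with NumPy: a tiny two-layer network maps pixel intensities to a probability distribution over the ten digits, and backpropagation applies the chain rule layer by layer to compute gradients and nudge every weight. The layer sizes, learning rate, and random stand-in data are illustrative assumptions, not details taken from Hinton's paper or from any real digit dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "handwritten digit" data: 784 pixel intensities per image
# (a flattened 28x28 grid) and an integer label from 0 to 9. Real
# experiments would use an actual digit dataset; random data keeps the
# sketch self-contained.
n_samples, n_pixels, n_digits, n_hidden = 256, 784, 10, 64
images = rng.random((n_samples, n_pixels))
labels = rng.integers(0, n_digits, size=n_samples)

# Initialize the network with random weights, as described above.
W1 = rng.normal(0.0, 0.01, (n_pixels, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.01, (n_hidden, n_digits))
b2 = np.zeros(n_digits)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

learning_rate = 0.1
for step in range(200):
    # Forward pass: pixel intensities in, probabilities over ten digits out.
    hidden = np.maximum(0.0, images @ W1 + b1)        # hidden layer (ReLU)
    probs = softmax(hidden @ W2 + b2)                 # output layer

    # Backward pass (backpropagation): start at the output layer and use
    # the chain rule to carry gradients back toward earlier layers.
    grad_logits = probs.copy()
    grad_logits[np.arange(n_samples), labels] -= 1.0  # error at the output
    grad_logits /= n_samples

    grad_W2 = hidden.T @ grad_logits
    grad_b2 = grad_logits.sum(axis=0)
    grad_hidden = grad_logits @ W2.T
    grad_hidden[hidden <= 0.0] = 0.0                  # ReLU passes no gradient when inactive

    grad_W1 = images.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0)

    # Nudge every parameter a small step in the direction that reduces error.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

    if step % 50 == 0:
        loss = -np.log(probs[np.arange(n_samples), labels]).mean()
        print(f"step {step}: average loss {loss:.3f}")
```

Modern systems add many refinements, such as convolutional layers and GPU execution, but the strengthen-or-weaken logic described above is this same gradient step repeated over enormous numbers of examples.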
Hinton and his colleagues weren't the first to discover the basic idea of backpropagation. But their paper popularized the method. As people realized it was now possible to train deeper networks, it triggered a new wave of enthusiasm for neural networks.

Hinton moved to the University of Toronto in 1987 and began attracting young researchers who wanted to study neural networks. One of the first was the French computer scientist Yann LeCun, who did a year-long postdoc with Hinton before moving to Bell Labs in 1988.

Hinton's backpropagation algorithm allowed LeCun to train models deep enough to perform well on real-world tasks like handwriting recognition. By the mid-1990s, LeCun's technology was working so well that banks started to use it for processing checks.

"At one point, LeCun's creation read more than 10 percent of all checks deposited in the United States," wrote Cade Metz in his 2022 book Genius Makers.

But when LeCun and other researchers tried to apply neural networks to larger and more complex images, it didn't go well. Neural networks once again fell out of fashion, and some researchers who had focused on neural networks moved on to other projects.

Hinton never stopped believing that neural networks could outperform other machine learning methods. But it would be many years before he'd have access to enough data and computing power to prove his case.

Jensen Huang

Jensen Huang speaking in Denmark in October. Credit: Photo by MADS CLAUS RASMUSSEN/Ritzau Scanpix/AFP via Getty Images

The brain of every personal computer is a central processing unit (CPU). These chips are designed to perform calculations in order, one step at a time. This works fine for conventional software like Windows and Office. But some video games require so many calculations that they strain the capabilities of CPUs. This is especially true of games like Quake, Call of Duty, and Grand Theft Auto, which render three-dimensional worlds many times per second.

So gamers rely on GPUs to accelerate performance. Inside a GPU are many execution units, essentially tiny CPUs, packaged together on a single chip. During gameplay, different execution units draw different areas of the screen. This parallelism enables better image quality and higher frame rates than would be possible with a CPU alone.

Nvidia invented the GPU in 1999 and has dominated the market ever since. By the mid-2000s, Nvidia CEO Jensen Huang suspected that the massive computing power inside a GPU would be useful for applications beyond gaming. He hoped scientists could use it for compute-intensive tasks like weather simulation or oil exploration.

So in 2006, Nvidia announced the CUDA platform. CUDA allows programmers to write "kernels," short programs designed to run on a single execution unit. Kernels allow a big computing task to be split up into bite-sized chunks that can be processed in parallel.
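To give a flavor of what a kernel looks like, here is a rough sketch in Python using the Numba library's CUDA support; the choice of Numba (rather than Nvidia's C/C++ toolkit), the function name, and the toy computation are assumptions of this example, and running it requires a CUDA-capable GPU. Each GPU thread handles a single array element, which is the bite-sized-chunk idea in miniature.

```python
import numpy as np
from numba import cuda  # assumes the numba package and a CUDA-capable GPU

@cuda.jit
def scale_and_add(x, y, out):
    # Each thread computes one element: a tiny chunk of the overall task,
    # with thousands of threads spread across the GPU's execution units.
    i = cuda.grid(1)          # this thread's global index
    if i < out.size:          # guard against threads past the end of the data
        out[i] = 2.0 * x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)

# Copy inputs to GPU memory, launch the kernel, then copy the result back.
d_x = cuda.to_device(x)
d_y = cuda.to_device(y)
d_out = cuda.device_array(n, dtype=np.float32)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
scale_and_add[blocks, threads_per_block](d_x, d_y, d_out)

result = d_out.copy_to_host()
assert np.allclose(result, 2.0 * x + y)
```

Neural network training maps onto this model naturally, since the same multiply-and-add arithmetic is applied independently across huge arrays of weights and inputs.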
This allows certain kinds of calculations to be completed far faster than with a CPU alone.

But there was little interest in CUDA when it was first introduced, wrote Steven Witt in The New Yorker last year:

When CUDA was released, in late 2006, Wall Street reacted with dismay. Huang was bringing supercomputing to the masses, but the masses had shown no indication that they wanted such a thing.

"They were spending a fortune on this new chip architecture," Ben Gilbert, the co-host of Acquired, a popular Silicon Valley podcast, said. "They were spending many billions targeting an obscure corner of academic and scientific computing, which was not a large market at the time, certainly less than the billions they were pouring in."

Huang argued that the simple existence of CUDA would enlarge the supercomputing sector. This view was not widely held, and by the end of 2008, Nvidia's stock price had declined by seventy percent... Downloads of CUDA hit a peak in 2009, then declined for three years. Board members worried that Nvidia's depressed stock price would make it a target for corporate raiders.

Huang wasn't specifically thinking about AI or neural networks when he created the CUDA platform. But it turned out that Hinton's backpropagation algorithm could easily be split up into bite-sized chunks. So training neural networks turned out to be a killer app for CUDA.

According to Witt, Hinton was quick to recognize the potential of CUDA:

In 2009, Hinton's research group used Nvidia's CUDA platform to train a neural network to recognize human speech. He was surprised by the quality of the results, which he presented at a conference later that year. He then reached out to Nvidia. "I sent an e-mail saying, 'Look, I just told a thousand machine-learning researchers they should go and buy Nvidia cards. Can you send me a free one?'" Hinton told me. "They said no."

Despite the snub, Hinton and his graduate students, Alex Krizhevsky and Ilya Sutskever, obtained a pair of Nvidia GTX 580 GPUs for the AlexNet project. Each GPU had 512 execution units, allowing Krizhevsky and Sutskever to train a neural network hundreds of times faster than would be possible with a CPU. This speed allowed them to train a larger model, and to train it on many more training images. And they would need all that extra computing power to tackle the massive ImageNet dataset.

Fei-Fei Li

Fei-Fei Li at the SXSW conference in 2018. Credit: Photo by Hubert Vestil/Getty Images for SXSW

Fei-Fei Li wasn't thinking about either neural networks or GPUs as she began a new job as a computer science professor at Princeton in January of 2007. While earning her PhD at Caltech, she had built a dataset called Caltech 101 that had 9,000 images across 101 categories.

That experience had taught her that computer vision algorithms tended to perform better with larger and more diverse training datasets. Not only had Li found her own algorithms performed better when trained on Caltech 101, but other researchers also started training their models using Li's dataset and comparing their performance to one another. This turned Caltech 101 into a benchmark for the field of computer vision.

So when she got to Princeton, Li decided to go much bigger. She became obsessed with an estimate by vision scientist Irving Biederman that the average person recognizes roughly 30,000 different kinds of objects.
Li started to wonder if it would be possible to build a truly comprehensive image dataset, one that included every kind of object people commonly encounter in the physical world.

A Princeton colleague told Li about WordNet, a massive database that attempted to catalog and organize 140,000 words. Li called her new dataset ImageNet, and she used WordNet as a starting point for choosing categories. She eliminated verbs and adjectives, as well as intangible nouns like "truth." That left a list of 22,000 countable objects ranging from "ambulance" to "zucchini."

She planned to take the same approach she'd taken with the Caltech 101 dataset: use Google's image search to find candidate images, then have a human being verify them. For the Caltech 101 dataset, Li had done this herself over the course of a few months. This time she would need more help. She planned to hire dozens of Princeton undergraduates to help her choose and label images.

But even after heavily optimizing the labeling process (for example, pre-downloading candidate images so they're instantly available for students to review), Li and her graduate student Jia Deng calculated that it would take more than 18 years to select and label millions of images.

The project was saved when Li learned about Amazon Mechanical Turk, a crowdsourcing platform Amazon had launched a couple of years earlier. Not only was AMT's international workforce more affordable than Princeton undergraduates, but the platform was also far more flexible and scalable. Li's team could hire as many people as they needed, on demand, and pay them only as long as they had work available.

AMT cut the time needed to complete ImageNet down from 18 to two years. Li writes that her lab spent two years "on the knife-edge of our finances" as the team struggled to complete the ImageNet project. But they had enough funds to pay three people to look at each of the 14 million images in the final dataset.

ImageNet was ready for publication in 2009, and Li submitted it to the Conference on Computer Vision and Pattern Recognition, which was held in Miami that year. Their paper was accepted, but it didn't get the kind of recognition Li hoped for.

"ImageNet was relegated to a poster session," Li writes. "This meant that we wouldn't be presenting our work in a lecture hall to an audience at a predetermined time but would instead be given space on the conference floor to prop up a large-format print summarizing the project in hopes that passersby might stop and ask questions... After so many years of effort, this just felt anticlimactic."

To generate public interest, Li turned ImageNet into a competition. Realizing that the full dataset might be too unwieldy to distribute to dozens of contestants, she created a much smaller (but still massive) dataset with 1,000 categories and 1.4 million images.

The first year's competition in 2010 generated a healthy amount of interest, with 11 teams participating. The winning entry was based on support vector machines. Unfortunately, Li writes, it was "only a slight improvement over cutting-edge work found elsewhere in our field."

The second year of the ImageNet competition attracted fewer entries than the first. The winning entry in 2011 was another support vector machine, and it just barely improved on the performance of the 2010 winner. Li started to wonder if the critics had been right.
Maybe ImageNet was too much for most algorithms to handle.

"For two years running, well-worn algorithms had exhibited only incremental gains in capabilities, while true progress seemed all but absent," Li writes. "If ImageNet was a bet, it was time to start wondering if we'd lost."

But when Li reluctantly staged the competition a third time in 2012, the results were totally different. Geoff Hinton's team was the first to submit a model based on a deep neural network. And its top-5 accuracy was 85 percent, 10 percentage points better than the 2011 winner.

Li's initial reaction was incredulity: "Most of us saw the neural network as a dusty artifact encased in glass and protected by velvet ropes."

"This is proof"

Yann LeCun testifies before the US Senate in September. Credit: Photo by Kevin Dietsch/Getty Images

The ImageNet winners were scheduled to be announced at the European Conference on Computer Vision in Florence, Italy. Li, who had a baby at home in California, was planning to skip the event. But when she saw how well AlexNet had done on her dataset, she realized this moment would be too important to miss: "I settled reluctantly on a twenty-hour slog of sleep deprivation and cramped elbow room."

On an October day in Florence, Alex Krizhevsky presented his results to a standing-room-only crowd of computer vision researchers. Fei-Fei Li was in the audience. So was Yann LeCun.

Cade Metz reports that after the presentation, LeCun stood up and called AlexNet "an unequivocal turning point in the history of computer vision. This is proof."

The success of AlexNet vindicated Hinton's faith in neural networks, but it was arguably an even bigger vindication for LeCun.

AlexNet was a convolutional neural network, a type of neural network that LeCun had developed 20 years earlier to recognize handwritten digits on checks. (For more details on how CNNs work, see the in-depth explainer I wrote for Ars in 2018.) Indeed, there were few architectural differences between AlexNet and LeCun's image recognition networks from the 1990s.

AlexNet was simply far larger. In a 1998 paper, LeCun described a document-recognition network with seven layers and 60,000 trainable parameters. AlexNet had eight layers, but these layers had 60 million trainable parameters.

LeCun could not have trained a model that large in the early 1990s because there were no computer chips with as much processing power as a 2012-era GPU. Even if LeCun had managed to build a big enough supercomputer, he would not have had enough images to train it properly. Collecting those images would have been hugely expensive in the years before Google and Amazon Mechanical Turk.

And this is why Fei-Fei Li's work on ImageNet was so consequential. She didn't invent convolutional networks or figure out how to make them run efficiently on GPUs. But she provided the training data that large neural networks needed to reach their full potential.

The technology world immediately recognized the importance of AlexNet. Hinton and his students formed a shell company with the goal to be "acquihired" by a big tech company. Within months, Google purchased the company for $44 million. Hinton worked at Google for the next decade while retaining his academic post in Toronto. Ilya Sutskever spent a few years at Google before becoming a cofounder of OpenAI.

AlexNet also made Nvidia GPUs the industry standard for training neural networks. In 2012, the market valued Nvidia at less than $10 billion.
Today, Nvidia is one of the most valuable companies in the world, with a market capitalization north of $3 trillion. That high valuation is driven mainly by overwhelming demand for GPUs like the H100 that are optimized for training neural networks.

Sometimes the conventional wisdom is wrong

"That moment was pretty symbolic to the world of AI because three fundamental elements of modern AI converged for the first time," Li said in a September interview at the Computer History Museum. "The first element was neural networks. The second element was big data, using ImageNet. And the third element was GPU computing."

Today, leading AI labs believe the key to progress in AI is to train huge models on vast data sets. Big technology companies are in such a hurry to build the data centers required to train larger models that they've started to lease entire nuclear power plants to provide the necessary power.

You can view this as a straightforward application of the lessons of AlexNet. But I wonder if we ought to draw the opposite lesson from AlexNet: that it's a mistake to become too wedded to conventional wisdom.

Scaling laws have had a remarkable run in the 12 years since AlexNet, and perhaps we'll see another generation or two of impressive results as the leading labs scale up their foundation models even more.

But we should be careful not to let the lessons of AlexNet harden into dogma. I think there's at least a chance that scaling laws will run out of steam in the next few years. And if that happens, we'll need a new generation of stubborn nonconformists to notice that the old approach isn't working and try something different.

Tim Lee was on staff at Ars from 2017 to 2021. Last year, he launched a newsletter, Understanding AI, that explores how AI works and how it's changing our world. You can subscribe here.

Timothy B. Lee, Senior tech policy reporter: Timothy is a senior reporter covering tech policy and the future of transportation. He lives in Washington, DC.
-
Next Steps to Secure Open Banking Beyond Regulatory Compliance (informationweek.com)

Final rules from the Consumer Financial Protection Bureau further the march towards open banking. What will it take to keep such data sharing secure?
-
Getting a Handle on AI Hallucinations (informationweek.com)

John Edwards, Technology Journalist & Author | November 11, 2024 | 4 Min Read

Image credit: Carloscastilla via Alamy Stock Photo

AI hallucination occurs when a large language model (LLM), frequently a generative AI chatbot or computer vision tool, perceives patterns or objects that are nonexistent or imperceptible to human observers, generating outputs that are either inaccurate or nonsensical.

AI hallucinations can pose a significant challenge, particularly in high-stakes fields where accuracy is crucial, such as the energy industry, life sciences and healthcare, technology, finance, and legal sectors, says Beena Ammanath, head of technology trust and ethics at business advisory firm Deloitte. With generative AI's emergence, the importance of validating outputs has become even more critical for risk mitigation and governance, she states in an email interview. "While AI systems are becoming more advanced, hallucinations can undermine trust and, therefore, limit the widespread adoption of AI technologies."

Primary Causes

AI hallucinations are primarily caused by the nature of generative AI and LLMs, which rely on vast amounts of data to generate predictions, Ammanath says. "When the AI model lacks sufficient context, it may attempt to fill in the gaps by creating plausible sounding, but incorrect, information." This can occur due to incomplete training data, bias in the training data, or ambiguous prompts, she notes.

LLMs are generally trained for specific tasks, such as predicting the next word in a sequence, observes Swati Rallapalli, a senior machine learning research scientist in the AI division of the Carnegie Mellon University Software Engineering Institute. "These models are trained on terabytes of data from the Internet, which may include uncurated information," she explains in an online interview. "When generating text, the models produce outputs based on the probabilities learned during training, so outputs can be unpredictable and misrepresent facts."

Detection Approaches

Depending on the specific application, hallucination metrics tools, such as AlignScore, can be trained to capture any similarity between two text inputs. Yet automated metrics don't always work effectively. "Using multiple metrics together, such as AlignScore, with metrics like BERTScore, may improve the detection," Rallapalli says.

Another established way to minimize hallucinations is by using retrieval augmented generation (RAG), in which the model references the text from established databases relevant to the output. "There's also research in the area of fine-tuning models on curated datasets for factual correctness," Rallapalli says.

Yet even using existing multiple metrics may not fully guarantee hallucination detection. Therefore, further research is needed to develop more effective metrics to detect inaccuracies, Rallapalli says. "For example, comparing multiple AI outputs could detect if there are parts of the output that are inconsistent across different outputs or, in case of summarization, chunking up the summaries could better detect if the different chunks are aligned with facts within the original article." Such methods could help detect hallucinations better, she notes.

Ammanath believes that detecting AI hallucinations requires a multi-pronged approach. She notes that human oversight, in which AI-generated content is reviewed by experts who can cross-check facts, is sometimes the only reliable way to curb hallucinations.

"For example, if using generative AI to write a marketing e-mail, the organization might have a higher tolerance for error, as faults or inaccuracies are likely to be easy to identify and the outcomes are lower stakes for the enterprise," Ammanath explains. Yet when it comes to applications that include mission-critical business decisions, error tolerance must be low. "This makes a 'human-in-the-loop', someone who validates model outputs, more important than ever before."

Hallucination Training

The best way to minimize hallucinations is by building your own pre-trained fundamental generative AI model, advises Scott Zoldi, chief AI officer at credit scoring service FICO. He notes, via email, that many organizations are now already using, or planning to use, this approach utilizing focused-domain and task-based models. "By doing so, one can have critical control of the data used in pre-training (where most hallucinations arise) and can constrain the use of context augmentation to ensure that such use doesn't increase hallucinations but re-enforces relationships already in the pre-training."

Outside of building your own focused generative models, one needs to minimize harm created by hallucinations, Zoldi says. "[Enterprise] policy should prioritize a process for how the output of these tools will be used in a business context and then validate everything," he suggests.

A Final Thought

To prepare the enterprise for a bold and successful future with generative AI, it's necessary to understand the nature and scale of the risks, as well as the governance tactics that can help mitigate them, Ammanath says. "AI hallucinations help to highlight both the power and limitations of current AI development and deployment."

About the Author

John Edwards, Technology Journalist & Author: John Edwards is a veteran business technology journalist. His work has appeared in The New York Times, The Washington Post, and numerous business and technology publications, including Computerworld, CFO Magazine, IBM Data Management Magazine, RFID Journal, and Electronic Design. He has also written columns for The Economist's Business Intelligence Unit and PricewaterhouseCoopers' Communications Direct. John has authored several books on business technology topics. His work began appearing online as early as 1983. Throughout the 1980s and 90s, he wrote daily news and feature articles for both the CompuServe and Prodigy online services. His "Behind the Screens" commentaries made him the world's first known professional blogger.
-
How IT Can Show Business Value From GenAI Investments (informationweek.com)

Nishad Acharya, Head of Talent Network, Turing | November 11, 2024 | 4 Min Read

Image credit: NicoElNino via Alamy Stock

As IT leaders, we're facing increasing pressure to prove that our generative AI investments translate into measurable and meaningful business outcomes. It's not enough to adopt the latest cutting-edge technology; we have a responsibility to show that AI delivers tangible results that directly support our business objectives.

To truly maximize ROI from GenAI, IT leaders need to take a strategic approach, one that seamlessly integrates AI into business operations, aligns with organizational goals, and generates quantifiable outcomes. Let's explore advanced strategies for overcoming GenAI implementation challenges, integrating AI with existing systems, and measuring ROI effectively.

Key Challenges in Implementing GenAI

Integrating GenAI into enterprise systems isn't always straightforward. There are several hurdles IT leaders face, especially surrounding data and system complexity.

Data governance and infrastructure. AI is only as good as the data it's trained on. Strong data governance enforces better accuracy and compliance, especially when AI models are trained on vast, unstructured data sets. Building AI-friendly infrastructure that can handle both the scale and complexity of AI data pipelines is another challenge, as these systems must be resilient and adaptable.

Model accuracy and hallucinations. GenAI models can produce non-deterministic results, sometimes generating content that is inaccurate or entirely fabricated. Unlike traditional software with clear input-output relationships that can be unit-tested, GenAI models require a different approach to validation. This issue introduces risks that must be carefully managed through model testing, fine-tuning, and human-in-the-loop feedback.

Security, privacy, and legal concerns. The widespread use of publicly and privately sourced data in training GenAI models raises critical security and legal questions. Enterprises must navigate evolving legal landscapes. Data privacy and security concerns must also be addressed to avoid potential breaches or legal issues, especially when dealing with heavily regulated industries like finance or healthcare.

Strategies for Measuring and Maximizing AI ROI

Adopting a comprehensive, metrics-driven approach to AI implementation is necessary for assessing your investment's business impact. To ensure GenAI delivers meaningful business results, here are some effective strategies:

Define high-impact use cases and objectives: Start with clear, measurable objectives that align with core business priorities. Whether it's improving operational efficiency or streamlining customer support, identifying use cases with direct business relevance ensures AI projects are focused and impactful.

Quantify both tangible and intangible benefits: Beyond immediate cost savings, GenAI drives value through intangible benefits like improved decision-making or customer satisfaction. Quantifying these benefits gives a fuller picture of the overall ROI.

Focus on getting the use case right, before optimizing costs: LLMs are still evolving. It is recommended that you first use the best model (likely the most expensive), prove that the LLM can achieve the end goal, and then identify ways to reduce cost to serve that use case.
This will make sure that the business need is not left unmet.

Run pilot programs before full rollout: Test AI in controlled environments first to validate use cases and refine your ROI model. Pilot programs allow organizations to learn, iterate, and de-risk before full-scale deployment, as well as pinpoint areas where AI delivers the greatest value.

Track and optimize costs throughout the lifecycle: One of the most overlooked elements of AI ROI is the hidden costs of data preparation, integration, and maintenance that can spiral if left unchecked. IT leaders should continuously monitor expenses related to infrastructure, data management, training, and human resources.

Continuous monitoring and feedback: AI performance should be tracked continuously against KPIs and adjusted based on real-world data. Regular feedback loops allow for continuous fine-tuning, ensuring your investment aligns with evolving business needs and delivers sustained value.

Overcoming GenAI Implementation Roadblocks

Successful GenAI implementations depend on more than adopting the right technology; they require an approach that maximizes value while minimizing risk. For most IT leaders, success depends on addressing challenges like data quality, model reliability, and organizational alignment. Here's how to overcome common implementation hurdles:

Align AI with high-impact business goals. GenAI projects should directly support business objectives and deliver sustainable value, like streamlining operations, cutting costs, or generating new revenue streams. Define priorities based on their impact and feasibility.

Prioritize data integrity. Poor data quality prevents effective AI. Take time to establish data governance protocols from the start to manage privacy, compliance, and integrity while minimizing risk tied to faulty data.

Start with pilot projects. Pilot projects allow you to test and iterate real-world impact before committing to large-scale rollouts. They offer valuable insights and mitigate risk.

Monitor and measure continuously. Ongoing performance tracking ensures AI remains aligned with evolving business goals. Continuous adjustments are key for maximizing long-term value.

About the Author

Nishad Acharya, Head of Talent Network, Turing: Nishad Acharya leads initiatives focused on the acquisition and experience of the 3M global professionals on Turing's Talent Cloud. At Turing, he has led critical roles in Strategy and Product that helped scale the company to a Unicorn. With a B.Tech from IIT Madras and an MBA from Wharton, Nishad has a strong foundation in both technology and business. Previously, he led strategy & digital transformation projects at The Boston Consulting Group. Nishad brings a passion for AI and expertise in tech services coupled with extensive experience in sectors like financial services and energy.
-
Any delay in reaching net zero will influence climate for centuries
www.newscientist.com
Ice collapsing into the water at Perito Moreno Glacier in Los Glaciares National Park, Argentina. R.M. Nunes/Alamy

Even a few years' delay in reaching net-zero emissions will have repercussions for hundreds or even thousands of years, leading to warmer oceans, more extensive ice loss in Antarctica and higher temperatures around the world.

Nations around the world have collectively promised to prevent more than 2°C of global warming, a goal that can only be achieved by reaching net-zero emissions, effectively ending almost all human-caused greenhouse gas emissions, before the end of the century. But once that hugely challenging goal
-
The Download: AI in Africa, and reporting in the age of Trump
www.technologyreview.com

This is today's edition of The Download, our weekday newsletter that provides a daily dose of what's going on in the world of technology.

What Africa needs to do to become a major AI player

Africa is still early in the process of adopting AI technologies. But researchers say the continent is uniquely hospitable to it for several reasons, including a relatively young and increasingly well-educated population, a rapidly growing ecosystem of AI startups, and lots of potential consumers.

However, ambitious efforts to develop AI tools that answer the needs of Africans face numerous hurdles. The biggest are inadequate funding and poor infrastructure. Limited internet access and a scarcity of domestic data centers also mean that developers might not be able to deploy cutting-edge AI capabilities. Complicating this further is a lack of overarching policies or strategies for harnessing AI's immense benefits and regulating its downsides.

Taken together, researchers worry, these issues will hold Africa's AI sector back and hamper its efforts to pave its own pathway in the global AI race. Read the full story.

Abdullahi Tsanni

Science and technology stories in the age of Trump

Mat Honan

I've spent most of this year being pretty convinced that Donald Trump would be the 47th president of the United States. Even so, like most people, I was completely surprised by the scope of his victory.

This level of victory will certainly provide the political capital to usher in a broad sweep of policy changes. Some of these changes will be well outside our lane as a publication. But very many of President-elect Trump's stated policy goals will have direct impacts on science and technology.

So I thought I would share some of my remarks from our edit meeting on Wednesday morning, when we woke up to find out that the world had indeed changed. Read the full story.

This story is from The Debrief, the weekly newsletter from our editor in chief Mat Honan. Sign up to receive it in your inbox every Friday.

The must-reads

I've combed the internet to find you today's most fun/important/scary/fascinating stories about technology.

1 Canada has recorded its first known bird flu case in a human. Officials are investigating how the teenager was exposed to the virus. (NPR) + Canada insists that the risk to the public remains low. (Reuters) + Why virologists are getting increasingly nervous about bird flu. (MIT Technology Review)

2 How MAGA became a rallying call for young men. The Republicans' online strategy tapped into the desires of disillusioned Gen Z men. (WP $) + Elon Musk is assembling a list of favorable would-be Trump advisors. (FT $)

3 Trump's victory is a win for the US defense industry. Palmer Luckey's Anduril is anticipating a lucrative next four years. (Insider $) + Here's what Luckey has to say about the Pentagon's future of mixed reality. (MIT Technology Review) + Traditional weapons are being given AI upgrades. (Wired $)

4 This year is highly likely to be the hottest on record. This week's Cop29 climate summit will thrash out future policies. (The Guardian) + A little-understood contributor to the weather? Microplastics. (Wired $) + Trump's win is a tragic loss for climate progress. (MIT Technology Review)

5 Ukraine is scrambling to repair its power stations. Workers are dismantling plants to repair other stations hit by Russian attacks. (WSJ $) + Meet the radio-obsessed civilian shaping Ukraine's drone defense. (MIT Technology Review)

6 We need better ways to evaluate LLMs. Tech giants are coming up with better methods of measuring these systems. (FT $) + The improvements in the tech behind ChatGPT appear to be slowing. (The Information $) + AI hype is built on high test scores. Those tests are flawed. (MIT Technology Review)

7 FTX is suing crypto exchange Binance. It claims Sam Bankman-Fried fraudulently transferred close to $1.8 billion to Binance in 2021. (Bloomberg $) + Meanwhile, bitcoin is surging to new record heights. (Reuters)

8 What we know about tech and loneliness. While there's little evidence tech directly makes us lonely, there's a strong correlation between the two. (NYT $)

9 What's next for space policy in the US. If one person's interested in the cosmos, it's Elon Musk. (Ars Technica)

10 Could you save the Earth from a killer asteroid? It's a game that's part strategy, part luck. (New Scientist $) + Earth is probably safe from a killer asteroid for 1,000 years. (MIT Technology Review)

Quote of the day

"Conflict of interest seems rather quaint." Gita Johar, a professor at Columbia Business School, tells the Guardian about Donald Trump and Elon Musk's openly transactional relationship.

The big story

Quartz, cobalt, and the waste we leave behind
May 2024

It is easy to convince ourselves that we now live in a dematerialized, ethereal world, ruled by digital startups, artificial intelligence, and financial services. Yet there is little evidence that we have decoupled our economy from its churning hunger for resources.

We are still reliant on the products of geological processes like coal and quartz, a mineral that's a rich source of the silicon used to build computer chips, to power our world.

Three recent books aim to reconnect readers with the physical reality that underpins the global economy. Each one fills in dark secrets about the places, processes, and lived realities that make the economy tick, and reveals just how tragic a toll the materials we rely on take on humans and the environment. Read the full story.

Matthew Ponsford

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet 'em at me.)

+ Oscars buzz has already begun, and this year's early contenders are an interesting bunch.
+ This sweet art project shows how toys age with love.
+ Who doesn't love pretzels? Here's how to make sure they end up with the perfect fluffy interior and a glossy, chewy crust.
+ These images of plankton are really quite something.
-
Science and technology stories in the age of Trump
www.technologyreview.com

Rather than analyzing the news this week, I thought I'd lift the hood a bit on how we make it.

I've spent most of this year being pretty convinced that Donald Trump would be the 47th president of the United States. Even so, like most people, I was completely surprised by the scope of his victory. By taking the lion's share not just in the Electoral College but also the popular vote, coupled with wins in the Senate (and, as I write this, seemingly the House) and ongoing control of the courts, Trump has done far more than simply eke out a win.

This level of victory will certainly provide the political capital to usher in a broad sweep of policy changes. Some of these changes will be well outside our lane as a publication. But very many of President-elect Trump's stated policy goals will have direct impacts on science and technology. Some of the proposed changes would have profound effects on the industries and innovations we've covered regularly, and for years. When he talks about his intention to end EV subsidies, hit the brakes on FTC enforcement actions on Big Tech, ease the rules on crypto, or impose a 60 percent tariff on goods from China, these are squarely in our strike zone, and we would be remiss not to explore the policies and their impact in detail.

And so I thought I would share some of my remarks from our edit meeting on Wednesday morning, when we woke up to find out that the world had indeed changed. I think it's helpful for our audience if we are transparent and upfront about how we intend to operate, especially over the next several months, which will likely be, well, chaotic.

This is a moment when our jobs are more important than ever. There will be so much noise and heat out there in the coming weeks and months, and maybe even years. The next six months in particular will be a confusing time for a lot of people. We should strive to be the signal in that noise.

We have extremely important stories to write about the role of science and technology in the new administration. There are obvious stories for us to take on in regards to climate, energy, vaccines, women's health, IVF, food safety, chips, China, and I'm sure a lot more, that people are going to have all sorts of questions about. Let's start by making a list of questions we have ourselves.

Some of the people and technologies we cover will be ascendant in all sorts of ways. We should interrogate that power. It's important that we take care in those stories not to be speculative or presumptive. To always have the facts buttoned up. To speak the truth and be unassailable in doing so.

Do we drop everything and only cover this? No. But it will certainly be a massive story that affects nearly all others. This election will be a transformative moment for society and the world. Trump didn't just win, he won a mandate. And he's going to change the country and the global order as a result.

The next few weeks will see so much speculation as to what it all means. So much fear, uncertainty, and doubt. There is an enormous amount of bullshit headed down the line. People will be hungry for sources they can trust. We should be there for that. Let's leverage our credibility, not squander it. We are not the resistance. We just want to tell the truth. So let's take a breath, and then go out there and do our jobs.

I like to tell our reporters and editors that our coverage should be free from either hype or cynicism. I think that's especially true now.
I'm also very interested to hear from our readers: What questions do you have? What are the policy changes or staffing decisions you are curious about? Please drop me a line at mat.honan@technologyreview.com. I'm eager to hear from you. If someone forwarded you this edition of The Debrief, you can subscribe here.

Now read the rest of The Debrief

The News

Palmer Luckey, who was ousted from Facebook over his support for the last Trump administration and went into defense contracting, is poised to grow in influence under a second administration. He recently talked to MIT Technology Review about how the Pentagon is using mixed reality.

What does Donald Trump's relationship with Elon Musk mean for the global EV industry?

The Biden administration was perceived as hostile to crypto. The industry can likely expect friendlier waters under Trump.

Some counter-programming: Life-seeking robots could punch through Europa's icy surface.

And for one more big take that's not related to the election: AI vs. quantum. AI could solve some of the most interesting scientific problems before big quantum computers become a reality.

The Chat

Every week I'll talk to one of MIT Technology Review's reporters or editors to find out more about what they've been working on. This week, I chatted with Melissa Heikkilä about her story on how ChatGPT search paves the way for AI agents.

Mat: Melissa, OpenAI rolled out web search for ChatGPT last week. It seems pretty cool. But you got at a really interesting bigger-picture point about it paving the way for agents. What does that mean?

Melissa: Microsoft tried to chip away at Google's search monopoly with Bing, and that didn't really work. It's unlikely OpenAI will be able to make much difference either. Their best bet is to try to get users used to a new way of finding information and browsing the web through virtual assistants that can do complex tasks. Tech companies call these agents. ChatGPT's usefulness is limited by the fact that it can't access the internet and doesn't have the most up-to-date information. By integrating a really powerful search engine into the chatbot, suddenly you have a tool that can help you plan things and find information in a far more comprehensive and immersive way than traditional search, and this is a key feature of the next generation of AI assistants.

Mat: What will agents be able to do?

Melissa: AI agents can complete complex tasks autonomously, and the vision is that they will work as a human assistant would: book your flights, reschedule your meetings, help with research, you name it. But I wouldn't get too excited yet. The cutting edge of AI tech can retrieve information and generate stuff, but it still lacks the reasoning and long-term planning skills to be really useful. AI tools like ChatGPT and Claude also can't interact with computer interfaces, like clicking at stuff, very well. They also need to become a lot more reliable and stop making stuff up, which is still a massive problem with AI. So we're still a long way away from the vision becoming reality! I wrote an explainer on agents a little while ago with more details.

Mat: Is search as we know it going away? Are we just moving to a world of agents that not only answer questions but also accomplish tasks?

Melissa: It's really hard to say. We are so used to using online search, and it's surprisingly hard to change people's behaviors. Unless agents become super reliable and powerful, I don't think search is going to go away.

Mat: By the way, I know you are in the UK. Did you hear we had an election over here in the US?

Melissa: LOL
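To make the agent idea Melissa describes a bit more concrete, the core mechanic behind search-augmented chatbots is a loop: the model decides whether to call a tool (here, search), the tool's result is fed back in, and the model then answers. The sketch below is a toy with a stubbed model and a stubbed search function; it is not OpenAI's implementation, just the general shape of the pattern.

```python
# Toy sketch of the tool-calling loop behind search-augmented chatbots.
# Both `stub_model` and `stub_search` are fakes standing in for a real LLM
# and a real search backend; only the control flow is the point here.
from typing import Optional


def stub_search(query: str) -> str:
    """Pretend web search; a real agent would call an actual search API."""
    return f"Top result for '{query}': an example article (example.com)"


def stub_model(prompt: str, context: Optional[str] = None) -> dict:
    """Pretend LLM. Decides to search when it has no context, answers otherwise."""
    if context is None:
        return {"action": "search", "query": prompt}
    return {"action": "answer", "text": f"Based on what I found ({context}), here is a summary."}


def answer_with_search(user_question: str, max_steps: int = 3) -> str:
    """Run the decide -> call tool -> feed result back -> answer loop."""
    context = None
    for _ in range(max_steps):
        step = stub_model(user_question, context)
        if step["action"] == "search":
            context = stub_search(step["query"])  # tool result goes back to the model
        else:
            return step["text"]
    return "Gave up after too many steps."


if __name__ == "__main__":
    print(answer_with_search("What did this week's climate summit decide?"))
```

Real agent frameworks add more tools, structured function-calling formats, and guardrails, but the decide, call, observe, answer loop is the common core.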
The Recommendation

I'm just back from a family vacation in New York City, where I was in town to run the marathon. (I get to point this out for like one or two more weeks before the bragging gets tedious, I think.) While there, we went to see The Outsiders. Chat, it was incredible. (Which maybe should go without saying, given that it won the Tony for best musical.) But wow. I loved the book and the movie as a kid. But this hit me on an entirely other level. I'm not really a cries-at-movies (or especially at-musicals) kind of person, but I was wiping my eyes for much of the second act. So were very many people sitting around me. Anyway. If you're in New York, or if it comes to your city, go see it. And until then, the soundtrack is pretty amazing on its own. (Here's a great example.)